Parallel Computer Systems, VT12
Assignment 1: FractSMP
In this assignment you will be using parallel computers
to compute fractals.
You must solve this assignment individually.
The assignment is divided into three parts.
Make a sequential implementation that generates Mandelbrot fractals.
The algorithm is described, besides on earlier years web page
on several web pages.
Input (given as command line arguments):
The output shall be a picture in ppm-format
- Xmax, Xmin, Ymax, Ymin
(pixels in x and y, should be given as 2 values, e.g., 1200x800)
- T (threshold) - maximum distance from origo in the complex plane
Make two parallelizations of the implementation above using
tutorial 2), respectively.
Added to the input is now:
Use some kind of dynamic load balancing (some automatic support in OpenMP)
This means that the distribution of the work should be dynamic when the
program discovers that some computations takes longer than others.
What you want is that every thread in the program uses its processor
100% (from the beginning) and that all threads finishes at the same time
(at least close to) even though it is unknown how long a computation
will take when you start the iteration.
Describe how your parallelizations are supposed to work. Is it
sensible parallelizations? Are there better ways to do it?
Are there performance differences between the two implementations?
Which did you find easier to do, why?
With the help of "magic box" you can speed the calculation
of certain areas up. Look at the picture to the right.
We want to compute the whole picture.
After the computation of the border points, we find that some of these
are different. The Picture is then split into four new pictures
(the blue lines). Each of the new pictures are then computed by
this recursive schema.
If a picture has a border where not all points have the same value,
split the picture, else if all border points have the same value we
assume that all points inside the picture have the same value as the
Look at the green squares.
For three of them we see that all border ponts have the same value and
we fill them with the same colour, the fourth we split.
There are some risks with magic box also. What if the value of all
border points of the whole picture had been the same?
Then we would not have seen the fractal at all!
Observe that if only one thread runs the 'magic box' part it is not
likely that you will fulfil the requirements for good load balancing
(especially in the beginning).
Add more input to the program:
- Use "magic box" or not
- Smallest box size (below this limit no recursion occurs,
just compute all individual values inside the box)
The parallelization may be done with either OpenMP or Pthreads.
Describe how your parallelization is supposed to work. Is it a
sensible parallelization? Is there a better way to do it?
You should use one node on Akka (1-8 cores) for your tests.
Then you are free to include tests from other machines as well.
The programs should allow an arbitrary number of threads, even though
you need only to use up to eight cores in the tests.
One simple way to measure the time is to use the function
gettimeofday() in your program (see man page).
In my_wtime.c is a function you
can use by calling it before and after the part you want to time
(and use the difference as the time spent).
For general information about running programs on the clusters at HPC2N,
please see the user guides and other web pages.
See also the lecture about assignment 1.
At least the Portland, Intel and gcc compilers at HPC2N (on Akka) have
support for OpenMP.
The option to
pgcc (the Portland compiler)
the option to
icc (the Intel compiler)
the option to
Do not forget to add the right modules, i.e.,
module add pgi
module add intel-compiler, respectively.
You can specify the number of threads the program should use
by setting an environment variable called
Here you find sample submitfile for OpenMP.
Remember to submit your jobs from the pfs file system.
The report should be well written and at least include:
- How you have solved the load balancing
- Performance graphs
- Speed-up graphs
- How do you expect you implementation to work with more threads?
- Limitations, problems
- Deadline: 13.00 on February 20, 2012