CS 4960-01 Homework 5

Due: Wednesday, October 1st, 2008 10:45am

The file mb.c implements a simple Mandelbrot set generator. It takes a number N on the command line and writes an image in PBM format to stdout. Redirect the output to a file and then you can use image converters/viewers to see the output.

Handin six files: mb_block_server.c, mb_block_client.c, mb_step_server.c, mb_step_client.c, mb_work_server.c, and mb_work_client.c.

Part 1

Parallelize the program using the message-passing library bmsg.c that is part of reducescan.zip. It's like the library used in HW2, except that it works in terms of bytes instead of integers. Your parallel version should work with an arbitrary t ``server'' processes, and it should partition the work among processes by giving each process N/t consecutive lines to compute.

Name the files to implement this part mb_block_server.c and mb_block_client.c.

You should get pretty good speedup for 2 processes, but less improvement with 4 processes. The problem is that some lines require more computation than others, and the ones that require more computation tend to be close together.

To make this clear, use the function in ptime.h in each server to print the processor time that it consumes, probably something like this:

int main(int argc, char *argv) {
  ....
  printf("%s done in %ld msec", argv[1], get_process_time());
  return 0;
}

With 2 processes, the reported times will be about the same. With 4 processes, 2 of them will take much longer than the others.

Part 2

Adjust your parallel version of the program to interleave the lines assigned to different processes. That is, the first process should compute lines 0, t, 2*t, etc.

Name the files to implement this part mb_step_server.c and mb_step_client.c.

With this data distribution, you should get better speedup for 4 processes and more. If you make the servers report ther process times as above, you shold see very similar times reported for all processes.

Part 3

Adjust your parallel version of the program to implement a work queue instead of dividing lines up among the processes in advance. When a server is ready to compute, it should receive a line number from the client. After computing just that line, it should report the result and get another line number to compute. The client can use -1 as a line number to indicate that the server should exit.

After dispatching an initial set of lines, the client cannot predict which server will repond first with an answer. The bmsg.c library provides the following function that the client can use to wait until one server has responded:

Name the files to implement this part mb_work_server.c and mb_work_client.c.

With a work queue, the data will not be distributed any better than with interlaved line assignments. That is, the processor times reported by the servers are actually likely to be more different than with interleaved assignments. Neverthless, there's a good chance that you'll get better overall performance when using many (equivalent) processors in a cluster. (Why?)


Last update: Monday, December 1st, 2008
mflatt@cs.utah.edu