CS 4960-01 Homework 4
Due: Monday, September 29th, 2008 10:45am
For this homework, start with the reduce/scan framework in reducescan.zip. Hand in a new archive (.zip, .tar, or .tgz) with additional subdirectories for programs that use the framework, and with changes to the framework for parts 2 and 3 below.
Part 1
Use the reduce or scan framework to implement the following programs. (Except for the last one, the parallel version will not actually be faster, because the computation is too cheap relative to the communication cost.)
- Count the odd numbers in an array of integers. The implementation should be in a count_odd sub-directory.
- Given an array of 2-D points, compute the area of the smallest rectangle (with edges parallel to the X and Y axes) that encloses all the points. Implement this program in an area sub-directory.
- Similar to the previous problem, but instead of producing a single area, compute for each point X the difference between (1) a bounding rectangle's area for all points in the input up to the up to X and (2) the bounding rectangle's area for preceding points in the input not including X. For example, given (1, 1), (2, 2), (3, 4), (2, 3), the result should be 0, 1, 5, 0. Implement this program in an area_delta sub-directory.
- Parallelize, yet again, the program from HW1 where deconvert calls convert. To obtain a speed-up, the accum function will need to store each converted value so that it can be used by scanGen (instead of having scanGen call convert a second time for each input value); to do that, accum can write to the process's local copy of the input array. Verify that this parallel version of the program provides a speed-up compared to the original, sequential version. Implement the program in a convert sub-directory.
Part 2
In several programs that might use the scan framework, such as the dist_sum example included with the framework or the last program for part 1 above, the input data needs to be converted in some way to obtain an intermediate value that is used by both accum and scanGen. In the case of last program from part 1, the intermediate data has the same type as the input data, so the input array could be re-used. More generally, the intermediate data has a different type and may need more or different storage.
Extend the scan framework so that a program using the framework supplies an prepare_temps function that take an integer for the size of the local input data array. Then:
- Adjust your program in your convertsub-directory to use prepare_temps to create a separate array for intermediate values. Change accum so that it never writes into the local data array, and instead writes into a separate temporary array.
- Similarly, change the scan-based program in the dist_sum directory to use a temporary array instead of performing the sqrt computations twice for each input element.
- Fix up other uses of the scan framework so they run with the extended framework.
Mark your changes to the scan framework code clearly with comments.
Part 3
Adjust the reduce and scan framework implementations so that they work with any number of servers, including numbers that aren't a power of 2. No use of the framework should have to change.
Hint: Each server could round up to N, the nearest power of two, and pretend that N servers are running – except that no information is ever sent to the non-existent servers, and the non-existent servers always produce ``zero'' tallies.
Last update: Monday, December 1st, 2008mflatt@cs.utah.edu |