Problem 1: Explicit Parallelism

After two straight all-nighters spent working on his Beta design, Ben Bitdiddle rushes off to his final ADU lecture. Only moments into his discussion the professorís lecture wonders off onto some tangent about how the Ferengi from Star Trek would be the ideal cache managers. Benís concentration suddenly to fails him. Within seconds Ben falls into a deep sleep.

Ben awakens to find that 20 years and, more importantly, 40 semesters have passed since he last made his way to class. As his head clears he realizes that he is still sitting in an ADU lecture. Even stranger, Ben comes to recognize the both the stale jokes and thinning curly locks of the esteemed lecturer¾ Professor (now Emeritus) Pratt.

Little has changed. The canonical programming example used is still a recursive factorial routine. However, the implementation is slightly different than Ben remembers.

int factorial(int n) {

return facthelp(1, n);


parallel int facthelp(int from, int to) {

int mid;

if (from >= to) return from;

mid = (from + to)/2;

return (facthelp(from, mid) * facthelp(mid+1, to));


A.     Ignoring the parallel keyword, how many times is the facthelp( ) function called for a given value of n? How does this compare to the version from twenty years earlier (shown below)?



int factorial(int n) {

if (n > 0)

return n*fact(n-1);


return 1;


Ben soon realizes that the new factorial routine is intended for an explicitly parallel machine. The professor explains that the parallel functions are assigned to processors and executed in parallel with the function that calls them. The caller function is considered the parent processor and the parallel functions are its children. When the parent requires the result of one or more children to make further progress it automatically stalls and waits.

B.    Ben remembers that the old version of the factorial function that he was taught ran in O(n) cycles. What is the execution-time complexity of the new parallel   factorial function?

C.    Is the factorial algorithm given suited for implementation on a SIMD processor? Explain.

D.    Is the programming language extension described best suited for implementation on a SIMD or MIMD parallelism? Explain.