## Cloud Computing, Final Exam for the course CS 4365/CS 5315, Fall 2011

Name ___________________________________________________

10 pages of notes allowed.

1-2. Briefly explain the differences and similarities between grid computing, cloud computing, autonomic computing, green computing, and jungle computing.

3-4. Explain advantages and disadvantages of cloud computing. Explain differences and similarities between public cloud, private cloud, and community cloud. Why do we sometimes need to go beyond cloud computing?

5. If the original size of the cloud is 100 units, and in the quadratic growth model

x(t+1) - x(t) = a + b * x(t) - c * (x(t))^2,
the constant term is a = 1.0, the coefficient of the linear term is b = 10.0, and the coefficient of the quadratic term is c = 0.01, what is the predicted cloud size in the next year? In general, if we know the cloud sizes in different years, how can we predict the future cloud size? (Just explain the main ideas; there is no need to derive detailed formulas.)
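The arithmetic in the first part of this problem can be checked with a minimal sketch like the one below. The function name `next_size` is a hypothetical helper introduced only for illustration; the numbers are the ones given in the problem.

```python
def next_size(x, a, b, c):
    """One step of the quadratic growth model:
    x(t+1) = x(t) + a + b * x(t) - c * x(t)^2."""
    return x + a + b * x - c * x ** 2


# Values from the problem statement: x(t) = 100, a = 1.0, b = 10.0, c = 0.01.
predicted = next_size(100.0, a=1.0, b=10.0, c=0.01)
print(predicted)
```

The increment is 1 + 1000 - 100 = 901, so the sketch should report a size of about 1001 units.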

6-7. In the RSA algorithm, let us start with prime numbers p = 3 and q = 7, and let us take e = 5. Show, step by step, how the RSA algorithm will generate a secret code d, how it will encode the message m = 4, and how it will decode the resulting message back. Use the actual RSA algorithms; do not just raise to the power. Why is security especially important for cloud computing?
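One way to check a hand computation for this problem is the sketch below. It assumes that "the actual RSA algorithms" refers to the extended Euclidean algorithm (for finding d) and square-and-multiply exponentiation (for encoding and decoding); the function names are hypothetical and chosen for illustration.

```python
def extended_gcd(a, b):
    """Return (g, x, y) such that a*x + b*y == g == gcd(a, b)."""
    if b == 0:
        return a, 1, 0
    g, x, y = extended_gcd(b, a % b)
    return g, y, x - (a // b) * y


def mod_inverse(e, phi):
    """Find d with (e * d) % phi == 1 via the extended Euclidean algorithm."""
    g, x, _ = extended_gcd(e, phi)
    if g != 1:
        raise ValueError("e and phi must be coprime")
    return x % phi


def power_mod(base, exponent, modulus):
    """Square-and-multiply: reduce modulo `modulus` after every step."""
    result = 1
    base %= modulus
    while exponent > 0:
        if exponent & 1:
            result = (result * base) % modulus
        base = (base * base) % modulus
        exponent >>= 1
    return result


# Values from the problem statement.
p, q, e, m = 3, 7, 5, 4
n = p * q                    # public modulus, 21
phi = (p - 1) * (q - 1)      # 12
d = mod_inverse(e, phi)      # secret exponent
c = power_mod(m, e, n)       # encoded message
decoded = power_mod(c, d, n) # should recover m
print(d, c, decoded)
```

With these inputs, d = 5 (since 5 * 5 = 25 = 2 * 12 + 1), the encoded message is 16, and decoding returns 4.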

8. Why does cloud computing encourage parallelization? Show how, given eight numbers a1, a2, ..., a8, we can compute the partial minima

m1 = a1, m2 = min(a1, a2), m3 = min(a1, a2, a3), ..., m8 = min(a1, a2, ..., a8)
in parallel; use the general algorithm that is applicable for all sizes.
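A minimal sketch of one general scheme is given below. It assumes the intended algorithm is a doubling-step parallel prefix scan; the course may have used a different variant. The parallel rounds are simulated sequentially, and the input list is made-up sample data.

```python
def parallel_prefix_min(values):
    """Simulate a doubling-step prefix scan with min as the operation.

    In round k (shift = 1, 2, 4, ...), every position i >= shift computes
    min(m[i], m[i - shift]) at the same time, so n numbers need about
    log2(n) rounds.
    """
    m = list(values)
    n = len(m)
    shift = 1
    while shift < n:
        prev = m[:]  # all processors read the values from the previous round
        for i in range(shift, n):  # conceptually, these run in parallel
            m[i] = min(prev[i], prev[i - shift])
        shift *= 2
    return m


# Hypothetical sample input with eight numbers.
sample = [5, 3, 8, 1, 7, 2, 9, 4]
partial_minima = parallel_prefix_min(sample)
print(partial_minima)
```

For eight numbers the loop runs three rounds (shifts 1, 2, and 4), compared with seven sequential comparisons.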

9. Show how, given numbers a1, a2, a3, a4, b1, b2, b3, and b4, we can compute, in parallel, the expression min(a1 + b1, a2 + b2, a3 + b3, a4 + b4).

10. Each professor is required to keep, for one year, the graded finals for each class taught by this professor. Since we are moving to the new building, to minimize the need to move paperwork, each professor scanned all their graded finals for each class into a single file. The Director of the Graduate Program wants to see each student's progress, so he wants to bring together all the records corresponding to each student and to compute the semester GPA of each student. Explain how to do it in MapReduce: what is the original key and the original value (and how your choice of keys helps to parallelize the problem) and what are intermediate keys and values (and how your choice of keys helps in solving the problem).

11. Let us consider two hypotheses: a hypothesis H1 that the move to the new building will be smooth and on time, and a hypothesis H2 that there will be some delays and disruptions. The prior probabilities of these hypotheses are P(H1) = 0.2 and P(H2) = 0.8. The event E is that the boxes with all the stuff to be moved were picked up by the movers ahead of schedule. We know that for moves in which everything went smoothly, the probability P(E | H1) that all the boxes were picked up ahead of schedule is 0.6. On the other hand, for moves with delays, the observed probability that all the boxes were picked up ahead of schedule is low: P(E | H2) = 0.1. If we know that the boxes were actually picked up ahead of schedule, what is the new probability P(H1 | E) that the whole move will be smooth and on time? Where are such computations used in cloud computing?
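The update asked for here is an application of Bayes' formula for two hypotheses. The sketch below is one way to check the arithmetic; the function name `posterior_h1` is a hypothetical helper.

```python
def posterior_h1(prior_h1, prior_h2, likelihood_h1, likelihood_h2):
    """Bayes' formula for two hypotheses:
    P(H1 | E) = P(E | H1) P(H1) / (P(E | H1) P(H1) + P(E | H2) P(H2))."""
    evidence = likelihood_h1 * prior_h1 + likelihood_h2 * prior_h2
    return likelihood_h1 * prior_h1 / evidence


# Values from the problem statement.
p_h1_given_e = posterior_h1(0.2, 0.8, 0.6, 0.1)
print(round(p_h1_given_e, 6))
```

Here the numerator is 0.6 * 0.2 = 0.12 and the denominator is 0.12 + 0.08 = 0.20, so the updated probability is about 0.6.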

12-13. Explain why we need different assignments of loads to servers when we aim for green computing and when we aim for the most efficient parallel computations. Illustrate the difference between the two corresponding load assignment algorithms on the following example:

• we have 3 servers with 4 processors each,
• we receive a stream of tasks, each of which requires one processor for 2 moments of time.

14-15. Why do we need clustering in cloud computing? Describe, step-by-step, how the following graph will be divided into clusters: 1-6, 1-2, 5-6, 3-8, 2-7, 8-9, 3-10. Where is the corresponding algorithm used in clustering cloud users?
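As a way of checking a hand-worked answer, the sketch below groups the vertices using union-find. It assumes that the clusters in question are the connected components of the graph, which the problem does not state explicitly, and it only considers vertices that appear in at least one edge.

```python
def connected_components(edges):
    """Group vertices into connected components using union-find."""
    vertices = sorted({v for edge in edges for v in edge})
    parent = {v: v for v in vertices}

    def find(v):
        # Follow parent links to the root, shortening the path on the way.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for u, w in edges:
        root_u, root_w = find(u), find(w)
        if root_u != root_w:
            parent[root_w] = root_u  # merge the two clusters

    groups = {}
    for v in vertices:
        groups.setdefault(find(v), []).append(v)
    return sorted(sorted(group) for group in groups.values())


# Edges from the problem statement.
edges = [(1, 6), (1, 2), (5, 6), (3, 8), (2, 7), (8, 9), (3, 10)]
clusters = connected_components(edges)
print(clusters)
```

Under this assumption the edges yield two clusters, {1, 2, 5, 6, 7} and {3, 8, 9, 10}.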

16. Use the first few steps of bisection to find the square root of 6, i.e., the solution to the equation x^2 = 6. Use [0,4] as the initial interval. Where is the corresponding algorithm used in clustering cloud users?
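The first few bisection steps can be traced with a short sketch such as the following. The helper name `bisection_intervals` and the choice of four steps are assumptions made for illustration.

```python
def bisection_intervals(f, lo, hi, steps):
    """Run `steps` bisection steps on [lo, hi] and return each new interval.

    Assumes f(lo) and f(hi) have opposite signs, so a root lies in between.
    """
    intervals = []
    for _ in range(steps):
        mid = (lo + hi) / 2
        if f(lo) * f(mid) <= 0:
            hi = mid  # the sign change is in the left half
        else:
            lo = mid  # the sign change is in the right half
        intervals.append((lo, hi))
    return intervals


# The equation x^2 = 6 rewritten as f(x) = x^2 - 6 = 0, starting from [0, 4].
trace = bisection_intervals(lambda x: x * x - 6, 0.0, 4.0, steps=4)
for interval in trace:
    print(interval)
```

Starting from [0, 4], the midpoints 2, 3, 2.5, and 2.25 narrow the interval to [2, 4], [2, 3], [2, 2.5], and [2.25, 2.5] in turn.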

17. In cloud-related clustering, when do we mark some users as outliers? Illustrate the corresponding algorithm on the following example:

• we have 5 points,
• we want to select one of them as an outlier,
• these points have the following number of neighbors: N1 = 5, N2 = 3, N3 = 2, N4 = 1, and N5 = 4.

18-19. Describe, in detail, the paper that you reviewed as a project for this class:

• what problem is addressed in this paper,
• what solution is proposed for this problem, and
• (if applicable) what are the remaining open problems.

20. Briefly describe someone else's project for this class; two projects for extra credit.