Homework assignments, CS 3335, Fall 2008

1. (Due September 2)

a) Take your birth year, convert to binary and to hex, then convert back to check that your conversion was correct.

b) Take you birth month and birth day, add and multiply them in binary and in hex.

2. (Due September 4) Take your birth day d and your birth month m, transform that into 2's complement (modulo 1024, to make sure you do not get overflow), compute d + (-m), m + (-d), d * (-m), m * (-d), and (-m) * (-d) in 2's complement, and check the results.

3. (Due September 9)

a) Use bitwise exclusive "or" to encode a byte sequence, then show how you can decode it.

b) On an example, check that the procedure that use exclusive "or" to swap two numbers works correctly.

4. (Due September 16; if you do it in by September 11, you get extra credit) Programs 2.38 - 2.41 and 2.43. For extra credit: do Program 2.42. The corresponding .c files must be sent by email to t.magoc@gmail.com.

5. (Due September 16) use the algorithm that we had in class to show how a computer will compute 1/5; for extra credit: use the same Newton's method to design an algorithm for computing the square root.

6. (Due September 18) Similar to the Newton's method that we described in class for computing the square root, design a method for computing the cubic root.

7. (Due September 23) For extra credit: Similar to what we did in the class, produce an example of how the optimizing compiler will assign registers to intermediate results when computing a complex arithmetic expression. Take into account that repeating expressions should be only computed once.

8. (Due September 30) Similar to what we did in the class, follow the standard parsing algorithm to produce an example of how a compiler will transform an arithmetic expression into an assembler code.

9. (Due October 7 and 9) Write a program (in Java or in C, whichever you prefer) that takes an arithmetic expression (given as a string) and transforms it into a pseudo-assembly code, with register assignments and line numbers. This code must be presented as a file, with one operation per line. Your program should consist of the following 4 stages:

Pre-fetch variables into registers before using them. Make sure that once a register is not needed, it is released.

Parts 1 and 2 are due on October 7; parts 3 and 4 are due on October 9.

Example: For the expression a + b * cc:

For extra credit:

10. (Due October 21) In the class, we saw how a computer will compute the sum and the product of all the elements in the array -- in C, in literal pseudo-assembly code, and in the optimized (non-parallel) pseudo-assembly code. We then saw, on this example, how we can parallelize this code by using the fact that the CPU operations size is the size of a double (floating point number) and thus twice the size of an integer, so we can have two integer operations in parallel even on a single-core machine. As a homework, show how the same can be done for computing the largest value in an array: Assume that the max(a,b) operation is hardware supported and do not worry about finding out which element is the largest -- just produce the value of largest element.

11. (Due October 23) Use the error-correcting code that we described in class to correct the 16-bit word which was originally 0xABCD but in which the last bit of B has been corrupted. Reminder: For error correction, we need a "control sum" bit that describes the parity of all the bits; in other words, this bit is:

We also need 4 extra bits:

12. (Due October 28) Similarly to what we did in the class, run an example of how multi-core computers can exploit the layered representation of data in memory, e.g., the last bits of all the words in one place, the next bits of all the words in another place, etc. In the class, we showed this on the example of adding two arrays c[i] = a[i] + b[i], i=0, ..., n, and for simplicity we assumed that each array contains four 4-bit integers. Then, we first compute all the last bits by applying the exclusive "or" operation to the last bits of numbers a[i] and b[i]; then, we find the carry by using bitwise "and". After that, we similarly deal with the previous bits, etc. This speeds up computation because now we only perform bitwise operations which are very fast - instead of addition which is slower.

As a homework, run another example of such addition.

13. (Due October 30) On an example, explain why the standard algorithm for multiplying two matrices leads to many cache misses, and how a block-based algorithm can help.

14. (Due November 6) Explain, on the example of a code for doubling all the elements of an array, how a different order of loops affects the number of cache misses. Consider the following two alternatives: a standard alternative

  for (i = 0; i < n; i++)
    for (j = 0; j < n; j++)
      {a[i,j] *= 2;}
and an alternative with the reverse order
  for (j = 0; j < n; i++)
    for (i = 0; i < n; j++)
      {a[i,j] *= 2;}
Trace all the cache misses occurring during the run of each of these pieces of code and compare the results on the following toy example:

15. (Due November 6) On the example of the standard alternative program from Problem 14, trace all the writes from the cache back to memory for two possible strategies: write-through and write-back.

16. (Due November 6) In the traditional way of storing data on a disk of radius R, we start the data recording at a distance r < R from the center, and on each circular track located between distances r and R, we store the exact same amount of information. In class, we derived the formula describing how the total amount of information depends on r, and used this formula to derive the optimal value of r (it is r = R/2) and the amount of information which can be stored under this optimal arrangement (50%). Use this formula to estimate how much information can be stored if instead of the optimal value r = R/2, we use a smaller inner radius r = R/4.

17. (Due November 11) Run an example from p. 550 of the book that shows that a C compiler does not always generate an error message when the same variable is defined in two different programs. Design a similar example and show how it runs.

18. (Due November 13) Problems 8.10, 8.11, and 8.12 from the book.

19. (Due December 2) Suppose that you have run two programs, 5 times each. The first program ran for 0.1, 0.2, 0.3, 0.4, and 0.5 msec. The second program ran for 0.01, 1.0, 2.0, 3.0, and 4.0 msec. Based on this empirical data, which of these two programs is faster? Explain your answer.

20. (Due December 4) Suppose that in the memory, you have gaps of sizes 10, 20, 30, 20, and 15. Describe what will happen when you need to add blocks of size 15, 20, and 30; trace it for ther three algorithms that we studied in class: first fit, best fit, and next fit. For the next fit algorithm, assume that originally, we were between the 30 and the next 20 gaps, and that the memory is cyclical -- i.e., that after we reach the end of the memory, we go back to the beginning of it.