Final exam, CS 3335, Fall 2008
Date: Tuesday, December 9, 2008.
Name (please print): _______________________________________________________________________________
1. Binary numbers
- Take numbers 12 and 9, add and multiply them in binary and in
hex.
- Take numbers 12 and 9, transform
that into 2's complement (modulo 1024, to make sure you do not get overflow),
compute 12 + (-9), 9 + (-12), and 9 * (-12) in 2's
complement, and check the results.
4. Codes
- Security applications. Use bitwise exclusive "or" with the code
0x912 to encode the sequence 0x208,
then show how you can decode it.
- Error-correcting applications. Use the error-correcting code to
correct a
16-bit word which was originally 0x2008 but in which the first bit of 2 has
been corrupted.
5. Numerical computations.
- Show, step-by-step, how a
computer will compute 2/3. It is enough to trace the first two steps of
computing 1/3.
- Show, step-by-step, how a computer will compute the square root of 2. Again,
the first two steps are sufficient.
Explain the main idea behind these computations.
Explain how a computer computes other functions such as
sin(x), and why a similar idea cannot be used
to compute 1/x or the square root of x.
6. Parsing.
- Follow, step-by-step, the standard 2-stage
parsing algorithm (with a postfix expression as an intermediate result)
for the expression (a-b)/c + a/b; describe the resulting assembler code.
- For the expression (a-b)/c + a/b, show
how the optimizing compiler will assign registers to intermediate results.
Take into account that
repeating expressions should be only computed once.
7. Optimizing array operations.
- Describe how a (sequential) computer will optimize the computation of the
product of all the elements of a 1-D array:
- start with a C code;
- describe the resulting literal pseudo-assembly code;
- show how this literal code can be optimized:
by using array-related optimization techniques.
- Explain, based on the need to optimize array operations, why most data
types require a power of 2 bytes (2, 4, 8, etc.) to store.
8. Cash misses.
- Let us assume that we multiply two 4 x 4 matrices, and that a cache can
contain 2 blocks, each with 8 elements. Describe, step by step, all the cache
misses occurring when we run the standard algorithm for multiplying two
matrices, and all the cache misses occurring when we
run a block-based algorithm.
- Explain, step-by-step, on the example of a code for adding two 2 x 2
matrices, how a different order of loops affects the number of cache misses.
Assume that a cache has two blocks, with 2 elements each.
9. Parallelization.
- Assume that in the original program, 70% of the code is parallelizable, and
that you use a 14-processor system. What will be the resulting
speed-up?
- Explain, in detail, how the computation of the product of all the elements of
a 1-D array can be parallelized. Use an example of 4 processors and an array
of size 8.
10. Processes.
11. For extra credit: based on the need for optimization (efficiency),
explain:
- why C (and Java) uses lazy logical operations;
- why C is the most widely used language in high performance computing;
- why new processors support the operation a + b * c in hardware;
- why some algorithms "pad" the size of arrays to the nearest power of 2.