2. Why 2's complement is used to represent integers in contrast to more natural representations -- because this way, the same operation can be used to add positive and negative numbers, and any branching presents pre-fetching and thus, slows down computations.
3. Why i++ is a separate operation -- because the main time in an arithmetic operation is spend on fetching; the general addition a + b means fetching two numbers; to speed up, most computers have a special hardware supported operation for a + 1 which has only one number to fetch.
4. Why most data types require a power of 2 bytes to store: there are 4-byte types, 8-byte types, but not 6-byte types -- because the address of the i-th element a[i] of an array is a[0] + i * size. When the size is a power of two, we can use shift to compute i * size, and shift is much faster than multiplication.
6. Why bitwise operations are used and implemented in C even those they are not very common -- because they are much faster, can be done, in parallel, in 1 bit cycle.
7. Why masking is used to extract a short number from a long register and not a shift -- because masking is a bit-wise operation and is thus much faster (can be done in 1 bit cycle, while shift requires a bit-after-bit transition).
8. Why is exclusive "or" implemented at all, it is rarely used in reasoning -- because it is there anyway as a part of the adder, this is how two bits are added (and the carry is computed by using bitwise "and").
9. Why is bitwise exclusive "or" used in cryptography -- because it is a bitwise operation and thus, very fast.
10. Why is a bitwise swap used in compilers x = x ^ y, y = x ^ y, x = x ^ y, instead of the usual swap -- because there is no need to create a new location and thus, we can fit more operations into registers, the faster part of the computer memory.
11. Why C (and Java) uses lazy logical operation -- to save time: e.g., if "a" is false, then a && b is false, so there is no need to compute b.
12. Why compilers often do not implement a[i] as a[0] + i * size when implementing for-loops -- since it is faster to compute the next address as the previous one plus size, then we only need addition, no shifts, no multiplications, and addition is much faster, since a multiplication is several additions.
13. Why computers use the same algorithms as we use for addition and multiplication but use a completely different algorithm for division -- because the traditional algorithm for division requires branching, and branching drastically slows down computations.
14. Why C is the most widely used language in high performance computing -- because the purpose of high performance computing is to speed up computations, and C is the fastest language.
15. Why new processors support the operation a + b * c in hardware -- because it speeds up matrix multiplication, one of the most time-consuming computational operations.
16. Why some algorithms "pad" the size of arrays to the nearest power of 2: e.g., from 13, 14, and 15 to 16, etc. -- because, e.g., going from a[0,0] to a[1,0] requires a shift by n * size_of_an_element, where n is the size of the matrix; if n is a power of two, then we replace multiplication with a much faster shift.