CS 4365/CS 5354 Data Processing Under Security and Privacy
Summer 2016, Test 2

Name: __________________________________________________________

1-2. In class, we learned the following algorithm for eliminating outliers:

1. Use this algorithm to eliminate outliers from the following database:

The algorithm is straightforward:

2. Explain how outlier elimination is used in computer security.

3. In class, we derived a general formula for the optimal sizes Δi of a privacy-enhancing box for which the inaccuracy in the resulting estimation of a statistical characteristic C is the smallest possible under the condition of k-anonymity: Δi = c / ai, where ai is the absolute value of the partial derivative of C with respect to the i-th variable xi, and c = (1/2) * (k * a1 * a2 * ...) / ρ(x) , where ρ(x) is the data density, i.e., number of records per unit volume.

Use the general formula to find the sizes of the cell that provides the smallest possible inaccuracy in computing the covariance between weight and age for adults. Assume that:

For covariance, the derivative with respect to x1 is equal to (x2 − μ2) / N and the derivative with respect to x2 is equal to (x1 − μ1) / N, where μi are the sample means of the corresponding values.

4. What if in Problem 3, in addition to k-anonymity, we also require l-diversity, with l = 2, and the thresholds ε1 = 0.2 and ε2 = 0.01?

Reminder:

5. What percentage of privacy do we lose if someone detects the first digit x in the age xy? the second digit y? Assume that the age is between 20 and 90.

6. Based on the Problem 5, it may look like the last decimal digit -- the remainder of dividing a number by 10 -- is irrelevant, and should not be protected much. Explain that it is not so: what if someone knows the remainder by 10 and the remainder in the hex system (when divided by 16) -- can we then uniquely determine the adult age (from 20 to 90)? Explain your answer.

7. Use the efficient raising-to-the-power algorithm to compute 521 mod 11. Where is this algorithm used in RSA coding?

8-9. Use Euclid's algorithm to compute the greatest common divisor gcd(11, 21). Then find a number d for which d * 11 mod 21 = 1. Where is this used in RSA coding?

10. Briefly describe your project for this class, and what you have done so far for this project.