Final exam, CS 5353, Fall 2008
Date: Tuesday, December 9, 2008.
Name (please print):
___________________________________________________________________________________
1. Motivations.

Explain what cyberinfrastructure is, why it is useful, and why it did not appear
many years ago, when computer communications were slower.
 Explain why we cannot directly measure
all physical quantities of interest to us, and why we need data processing. Give a
real-life example.
2. Statistical foundations.
Assume that the probability
p(A) of the event A is 0.6, and the probability p(B) of the event B is 0.5.
 What is the probability p(A & B) that both A and B hold if these
events are independent? Provide the derivation of the corresponding formula.
 What is the range of possible values of p(A & B)
when we do not have any information about the dependence between A and B? Draw
examples illustrating the possibilities of the smallest and the largest values
from this range.
 What is the probability p(A \/ B) that at least one of the events
A or B holds if these events are independent?
Provide the derivation of the corresponding formula.
 What is the range of
possible values of p(A \/ B) when we do not have any information about the
dependence between A and B? Draw examples illustrating the possibilities of the
smallest and the largest values from this range.
3. Mathematical techniques.
 What is the Maximum Likelihood Method? Provide at least two examples of
where this method is used. Describe one of these examples in detail.
 What is bisection? Give a numerical example
(two steps are enough) of how bisection can be used to find the solution to an
equation, i.e., the value x for which f(x) = 0.
Provide an example where both the Maximum Likelihood Method and bisection are used in
estimating the uncertainty of the result of data processing.
4. Data fusion.
Let us assume that we have measured the same quantity with
two different measurement instruments. The result of the first measurement is
1.2, the result of the second
measurement is 0.8. Combine these two results into a single "fused" value, in
the following two situations:
 Probabilistic situation. In this situation, we assume that
the measurement errors of both measuring instruments have 0 mean;
the first instrument has standard deviation 0.2, the second has standard deviation
0.3.
 Interval situation. In this situation, we have no information about the
probabilities, we only know that the measurement error of the first instrument
does not exceed 0.2, and the measurement error of the second instrument does not
exceed 0.3.
Where do the formulas that you used come from? (No need for detailed derivations;
just explain the main ideas.)
5. Linearization.
 Illustrate, on the example of the function
y = x_{1}^{2} - x_{2},
with the measurement results
x_{1} = 1.0 and x_{2} = 2.0 and measurement errors
Δx_{1} = 0.1 and Δx_{2} = 0.2,
what the error Δy will be as estimated by the linearization method, and how
this estimated error compares with the actual value of this error. Compare
the analytical expressions for the corresponding partial derivatives with the
results of numerical differentiation.
 For the same function and the same measurement results, assume that we
only know the bounds Δ_{1} = 0.1 and Δ_{2} = 0.2 on the
measurement errors. Use the monotonicity of the function f to find the exact
range of the corresponding value y, and compare this range with the results of
applying a linearized formula for the bound Δ on the error Δy of
the result of data processing.
6. Uncertainty in data processing: computational aspects.
For the formulas for
computing the uncertainty of the result of data processing,
explain how the computational complexity (= number of computational steps)
depends on the choice of the
parameters h_{i} used in numerical differentiation, and what is the choice
for which the computational complexity is the smallest:
 for the case of statistical uncertainty, when we know the standard deviations
σ_{i} of the corresponding measurement errors, and
 for the case of interval uncertainty, when we only know the upper bounds
Δ_{i} of the corresponding measurement errors.
7. Uncertainty in data processing: Monte-Carlo method.
Explain why the Monte-Carlo method is useful in estimating the uncertainty of the
result of data processing, and for what number of inputs it is useful:
 for the case of statistical uncertainty, and
 for the case of interval uncertainty.
Provide a numerical example of the number of iterations that are
needed to achieve a given accuracy.
8. Estimating reliability and trust: Monte-Carlo method.
 Explain why the Monte-Carlo method is useful in estimating the degree of trust.
 Explain why, for very
reliable components, we cannot directly use the Monte-Carlo method and instead need
rescaling. Describe the main idea of the rescaling
and how it helps.
9. Reliability.
For each of the following two
cases:
 the case when f trusts t with probability p_{1} = 1 - Δp_{1}
and t trusts s with probability p_{2} = 1 - Δp_{2};
 the case when f has two reasons for trusting s: with
probability p_{1} = 1 - Δp_{1} and with probability
p_{2} = 1 - Δp_{2};
estimate two values:
 the worst-case
probability Δp_{w} that f does not trust s, and
 the probability Δp_{i} that f does not trust s under the
independence assumption.
10. Describe the contents of one of the class projects, other than
your own project.