Traditional decision theory describes human behavior and human preferences in terms of utility functions. In the last decades, it was shown that in many economic situations, a reasonable description of the actual decisions can be found if we use a different approach: that of spectral risk measures. In each of these approaches, we first need to empirically find the corresponding function: the utility function in the traditional approach and the weighting function for spectral risk measures. Since both approaches provide a reasonable description of the same actual behavior (in particular, of the same actual economic behavior), it is desirable to be able, given a utility function, to find an appropriate weighting function (and vice versa). Some empirical rules for such a transition have been proposed; these rules are purely heuristic and approximate, and they are not theoretically justified. In the present paper, we recall how both the utility and the risk measure approaches can be reformulated in statistical terms, and use these reformulations to provide a statistically justified transition between utility and weighting functions.
File in pdf
Random testing can eliminate subjectiveness in constructing test data and increase the diversity of test data. However, one difficult problem is to construct test oracles that decide test results---test failures or successes. Assertions can be used as test oracles and are most effective when derived from formal specifications such as OCL constraints. If fully automated, random testing can reduce the cost of testing dramatically. In this paper we propose an approach for automating Java program testing by combining random testing and OCL. The key idea of our approach is to use OCL constraints as test oracles by translating them to runtime checks written in AspectJ. We realize our approach by adapting existing frameworks for translating OCL to AspectJ and assertion-based random testing. We evaluate the effectiveness of our approach through case studies and experiments. Our approach can detect errors in both implementations and OCL constraints and provide a practical means for using OCL in design and programming.
File in PDF and in Compressed Postscript
We propose to use symmetries as a general approach to maintaining different types of uncertainty, and we show how the symmetry approach can help, especially in economics-related applications.
File in pdf
Published in: Martine Ceberio (ed.), Abstracts of the Second Workshop on Constraint Programming and Decision Making CoProD'09, El Paso, Texas, November 9-10, 2009, pp. 56-60.
Using the problem of selecting the best location for a meteorological tower as an example, we show that in multi-objective optimization under constraints, the traditional weighted average approach is often inadequate. We also show that natural invariance requirements lead to a more adequate approach -- a generalization of Nash's bargaining solution.
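The contrast between the two selection criteria can be illustrated with a minimal sketch. It uses the plain Nash bargaining solution (maximize the product of gains over a status-quo point), not the generalization developed in the paper. The candidate sites, their scores, the weights, and the function names are all hypothetical and chosen only for illustration.

```python
from math import prod

def weighted_sum_choice(options, weights):
    """Pick the option with the largest weighted sum of objective values."""
    return max(options,
               key=lambda name: sum(w * v for w, v in zip(weights, options[name])))

def nash_product_choice(options, status_quo):
    """Pick the option with the largest product of gains over the status quo."""
    def gain_product(name):
        gains = [v - d for v, d in zip(options[name], status_quo)]
        # An option that does not improve on the status quo in every
        # objective is excluded from consideration.
        return prod(gains) if all(g > 0 for g in gains) else float("-inf")
    return max(options, key=gain_product)

# Hypothetical sites scored on two objectives (larger is better).
sites = {"A": (9.0, 1.0), "B": (5.0, 5.0), "C": (1.0, 9.0)}

by_weighted_sum = weighted_sum_choice(sites, weights=(0.6, 0.4))
by_nash_product = nash_product_choice(sites, status_quo=(0.0, 0.0))
```

With these made-up numbers, the weighted sum favors the lopsided site A, while the product of gains favors the balanced site B.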
File in pdf
Published in: Martine Ceberio (ed.), Abstracts of the Second Workshop on Constraint Programming and Decision Making CoProD'09, El Paso, Texas, November 9-10, 2009, pp. 20-23.
We show that in many application areas, including soft constraints, reasonable requirements of scale-invariance lead to polynomial (tensor-based) formulas for combining degrees (of certainty, of preference, etc.).
File in pdf
Published in: Martine Ceberio (ed.), Abstracts of the Second Workshop on Constraint Programming and Decision Making CoProD'09, El Paso, Texas, November 9-10, 2009, pp. 11-14.
In many practical situations, we must compute the value of an if-then expression f(x) defined as "if c(x) >= 0 then f+(x) else f-(x)", where f+(x), f-(x), and c(x) are computable functions. The value f(x) cannot be computed directly, since in general, it is not possible to check whether a given real number c(x) is non-negative or non-positive. Similarly, it is not possible to compute the value f(x) if the if-then function is discontinuous, i.e., when f+(x0) =/= f-(x0) for some x0 for which c(x0) = 0.
In this paper, we show that if the if-then expression is continuous, then we can effectively compute f(x).
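One standard way to see why continuity helps can be sketched as follows; this is an illustrative sketch, not necessarily the algorithm from the paper. It assumes that c, f+ and f- are given as hypothetical oracles taking (x, accuracy) and returning a value within that accuracy of the exact one, and that f+(x0) = f-(x0) whenever c(x0) = 0. The name `if_then` and the parameter `max_rounds` are assumptions made for the example.

```python
def if_then(c, f_plus, f_minus, x, eps, max_rounds=60):
    """Approximate f(x) = f_plus(x) if c(x) >= 0 else f_minus(x) within eps."""
    a = f_plus(x, eps / 4)
    b = f_minus(x, eps / 4)
    # If both branches nearly agree, either approximation is within eps
    # of f(x), whichever branch is the true one.
    if abs(a - b) <= eps / 2:
        return a
    # Otherwise the branches differ, so by continuity c(x) != 0, and a
    # sufficiently accurate approximation of c(x) reveals its sign.
    for k in range(1, max_rounds + 1):
        delta = 2.0 ** -k
        c_approx = c(x, delta)
        if c_approx > delta:       # guarantees c(x) > 0
            return a
        if c_approx < -delta:      # guarantees c(x) < 0
            return b
    raise RuntimeError("sign of c(x) not resolved within max_rounds")

# Example: f(x) = |x| written as "if x >= 0 then x else -x".
# The oracles below happen to be exact and ignore the requested accuracy.
value_at_zero = if_then(lambda x, acc: x, lambda x, acc: x,
                        lambda x, acc: -x, 0.0, 1e-6)
```

The sign of c(x) is never tested directly: the procedure either resolves the sign from an approximation, or concludes that the choice of branch does not matter at the requested accuracy.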
File in pdf
Published in Applied Mathematical Sciences, 2010, Vol. 4, pp. 431-434.
Traditionally, fuzzy logic used non-standard notations like m1/x1 + ... + mn/xn for a function that attains the value m1 at x1, ..., and the value mn at xn. In this paper, we provide an algebraic explanation for these notations.
File in pdf
In intuitionistic fuzzy sets, there is a natural symmetry between degrees of truth and falsity. As a result, for such sets, natural similarity measures are symmetric relative to an exchange of true and false values. It has been recently shown that among such measures, the most intuitively reasonable are the ones which are also symmetric relative to an arbitrary permutation of degrees of truth, falsity, and uncertainty. This intuitive reasonableness leads to a conjecture that such permutations are not simply mathematical constructions, that these permutations also have some intuitive sense. In this paper, we show that each such permutation can indeed be represented as a composition of intuitively reasonable operations on truth values.
File in pdf
Capturing provenance about artifacts produced by distributed scientific processes is a challenging task. For example, one approach to facilitate the execution of a scientific process in distributed environments is to break down the process into components and to create workflow specifications to orchestrate the execution of these components. However, capturing provenance in such an environment, even with the guidance of orchestration logic, is difficult because of important details that may be hidden by the component abstractions. In this paper, we show how to use abstract workflows to systematically enhance scientific processes to capture provenance at appropriate levels of detail. Abstract workflows lack the specification of an orchestration logic to execute a scientific process, and instead, are intended to document scientific processes as understood by scientists. Hence, abstract workflows can be specifically designed to capture the details of scientific processes that are relevant to the scientist with respect to provenance. In addition, abstract workflows are coupled with a representation of provenance that can accommodate distributed provenance-generating source code. We also show how the approach described in this paper has been used for capturing provenance for scientific processes in the Earth science, environmental science and solar physics domains.
File in pdf
Published in: Sergei Klioner, P. Ken Seidelmann, and Michael H. Soffel (eds.), Relativity in Fundamental Astronomy, Proceedings of IAU Symposium No. 261, Cambridge University Press, Cambridge, UK, 2009, pp. 56-61.
By the early 1970s, the improved accuracy of astrometric and time measurements enabled researchers not only to experimentally compare relativistic gravity with the Newtonian predictions, but also to compare different relativistic gravitational theories (e.g., the Brans-Dicke Scalar-Tensor Theory of Gravitation). For this comparison, Kip Thorne and others developed the Parameterized Post-Newtonian Formalism (PPN), and derived the dependence of different astronomically observable effects on the values of the corresponding parameters.
Since then, all the observations have confirmed General Relativity. In other words, the question of which relativistic gravitation theory is in the best accordance with the experiments has been largely settled. This does not mean that General Relativity is the final theory of gravitation: it needs to be reconciled with quantum physics (into quantum gravity), it may also need to be reconciled with numerous surprising cosmological observations, etc. It is therefore reasonable to prepare an extended version of the PPN formalism, that will enable us to test possible quantum-related modifications of General Relativity.
In particular, we need to include the possibility of violating fundamental principles that underlie the PPN formalism but that may be violated in quantum physics, such as scale-invariance, T-invariance, P-invariance, energy conservation, spatial isotropy, etc. In this paper, we present the first attempt to design the corresponding extended PPN formalism, with the (partial) analysis of the relation between the corresponding fundamental physical principles.
Original file in pdf,
Updated version in pdf
In the early 1920s, Pavel Urysohn proved his famous lemma (sometimes referred to as "the first non-trivial result of point set topology"). Among other applications, this lemma was instrumental in proving that under reasonable conditions, every topological space can be metrized.
A few years before that, in 1919, a complex mathematical theory was experimentally proven to be extremely useful in the description of real world phenomena: namely, during a solar eclipse, General Relativity theory -- that uses pseudo-Riemann spaces to describe space-time -- was (spectacularly) experimentally confirmed. Motivated by this success, Urysohn started working on an extension of his lemma and of the metrization theorem to (causality-)ordered topological spaces and corresponding pseudo-metrics. After Urysohn's early death in 1924, this activity was continued in Russia by his student Vadim Efremovich, Efremovich's student Revolt Pimenov, and by Pimenov's students (and also by H. Busemann in the US and by E. Kronheimer and R. Penrose in the UK). By the 1970s, reasonably general space-time versions of Urysohn's lemma and metrization theorem had been proven.
However, these 1970s results are not constructive. Since one of the main objectives of this activity is to come up with useful applications to physics, we definitely need constructive versions of these theorems -- versions in which we not only claim the theoretical existence of a pseudo-metric, but we also provide an algorithm enabling the physicist to generate such a metric based on empirical data about the causality relation. An additional difficulty here is that for this algorithm to be useful, we need a physically relevant constructive description of a causality-type ordering relation.
In this paper, we propose such a description and show that for this description, a combination of the existing constructive ideas with the known (non-constructive) proof leads to successful constructive space-time versions of Urysohn's lemma and of the metrization theorem.
File in pdf
Not all mathematical solutions to physical equations are physically meaningful: e.g., if we reverse all the molecular velocities in a breaking cup, we get pieces self-assembling into a cup. The resulting initial conditions are "degenerate": once we modify them, self-assembly stops. So, in a physical solution, the initial conditions must be "non-degenerate".
A challenge in formalizing this idea is that it depends on the representation. Example 1: we can use the Schroedinger equation to represent the potential field V(x)=F(f,...) as a function of the wave function f(x,t) and its derivatives. The new equation dF/dt=0 is equivalent to the Schroedinger equation, but now V(x) is in the initial conditions.
Example 2: for a general scalar field f, we describe a new equation which is satisfied if f satisfies the Euler-Lagrange equations for some Lagrangian L. So, similarly to Wheeler's cosmological "mass without mass", we have "equations without equations".
Thus, when formalizing physical equations, we must not only describe them in a mathematical form, we must also select one of the mathematically equivalent forms.
File in pdf
The entropy constancy principle describes the tendency for information in language to be conveyed at a constant rate. We explore the possible role of this principle in spoken dialog, using the "summed entropy rate," that is, the sum of the entropies of the words of both speakers per second of time. Using the Switchboard corpus of casual dialogs and a standard ngram language model to estimate entropy, we examine patterns in entropy rate over time and the distribution of entropy across the two speakers. The results show effects that can be taken as support for the principle of constant entropy, but also indicate a need for better language models and better techniques for estimating non-lexical entropy.
File in pdf
Published in the Proceedings of the 9th International Symposium on Measurement Technology and Intelligent Instruments ISMTII'2009, St. Petersburg, Russia, June 28 - July 2, 2009, pp. 4-132 - 4-136.
The possibility of using fuzzy variables for describing measurands and their error characteristics is investigated. The elementary arithmetic operations within the limits of such representation are considered.
File in pdf
In many practical applications, it turns out to be useful to use the notion of fuzzy transform: once we have non-negative functions A1(x), ..., An(x), with A1(x) + ... + An(x) = 1, we can then represent each function f(x) by the coefficients Fi which are defined as the ratio of two integrals: of f(x) * Ai(x) and of Ai(x). Once we know the coefficients Fi, we can (approximately) reconstruct the original function f(x) as F1 * A1(x) + ... + Fn * An(x). The original motivation for this transformation came from fuzzy modeling, but the transformation itself is a purely mathematical transformation. Thus, the empirical successes of this transformation suggest that it can also be interpreted in more traditional (non-fuzzy) mathematics.
Such an interpretation is presented in this paper. Specifically, we show that fuzzy transform has a natural probabilistic interpretation -- related to the known interpretation of fuzzy sets as equivalence classes of random sets. We also show that a similar interpretation is possible for fuzzy control techniques.
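The definition of the coefficients Fi and of the reconstruction can be made concrete with a small numerical sketch. It assumes triangular functions Ai(x) on equally spaced nodes (one common choice that satisfies A1(x) + ... + An(x) = 1 on the interval) and approximates both integrals by midpoint sums. The function names, the number of nodes, and the test function sin(x) are illustrative assumptions, not taken from the paper.

```python
import math

def triangular_basis(nodes):
    """Triangular functions centered at equally spaced nodes; they sum to 1."""
    h = nodes[1] - nodes[0]
    return [lambda x, c=c: max(0.0, 1.0 - abs(x - c) / h) for c in nodes]

def fuzzy_transform(f, basis, a, b, samples=2000):
    """Coefficients Fi = (integral of f * Ai) / (integral of Ai) over [a, b]."""
    step = (b - a) / samples
    grid = [a + (j + 0.5) * step for j in range(samples)]  # midpoint rule
    coeffs = []
    for A in basis:
        numerator = sum(f(x) * A(x) for x in grid)
        denominator = sum(A(x) for x in grid)
        coeffs.append(numerator / denominator)  # the common factor step cancels
    return coeffs

def inverse_fuzzy_transform(coeffs, basis, x):
    """Approximate reconstruction F1 * A1(x) + ... + Fn * An(x)."""
    return sum(F * A(x) for F, A in zip(coeffs, basis))

a, b, n = 0.0, math.pi, 33
nodes = [a + i * (b - a) / (n - 1) for i in range(n)]
basis = triangular_basis(nodes)
coeffs = fuzzy_transform(math.sin, basis, a, b)
approx = inverse_fuzzy_transform(coeffs, basis, 1.0)  # compare with math.sin(1.0)
```

A constant function is reproduced exactly, since every Fi then equals that constant and the Ai(x) sum to 1.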
File in pdf and in Compressed Postscript
To appear in Neural Network World
Fuzzy transform is a new type of function transform that has been successfully used in different applications. In this paper, we provide a broad perspective on fuzzy transform. Specifically, we show that fuzzy transform naturally appears when, in addition to measurement uncertainty, we also encounter another type of localization uncertainty: that the measured value may come not only from the desired location x, but also from the nearby locations.
Original file in pdf
Updated version in pdf
For a numerical physical quantity v, because of the measurement imprecision, the measurement result V is, in general, different from the actual value v of this quantity. Depending on what we know about the measurement uncertainty d = V - v, we can use different techniques for dealing with this imprecision: probabilistic, interval, etc.
When we measure the values v(x) of physical fields at different locations x (and/or different moments of time), then, in addition to the same measurement uncertainty, we also encounter another type of localization uncertainty: that the measured value may come not only from the desired location x, but also from the nearby locations.
In this paper, we discuss how to handle this additional uncertainty.
File in pdf and in Compressed Postscript
Published in: Dmitri A. Viattchenin (ed.), Developments in Fuzzy Clustering, Vever Publ., Minsk, Belarus, 2009, pp. 10-35.
In many application areas, there is a need for clustering, and there is a need to take fuzzy uncertainty into account when clustering. Most existing fuzzy clustering techniques are based on the idea that an object belongs to a certain cluster if this object is close to a typical object from this cluster. In some application areas, however, this idea does not work well. One example of such an application is clustering in education that is used to convert a detailed number grade into a letter grade.
In such applications, it is more appropriate to use clustering techniques which are based on a different idea: that an object tends to belong to the same cluster as its nearest neighbor. In this paper, we explain the relationship between this idea and dynamical systems, and we discuss how fuzzy uncertainty can be taken into account in this approach to clustering.
File in pdf and in Compressed Postscript
To appear in: Gordana Dodig-Crnkovic and Mark Burgin (eds.), Information and Computation, World Scientific, to appear.
In this paper, we analyze the problem of prediction in physics from the computational viewpoint. We show that physical paradigms like Laplace determinism, statistical determinism, etc., can be naturally explained by this computational analysis. In our explanations, we use notions of Algorithmic Information Theory such as Kolmogorov complexity and algorithmic randomness, as well as novel, more physics-oriented variants of these notions.
File in pdf and in Compressed Postscript
As is well known, the Huffman algorithm is remarkably simple, and is a wonderfully illustrative example of the greedy method in algorithm design. However, the Huffman problem, which is to design an optimal binary character code (or an optimal binary tree with weighted leaves), is intrinsically technical, and its specification is ill-suited for students with modest mathematical sophistication.
This difficulty is circumvented by introducing an alternative 'precursor' problem that is easy to understand, and where this understanding can lead to student-devised solutions: how to merge k sorted lists of varying length together as efficiently as possible. Once students have solved this problem, they are better prepared to understand that the Huffman problem can be trivially reduced to it, and thus why their merging algorithm solves it. Even the correctness argument is simplified by this approach.
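The precursor problem can be sketched in a few lines. The sketch assumes that merging two lists of lengths m and n costs m + n element moves (a common cost model, stated here as an assumption) and uses the greedy rule of always merging the two shortest remaining lists, which mirrors the Huffman construction. The function name `merge_all` and the sample lists are illustrative.

```python
import heapq
from itertools import count

def merge_all(lists):
    """Merge sorted lists two at a time, always merging the two shortest.

    Returns the fully merged list and the total cost, where merging lists
    of lengths m and n is assumed to cost m + n.
    """
    tie = count()  # tie-breaker so that the lists themselves are never compared
    heap = [(len(lst), next(tie), list(lst)) for lst in lists]
    heapq.heapify(heap)
    total_cost = 0
    while len(heap) > 1:
        len_a, _, a = heapq.heappop(heap)
        len_b, _, b = heapq.heappop(heap)
        merged = list(heapq.merge(a, b))  # standard two-way merge
        total_cost += len_a + len_b
        heapq.heappush(heap, (len(merged), next(tie), merged))
    return (heap[0][2] if heap else []), total_cost

merged, cost = merge_all([[5], [1, 9], [2, 3, 7], [0, 4, 6, 8]])
```

For list lengths 1, 2, 3, 4 the greedy order gives partial costs 3, 6, and 10, i.e., a total of 19; replacing list lengths with character frequencies turns the same loop into the Huffman tree construction.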
File in pdf
Published in the Proceedings of the 2009 Singapore Economic Review Conference, Singapore, August 6-8, 2009.
We provide theoretical justifications for the empirical successes of (1) the asymmetric heteroskedasticity models of stochastic volatility in mathematical finance and (2) Wang's distorted probability risk measures in actuarial and investment sciences, using a unified framework of symmetry groups.
File in pdf and in Compressed Postscript
Published in Applied Mathematical Sciences, 2009, Vol. 3, No. 47, pp. 2335-2342.
In many practical situations in which the only information we have about the quantity x is that its value is within an interval [x,X], a reasonable estimate for this quantity is the geometric mean of the bounds, i.e., the square root of the product x*X. In this paper, we provide a new justification for this geometric mean heuristic.
File in pdf and in Compressed Postscript
The large scale of current and next-generation massively parallel processing (MPP) systems presents significant challenges related to fault tolerance. For applications that perform periodic checkpointing, the choice of the checkpoint interval, the period between checkpoints, can have a significant impact on the execution time of the application and the number of checkpoint I/O operations performed by the application. These two metrics determine the frequency of checkpoint I/O operations performed by the application, and thereby, the contribution of the checkpoint operations to the I/O bandwidth demand made by the application. In a computing environment where there are concurrent applications competing for access to the network and storage resources, the I/O demand of each application is a crucial factor in determining the throughput of the system. Thus, in order to achieve a good overall system throughput, it is important for the application programmer to choose a checkpoint interval that balances the two opposing metrics - the number of checkpoint I/O operations and the application execution time. Finding the optimal checkpoint interval that minimizes the wall clock execution time has been a subject of research over the last decade. In this paper, we present a simple, elegant, and accurate analytical model of a complementary performance metric - the aggregate number of checkpoint I/O operations. We model this and present the optimal checkpoint interval that minimizes the total number of checkpoint I/O operations. We present extensive simulation studies that validate our analytical model. Insights provided by this model, combined with existing models for wall clock execution time, help application programmers make a well-informed choice of checkpoint interval, leading to an appropriate trade-off between execution time and number of checkpoint I/O operations.
We illustrate the existence of such propitious checkpoint intervals using parameters of four MPP systems, SNL's Red Storm, ORNL's Jaguar, LLNL's Blue Gene/L (BG/L), and a theoretical Petaflop system.
File in pdf
To appear in International Journal of Intelligent Technologies and Applied Statistics
One of the most widely used (and most successful) methods for pricing financial and insurance instruments under risk is the Wang transform method. In this paper, we provide a new explanation for the empirical success of Wang's method -- by providing a new simpler justification for the Wang transform.
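For reference, the Wang transform in its usual form maps a probability p to g(p) = Phi(Phi^{-1}(p) + lambda), where Phi is the standard normal cumulative distribution function. The minimal sketch below uses only Python's standard library; the sign convention for lambda and the function name `wang_transform` are assumptions made for illustration.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution: mean 0, sigma 1

def wang_transform(p, lam):
    """Distorted probability g(p) = Phi(Phi^{-1}(p) + lam)."""
    # The endpoints are fixed points of the transform; they are handled
    # separately because inv_cdf is defined only for 0 < p < 1.
    if p <= 0.0:
        return 0.0
    if p >= 1.0:
        return 1.0
    return _STD_NORMAL.cdf(_STD_NORMAL.inv_cdf(p) + lam)

distorted = wang_transform(0.5, 1.0)  # equals Phi(1), about 0.8413
```

For lambda = 0 the transform is the identity, and for lambda > 0 every probability strictly between 0 and 1 is shifted upward.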
File in pdf and in Compressed Postscript
Published in Proceedings of the 2009 World Congress of the International Fuzzy Systems Association IFSA, Lisbon, Portugal, July 20-24, 2009, pp. 1264-1269.
One of the most efficient techniques for processing interval and fuzzy data is a Monte-Carlo type technique of Cauchy deviates that uses Cauchy distributions. This technique is mathematically valid, but somewhat counterintuitive. In this paper, following the ideas of Paul Werbos, we provide a natural neural network explanation for this technique.
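The basic idea of the Cauchy deviates technique can be illustrated with a simplified sketch. It assumes that f is (approximately) linear over the input intervals [xi - Di, xi + Di], in which case simulated Cauchy-distributed deviations of the inputs produce a Cauchy-distributed deviation of the output whose scale parameter is the desired bound D on the output. Practical versions rescale the deviates to stay within the linearity range and estimate the scale by maximum likelihood; here the median of the absolute deviations is used as a simpler estimator. The function name, the sample size, and the test function are illustrative assumptions.

```python
import math
import random
import statistics

def cauchy_deviates_bound(f, x, deltas, samples=20000, seed=0):
    """Estimate the half-width of the range of f over the box x[i] +/- deltas[i]."""
    rng = random.Random(seed)
    y0 = f(x)
    abs_diffs = []
    for _ in range(samples):
        # tan(pi * (r - 0.5)) with uniform r is a standard Cauchy deviate.
        shifted = [xi + di * math.tan(math.pi * (rng.random() - 0.5))
                   for xi, di in zip(x, deltas)]
        abs_diffs.append(abs(f(shifted) - y0))
    # For a Cauchy variable with scale D, the median of |value| equals D.
    return statistics.median(abs_diffs)

def example_f(v):
    return 2.0 * v[0] - 3.0 * v[1] + v[2]

estimate = cauchy_deviates_bound(example_f, [1.0, 2.0, 3.0], [0.1, 0.2, 0.05])
```

For this linear example the exact bound is 2 * 0.1 + 3 * 0.2 + 1 * 0.05 = 0.85, and the estimate should be close to it; the number of calls to f does not depend on the number of inputs.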
Original file in pdf and
in Compressed Postscript
Updated version in pdf and
in Compressed Postscript
One of the main potential applications of uncertainty in computations is quantum computing. In this paper, we show that the success of quantum computing can be explained by the fact that quantum states are, in effect, tensors.
File in pdf and in Compressed Postscript
Short version published in the Proceedings of
the 28th North American
Fuzzy Information Processing Society Annual Conference NAFIPS'09,
Cincinnati, Ohio, June 14-17, 2009; full version published in
In a typical class, we have students at different levels of knowledge, students
with different abilities to learn the material. In the ideal world, we should
devote unlimited individual attention to all the students and make sure that
everyone learns all the material. In real life, our resources are finite. Based
on this finite amount of resources, what is the best way to distribute efforts
between different students?
Even when we know the exact way each student learns, the answer depends on what
is the objective of teaching the class. This can be illustrated by two extreme
examples: If the objective is to leave no student behind, then in the optimal
resource arrangement all the effort goes to weak students who are behind, while
more advanced students get bored. If the objective is to increase the school's
rating by increasing the number of graduates who are accepted at top
universities, then all the effort should go to the advanced students while weak
students fail.
An additional difficulty is that in reality, we do not have exact information
about the cognitive ability of each student; there is a large amount of
uncertainty. In this talk, we analyze the problem of
optimal resource distribution under uncertainty.
Original file in pdf and
in Compressed Postscript
Published in the Proceedings of
the 28th North American
Fuzzy Information Processing Society Annual Conference NAFIPS'09,
Cincinnati, Ohio, June 14-17, 2009.
To avoid crisis developments, it is important to make financial decisions based
on models which correctly predict the probabilities of large-scale economic
fluctuations. At present, however, most financial decisions are based on Gaussian
random-walk models, models which are known to underestimate the probability of
such fluctuations. There exist better empirical models for describing these
probabilities, but economists are reluctant to use them since these empirical
models lack convincing theoretical explanations. To enhance financial stability
and avoid crisis situations, it is therefore important to provide theoretical
justification for these (more) accurate empirical models. Such a justification
is provided in this paper.
Original file in pdf and
in Compressed Postscript
Published in the Proceedings of
the 28th North American
Fuzzy Information Processing Society Annual Conference NAFIPS'09,
Cincinnati, Ohio, June 14-17, 2009.
In engineering design problems, we want to make sure that a certain
quantity c of the designed system lies within given bounds -- or
at least that the probability of this quantity to be outside these
bounds does not exceed a given threshold. We may have several such
requirements -- thus the requirement can be formulated as bounds
on the cumulative
distribution function F(x) of the quantity c; such bounds are
known as a p-box.
The value of the desired quantity c depends on the design
parameters a and the parameters b characterizing the
environment: c=f(a,b). To achieve the design goal, we need to find
the design parameters a for which the distribution F(x) for
c=f(a,b) is within the given bounds for all possible values of the
environmental variables b. The problem of computing such a is
called backcalculation. For b, we also have ranges with
different probabilities -- i.e., also a p-box. Thus, we have a
backcalculation problem for p-boxes.
For p-boxes, there exist efficient algorithms for finding a design
a that satisfies the given constraints. The next natural question
is to find a design that satisfies additional constraints: on the
cost, on the efficiency, etc. In this paper, we prove that without expert
knowledge, the
problem of finding such a design is computationally difficult
(NP-hard). We show that this problem is NP-hard already in the
simplest possible linearized case, when the dependence c=f(a,b) is
linear. Thus, expert (fuzzy) knowledge is needed to solve design problems under
uncertainty.
Original file in pdf and
in Compressed Postscript
To appear in International Journal of Intelligent Technologies and Applied
Statistics
Most existing econometric models such as ARCH(q) and GARCH(p,q) take
into account heteroskedasticity (non-stationarity) of time series.
However, the original ARCH(q) and GARCH(p,q) models do not take into
account the asymmetry of the market's response to positive and to
negative changes. Several heuristic modifications of ARCH(q) and
GARCH(p,q) models have been proposed that take this asymmetry into
account. These modifications turned out to be very adequate and
efficient in describing the econometric time series. In this paper,
we propose a justification of these heuristic modifications -- and
thus, an explanation of their empirical efficiency.
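As a concrete illustration of what an asymmetric modification looks like, the sketch below implements the variance recursion of the GJR-GARCH(1,1) model, one well-known asymmetric variant; the abstract does not say which modifications the paper covers, so this choice is an assumption. In this model, the conditional variance is omega + (alpha + gamma * I) * r^2 + beta * (previous variance), where I = 1 if the previous return r was negative and I = 0 otherwise. The parameter values and the function name are hypothetical.

```python
def gjr_garch_variances(returns, omega, alpha, gamma, beta, initial_variance):
    """Conditional variances of a GJR-GARCH(1,1) model for a return series."""
    variances = [initial_variance]
    for r in returns:
        # A negative shock adds the extra term gamma * r^2; a positive
        # shock of the same size does not.
        leverage = gamma if r < 0 else 0.0
        variances.append(omega + (alpha + leverage) * r * r + beta * variances[-1])
    return variances

params = dict(omega=1e-5, alpha=0.05, gamma=0.10, beta=0.90, initial_variance=1e-4)
after_gain = gjr_garch_variances([0.02], **params)[-1]
after_loss = gjr_garch_variances([-0.02], **params)[-1]
```

With gamma = 0 the recursion reduces to the symmetric GARCH(1,1) model, in which a gain and a loss of the same size have the same effect on the next variance.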
File in pdf and
in Compressed Postscript
A disturbing consequence of traditional thermodynamics is the
possibility of heat death, when the Universe arrives at the
state with the largest possible value of the entropy and all
processes stop. In this paper, we show that one possible way to
avoid this consequence is to consider situations in which the
entropy never attains its maximum -- and thus, the heat death state
is not possible. We show that such situations can have physical
sense -- e.g., they naturally appear in bootstrap models.
File in pdf and
in Compressed Postscript
Updated version in pdf and
in Compressed Postscript
Technical Report UTEP-CS-09-04, January 2009
Updated version UTEP-CS-09-04a, March 2009
Empirical Formulas for Economic Fluctuations:
Towards A New Justification
Tanja Magoc and Vladik Kreinovich
Updated version in pdf and
in Compressed Postscript
Technical Report UTEP-CS-09-03, January 2009
Updated version UTEP-CS-09-03a, April 2009
Expert Knowledge Is Needed for Design under Uncertainty:
For p-Boxes, Backcalculation is, in General, NP-Hard
Vladik Kreinovich
Updated version in pdf and
in Compressed Postscript
Technical Report UTEP-CS-09-02, January 2009
Asymmetric Heteroskedasticity Models: A New Justification
Songsak Sriboonchitta and Vladik Kreinovich
Technical Report UTEP-CS-09-01, January 2009
A Possible Way to Avoid Heat Death
Nitaya Buntao,
Narunchara Katemee, and
Vladik Kreinovich