Computer Science Department

Abstracts of 2011 Reports

Technical Report UTEP-CS-11-65, December 2011

Efficient Approximation for Security Games with Interval Uncertainty

Christopher Kiekintveld and Vladik Kreinovich

There is an increasing number of applications of security games. One of the key challenges for this field going forward is to address model uncertainty and the robustness of game-theoretic solutions. Most existing methods for dealing with payoff uncertainty are Bayesian; they lead to NP-hard problems and have difficulty scaling to very large instances. In this work, we consider an alternative approach based on interval uncertainty. For a variant of security games with interval uncertainty, we introduce a polynomial-time approximation algorithm that can compute very accurate solutions within a given error bound.

Technical Report UTEP-CS-11-64, December 2011

Updated version UTEP-CS-11-64a, March 2012

Assessment of Functional Impairment in Human Locomotion: A Fuzzy-Motivated Approach

Murad Alaqtash, Thompson Sarkodie-Gyan, and Vladik Kreinovich

Published in *Proceedings of the 2012 Annual
Conference of the North American Fuzzy Information Processing Society
NAFIPS'2012*, Berkeley, California, August 6-8, 2012.

Many neurological disorders result in disordered motion. The effects of a disorder can be decreased by appropriate rehabilitation. To make rehabilitation efficient, we need to monitor the patient and check how well he or she improves. In our previous papers, we proposed a fuzzy-based semi-heuristic method of gauging how well a patient has improved. Surprisingly, this semi-heuristic method turned out to be more efficient than we expected. In this paper, we provide a justification for this efficiency.

In
the future, it is desirable to combine this fuzzy-*assessment*
approach with results by
Alvarez-Alvarez, Trivino, and Cordon who use fuzzy techniques
for *modeling* human gait.

Original file in pdf

Updated version in pdf

Technical Report UTEP-CS-11-63, December 2011

Constraint Optimization: From Efficient Computation of What Can Be Achieved to Efficient Computation of a Way to Achieve the Corresponding Optimum

Ali Jalal-Kamali, Martine Ceberio, and Vladik Kreinovich

To appear in *Proceedings of the Fifth
International Workshop on Constraint Programming and Decision
Making CoProD'12*, Novosibirsk, Russia, September 23, 2012.

In many practically useful cases, we know how to efficiently
compute the
exact range of a function over given intervals
(and, possibly, under additional constraints). In other words,
we know how to efficiently compute the minimum and maximum of a given function
f(x_{1}, ..., x_{n}) on any box. From the practical viewpoint,
it is important not only to find the value of the corresponding
maximum or minimum, but also to know for what values of the parameters
x_{i} this optimum is attained. We prove a general result: that
if we can efficiently compute the optimum, then we can also
efficiently find the values at which this optimum is attained.
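The idea behind this result can be illustrated by a small sketch (ours, not the paper's construction; the oracle `box_max` and all other names are hypothetical): given an efficient routine that returns the exact maximum of f over any box, bisecting each coordinate and keeping the half whose maximum still equals the global one recovers a point where the maximum is attained.

```python
def locate_maximizer(box_max, box, eps=1e-6):
    """Given an oracle box_max(box) returning the exact maximum of f over a
    box (a list of [lo, hi] intervals), narrow the box coordinate by
    coordinate until each side is shorter than eps; the midpoint then
    approximates a point where the maximum is attained."""
    target = box_max(box)
    box = [list(iv) for iv in box]
    for i in range(len(box)):
        while box[i][1] - box[i][0] > eps:
            mid = (box[i][0] + box[i][1]) / 2
            left = [b if j != i else [box[i][0], mid] for j, b in enumerate(box)]
            if box_max(left) >= target - 1e-12:  # max still attained in the left half
                box[i] = [box[i][0], mid]
            else:
                box[i] = [mid, box[i][1]]
    return [(lo + hi) / 2 for lo, hi in box]

def box_max(box):
    # toy oracle: exact maximum of f(x, y) = -(x - 0.3)^2 - (y - 0.7)^2
    cx = min(max(0.3, box[0][0]), box[0][1])
    cy = min(max(0.7, box[1][0]), box[1][1])
    return -(cx - 0.3) ** 2 - (cy - 0.7) ** 2
```

Each bisection step keeps a half-box on which the maximum is still attained, so the number of oracle calls grows only linearly in the number of coordinates and logarithmically in the desired accuracy.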

Technical Report UTEP-CS-11-62, December 2011

Updated version UTEP-CS-11-62a, April 2012

Why Bernstein Polynomials Are Better: Fuzzy-Inspired Justification

Jaime Nava, Olga Kosheleva, and Vladik Kreinovich

Published in *Proceedings of the 2012 IEEE World Congress on
Computational Intelligence WCCI'2012*, Brisbane, Australia,
June 10-15, 2012, pp. 1986-1991.

It is well known that an arbitrary continuous function on a
bounded set -- e.g., on an interval [a,b] -- can be, with any
given accuracy, approximated by a polynomial. Usually,
polynomials are described as linear combinations of monomials.
It turns out that in many computational problems, it is more
efficient to represent a polynomial as *Bernstein
polynomials* -- e.g., for functions of one variable, a linear
combination of terms (x-a)^{k} * (b-x)^{n-k}. In this paper,
we provide a simple fuzzy-based explanation of why Bernstein
polynomials are often more efficient, and we show how this
informal explanation can be transformed into a precise
mathematical explanation.
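For illustration, here is a minimal sketch (ours) of the Bernstein representation on [a,b]. One well-known reason the representation is computationally convenient: on [a,b] the basis functions are nonnegative and sum to 1, so the polynomial's values are enclosed by the smallest and largest Bernstein coefficients.

```python
from math import comb

def bernstein_basis(k, n, x, a=0.0, b=1.0):
    """k-th Bernstein basis polynomial of degree n on [a, b]:
    C(n, k) * (x - a)^k * (b - x)^(n - k) / (b - a)^n."""
    return comb(n, k) * (x - a) ** k * (b - x) ** (n - k) / (b - a) ** n

def bernstein_eval(coeffs, x, a=0.0, b=1.0):
    """Evaluate a polynomial given by its Bernstein coefficients c_0..c_n."""
    n = len(coeffs) - 1
    return sum(c * bernstein_basis(k, n, x, a, b) for k, c in enumerate(coeffs))
```

Note that the endpoint values are the first and last coefficients, and constant coefficients reproduce a constant function (partition of unity).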

Original file in pdf

Updated version in pdf

Technical Report UTEP-CS-11-61, December 2011

Updated version UTEP-CS-11-61a, April 2012

Semi-Heuristic Poverty Measures Used by Economists: Justification Motivated by Fuzzy Techniques

Karen Villaverde, Nagwa Albehery, Tonghui Wang, and Vladik Kreinovich

Published in *Proceedings of the 2012 IEEE World Congress on
Computational Intelligence WCCI'2012*, Brisbane, Australia,
June 10-15, 2012, pp. 1603-1609.

To properly gauge the extent of poverty in a country or in a region, economists use semi-heuristic poverty measures such as the Foster-Greer-Thorbecke (FGT) metric. These measures are used because it was empirically shown that they capture the commonsense meaning of the extent of poverty better than previously proposed measures. However, the fact that these measures are better than a few earlier proposed ones does not necessarily mean that they are the best possible; so, it is desirable to look for optimal poverty measures. In this paper, we first use fuzzy techniques to provide a commonsense interpretation of FGT poverty measures, and then show that under certain reasonable conditions, these measures are indeed optimal.
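For the reader's convenience, the FGT measure mentioned above has a simple closed form: with poverty line z and parameter alpha, it averages the normalized poverty gaps ((z - y)/z)^alpha over the individuals below the line. A minimal sketch (variable names ours):

```python
def fgt_index(incomes, z, alpha=2.0):
    """Foster-Greer-Thorbecke poverty measure with poverty line z:
    the average of ((z - y) / z)**alpha over individuals with income y < z.
    alpha = 0 gives the headcount ratio, alpha = 1 the poverty-gap index."""
    n = len(incomes)
    return sum(((z - y) / z) ** alpha for y in incomes if y < z) / n
```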

Original file in pdf

Updated version in pdf

Technical Report UTEP-CS-11-60, December 2011

Revised version UTEP-CS-11-60a, March 2012

Towards Formalizing Non-Monotonic Reasoning in Physics: Logical Approach Based on Physical Induction and Its Relation to Kolmogorov Complexity

Vladik Kreinovich

Published in:
Esra Erdem, Joohyung Lee, Yuliya Lierler, and David Pearce
(eds.), *Correct Reasoning: Essays on Logic-Based AI in Honor of
Vladimir Lifschitz*, Springer Lecture Notes on Computer Science,
2012, Vol. 7265, pp. 390-404.

To formalize some types of non-monotonic reasoning in physics, researchers have proposed an approach based on Kolmogorov complexity. Inspired by Vladimir Lifschitz's belief that many features of reasoning can be described on a purely logical level, we show that an equivalent formalization can be described in purely logical terms: namely, in terms of physical induction.

One of the consequences of this formalization is that the set of not-abnormal states is (pre-)compact. We can therefore use Lifschitz's result that when there is only one state that satisfies a given equation (or system of equations), then we can algorithmically find this state. In this paper, we show that this result can be extended to the case of approximate uniqueness.

Original file in pdf

Revised version in pdf

Technical Report UTEP-CS-11-59, December 2011

In the Beginning Was the Word, and the Word Was Fuzzy

Vladik Kreinovich

Published in: Rudolf Seising, Enric Trillas, Claudio Moraga,
and Settimo Termini (eds.), *On Fuzziness. A Homage to Lotfi
A. Zadeh*, Springer Verlag, Berlin, New York, 2013, pp.
337-341.

Technical Report UTEP-CS-11-58, November 2011

VisKo: Semantic Web Support for Information and Science Visualization

Nicholas Del Rio and Paulo Pinheiro da Silva

Specialized knowledge of visualization software packages such as the Visualization Toolkit (VTK), Generic Mapping Tools (GMT), and NCAR Command Language (NCL) is almost always a requisite for writing applications that visualize datasets or information. Technical understanding of visualization packages, including rendering capabilities and data format ingestion, is needed before it can be determined whether a package can satisfy some set of visualization requirements. Even after identification of satisfactory visualization packages, an application must still be built on the packages that generate the required visualizations. The Visualization Knowledge (VisKo) modularized ontology set encodes knowledge about visualization software packages using semantic Web technologies (e.g., RDF, OWL, and Pellet) in order to facilitate the scientist's task of visualizing data and information. With VisKo, users may move away from writing visualization applications and towards an environment where they can request visualizations declaratively, without being fully aware of a wide range of visualization package implementation details.

The use of VisKo ontologies is supported by a fully implemented environment composed of a query language, a visualization knowledge base, and a set of visualization services. The VisKo ontology and supporting environment have been in use by many scientific initiatives, including visualizing artifacts associated with seismic tomography, gravity anomalies, and brightness temperature.

Technical Report UTEP-CS-11-57, November 2011

Revised version UTEP-CS-11-57b, November 2012

Estimating Correlation under Interval Uncertainty

Ali Jalal-Kamali and Vladik Kreinovich

Published in *Mechanical Systems and Signal Processing*,
2013, Vol. 37, pp. 43-53.

In many engineering situations, we are interested in finding
the correlation ρ between different quantities x and y
based on the values x_{i} and y_{i}
of these quantities measured in
different situations i. Measurements are never absolutely
accurate; it is therefore necessary to take this inaccuracy
into account when estimating the correlation ρ. Sometimes,
we know the probabilities of different values of measurement
errors, but in many cases, we only know the upper bounds
Δ_{xi} and Δ_{yi} on the corresponding
measurement errors. In such situations, after we get the
measurement results X_{i} and Y_{i}, the
only information that we have about the actual (unknown) values
x_{i} and y_{i} is that they belong to the corresponding intervals
[X_{i} - Δ_{xi}, X_{i} +
Δ_{xi}] and
[Y_{i} - Δ_{yi}, Y_{i} + Δ_{yi}].
Different values from these intervals lead, in general, to
different values of the correlation ρ. It is therefore
desirable to find the range [ρ^{-}, ρ^{+}] of
possible values of the correlation when x_{i} and y_{i} take
values from the corresponding intervals. In general, the
problem of computing this range is NP-hard. In this paper, we
provide a feasible (= polynomial-time) algorithm for computing at
least one of the endpoints of this interval: for computing
ρ^{+} when ρ^{+} > 0 and for computing
ρ^{-} when ρ^{-} < 0.
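The paper's polynomial-time algorithm for the endpoints is not reproduced here; as a simple baseline, one can bracket the range [ρ^{-}, ρ^{+}] from the inside by sampling values from the intervals (a sketch under our own naming; it yields only an inner estimate, never an enclosure):

```python
import random

def sample_correlation(xs, ys):
    """Ordinary sample correlation of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def inner_correlation_range(X, Y, dX, dY, trials=2000, seed=0):
    """Monte Carlo *inner* estimate of [rho-, rho+]: sample each x_i from
    [X_i - dX_i, X_i + dX_i] (and similarly y_i) and track the extreme
    correlations observed."""
    rng = random.Random(seed)
    lo, hi = float("inf"), float("-inf")
    for _ in range(trials):
        xs = [x + rng.uniform(-d, d) for x, d in zip(X, dX)]
        ys = [y + rng.uniform(-d, d) for y, d in zip(Y, dY)]
        r = sample_correlation(xs, ys)
        lo, hi = min(lo, r), max(hi, r)
    return lo, hi
```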

Original file in pdf

Revised version in pdf

Technical Report UTEP-CS-11-56, November 2011

Constraint Problems: Computability Is Equivalent to Continuity

Martine Ceberio and Vladik Kreinovich

To appear in *International Journal
of Intelligent Technologies and Applied Statistics (IJITAS)*

In many practical situations, we would like to compute the set of all possible values that satisfy given constraints. It is known that even for computable (constructive) constraints, computing such set is not always algorithmically possible. One reason for this algorithmic impossibility is that sometimes, the dependence of the desired set on the parameters of the problem is not continuous, while all computable functions of real variables are continuous. In this paper, we show that this discontinuity is the only case when the desired set cannot be computed. Specifically, we provide an algorithm that computes such a set for all the cases when the dependence is continuous.

Technical Report UTEP-CS-11-55, October 2011

Propagating Range (Uncertainty) and Continuity Information Through Computations: From Real-Valued Intervals to General Sets

Vladik Kreinovich

One of the main problems of interval computations is to find an
enclosure Y that contains the range f(X_{1}, ..., X_{n}) of a given
function f(x_{1}, ..., x_{n}) over given
intervals X_{1}, ..., X_{n}. Most
of the techniques for estimating this range are based on
propagating the range through computations. Specifically, we
follow the computations of f(x_{1}, ..., x_{n})
step-by-step: we
start with ranges X_{1}, ..., X_{n} of the inputs, and then we
sequentially compute the enclosures for the ranges of all
intermediate results, until, on the last computation step, we
get the desired enclosure Y.

A similar propagation of "decorations" -- information about
continuity -- enables us to make conclusions about the
continuity of the resulting function f(x_{1}, ..., x_{n}).
In this paper,
we show that the interval
propagation results can be naturally extended to the general
case of arbitrary sets. For this general case, we provide
necessary and sufficient conditions for such a propagation.
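In the interval case, the propagation described above is ordinary interval arithmetic applied step by step; a minimal sketch (ours) for f(x1, x2) = x1 * x2 + x1:

```python
class Interval:
    """Minimal interval arithmetic for propagating ranges through computations."""
    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi
    def __add__(self, other):
        return Interval(self.lo + other.lo, self.hi + other.hi)
    def __sub__(self, other):
        return Interval(self.lo - other.hi, self.hi - other.lo)
    def __mul__(self, other):
        ps = [self.lo * other.lo, self.lo * other.hi,
              self.hi * other.lo, self.hi * other.hi]
        return Interval(min(ps), max(ps))

# f(x1, x2) = x1 * x2 + x1, followed step by step
X1, X2 = Interval(1, 2), Interval(-1, 1)
t1 = X1 * X2   # enclosure of the intermediate result: [-2, 2]
Y  = t1 + X1   # enclosure of the final range: [-1, 4]
```

The resulting enclosure [-1, 4] contains, but need not equal, the exact range (here [0, 4], since f factors as x1 * (x2 + 1)).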

Technical Report UTEP-CS-11-54, September 2011

Updated version UTEP-CS-11-54b, February 2012

Towards Optimizing Cloud Computing: An Example of Optimization under Uncertainty

Vladik Kreinovich

Published in: Samee U. Khan, Albert Y. Zomaya, and
Lizhe Wang (eds.), *Scalable Computing and
Communications: Theory and Practice*, John Wiley &
Sons and IEEE Computer Society Press, 2013, pp. 613-627.

One of the most efficient ways to store and process data is *cloud computing*.
In cloud computing, instead of storing the data at the user-defined
location (e.g., at the user's computer or at the centralized server),
the computer system ("cloud") selects the location of the data storage that
speeds up computations -- by minimizing the (average) communication time.
In this chapter, we provide an analytical solution to the corresponding optimization
problem.

The demand for cloud computing is growing fast, and we expect that this demand -- and thus, the size of the resulting cloud -- will continue to grow. To avoid expensive frequent redesigns of the cloud, it is therefore desirable to make sure that the resulting servers will have enough capacity to satisfy future demand -- and at the same time that we do not build in expensive extra capacity that will not be used in the predictable future. It is thus important to be able to predict the future demand for cloud computing -- i.e., predict how the cloud will grow. In this chapter, we describe how to optimally predict the cloud growth.

Original file in pdf

Updated version in pdf

Technical Report UTEP-CS-11-53, September 2011

A Heuristic for Selecting the Cycle Length for FAIRIO

S. Arunagiri, Y. Kwok, P. J. Teller, R. Portillo, and S. R. Seelam

FAIRIO is a cycle-based I/O scheduling algorithm that provides differentiated service to workloads concurrently accessing a consolidated RAID storage system. FAIRIO enforces proportional sharing of I/O service through fair scheduling of disk time. During each cycle of the algorithm, I/O requests are scheduled according to workload weights and disk-time utilization history. There are several parameters in the FAIRIO scheduler that can affect its behavior. One parameter in particular, the length of a scheduling cycle, can affect the scheduler's ability to react to and compensate for changes in workload access characteristics. Unfortunately, there is no cycle length that is optimal for all workloads and, thus, it must be chosen according to the workload environment. This technical report describes a heuristic that can be used to choose a favorable cycle length that promotes performance differentiation given a priori knowledge of workload access behavior. The heuristic is validated using simulations driven by several real and synthetic workload scenarios that represent a broad range of request types, sizes, and access characteristics. We show that when workload weights properly map to workload requirements, cycle lengths deduced by our heuristic promote differentiated disk-time sharing within 3.9% of perfect proportionality to workload weights.
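The proportional-sharing core of such a scheduler can be sketched in a few lines (a simplification of ours, with hypothetical names: FAIRIO additionally adjusts budgets using disk-time utilization history, which is omitted here):

```python
def cycle_budgets(weights, cycle_length):
    """Split one scheduling cycle's disk time among workloads in proportion
    to their weights -- the core idea of proportional sharing of disk time."""
    total = sum(weights.values())
    return {w: cycle_length * wt / total for w, wt in weights.items()}
```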

Technical Report UTEP-CS-11-52, August 2011

The CleanJava Language for Functional Program Verification

Yoonsik Cheon, Cesar Yeep, and Melisa Vela

Unlike Hoare-style program verification, functional program verification supports forward reasoning by viewing a program as a mathematical function from one program state to another and proving its correctness by essentially comparing two mathematical functions: the function computed by the program and its specification. Since it requires a minimal mathematical background and reflects the way that programmers reason about the correctness of a program informally, it can be taught and practiced effectively. However, there has been no formal notation supporting functional program verification. In this article, we describe a formal notation for writing functional program specifications for Java programs. The notation, called CleanJava, is based on the Java expression syntax, extended with a mathematical toolkit consisting of sets and sequences. The vocabulary of CleanJava can also be enriched by introducing user-specified definitions such as user-defined mathematical functions and specification-only methods. We believe that CleanJava is a good notation for writing functional specifications and expect it to promote the use of functional program verification by being able to specify a wide range of Java programs.

Technical Report UTEP-CS-11-51, August 2011

How to Encourage Imperfect Individuals to Care More about Society in General: a Utility-Theory Approach

Vladik Kreinovich

Published in *Applied Mathematical Sciences*, 2012, Vol.
6, No. 13, pp. 645-649.

For a society to function efficiently, it is desirable that all members of this society care not only about themselves, but also about the society as a whole, i.e., about all the other individuals from the society. In practice, most people are only capable of caring about a few other individuals. We analyze this problem from the viewpoint of decision theory and show that even with such imperfect individuals, it is possible to make sure that everyone's decisions are affected by the society as a whole: namely, it is sufficient to make sure that people have emotional attachment to those few individuals who are capable of caring about the society as a whole.

As a side effect, our result provides a possible explanation of why the Biblical commandment to love your God encourages ethical behavior.

Technical Report UTEP-CS-11-50, August 2011

Updated version UTEP-CS-11-50a, May 2012

I-Complexity and Discrete Derivative of Logarithms: A Symmetry-Based Explanation

Vladik Kreinovich and Jaime Nava

Published in *Journal of Uncertain Systems*, 2012, Vol. 6,
No. 2, pp. 118-121.

In many practical applications, it is useful to consider Kolmogorov complexity K(s)
of a given string s,
i.e., the shortest length of a program that generates this string. Since Kolmogorov complexity
is, in general, not computable, it is necessary to use computable approximations K_{~}(s) to
K(s). Usually, to describe such an approximation, we take a compression algorithm and
use the length of the compressed string as K_{~}(s). This approximation, however, is not
perfect: e.g., for most compression algorithms, adding a single bit to the string s can drastically change the value
K_{~}(s) -- while the actual Kolmogorov complexity only changes slightly. To avoid this problem,
V. Becher and P. A. Heiber proposed a new approximation called I-complexity. The formulas for this approximation
depend on selecting an appropriate function F(x). Empirically, the function F(x) = log(x) works best.
In this paper, we show that this empirical fact can be explained if we take into account the corresponding symmetries.

Original file in pdf

Updated version in pdf

Technical Report UTEP-CS-11-49, August 2011

How Accurately Should We Write on the Board? When Marking Comments on Student Papers?

Martine Ceberio and Olga Kosheleva

Published in *Journal of Uncertain Systems*, 2012, Vol. 6,
No. 2, pp. 89-91.

Writing on the board is an important part of a lecture. Lecturers' handwriting is not always perfect. Usually, a lecturer can write more slowly and more legibly; this will increase understandability but slow down the lecture. In this paper, we analyze the optimal trade-off between speed and legibility.

Technical Report UTEP-CS-11-48, August 2011

Efficient Geophysical Technique of Vertical Line Elements as a Natural Consequence of General Constraints Techniques

Rolando Cardenas and Martine Ceberio

Published in *Journal of Uncertain Systems*, 2012, Vol. 6, No. 2,
pp. 86-88.

One of the main objectives of geophysics is to find how density d and other physical characteristics depend on a 3-D location (x,y,z). In general, in numerical methods, a way to find the dependence d(x,y,z) is to discretize the space, and to consider, as unknown, e.g., values d(x,y,z) on a 3-D rectangular grid. In this case, the desired density distribution is represented as a combination of point-wise density distributions. In geophysics, it turns out that a more efficient way to find the desired distribution is to represent it as a combination of thin vertical line elements that start at some depth and go indefinitely down. In this paper, we show that the empirical success of such vertical line element techniques can be naturally explained if we recall that, in addition to the equations which relate the observations and the unknown density, we also take into account geophysics-motivated constraints.

Technical Report UTEP-CS-11-47, August 2011

A New Justification for Weighted Average Aggregation in Fuzzy Techniques

Jaime Nava

To appear in *Journal of Uncertain Systems*, 2012, Vol. 6, No. 2.

In many practical situations, we need to decide whether a given
solution is good enough, based on the degrees a_{i} to which different criteria
are satisfied. In this paper, we show that natural requirements lead to the
weighted average decision, according to which a solution is acceptable if
w_{1} * a_{1} + ... + w_{n} * a_{n} > t
for some weights w_{i} and threshold t.
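The decision rule itself is one line; a sketch with hypothetical weights and threshold:

```python
def acceptable(degrees, weights, threshold):
    """Weighted-average aggregation: a solution with criterion degrees a_i
    is acceptable if w_1 * a_1 + ... + w_n * a_n > t."""
    return sum(w * a for w, a in zip(weights, degrees)) > threshold
```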

Technical Report UTEP-CS-11-46, August 2011

Updated version UTEP-CS-11-46a, March 2012

Prediction in Econometrics: Towards Mathematical Justification of Simple (and Successful) Heuristics

Vladik Kreinovich, Hung T. Nguyen, and Songsak Sriboonchitta

Preliminary version published in *Proceedings of the Fifth
International Conference of the Thailand Econometric
Society*, Chiang Mai, Thailand, January 12-13, 2012; full
version published in *International Journal of Intelligent
Technologies and Applied Statistics (IJITAS)*, 2012, Vol. 5,
No. 4, pp. 443-460.

Many heuristic and semi-heuristic methods have been proposed to predict economic and financial processes. Some of these heuristic methods are intuitively reasonable; some seemingly contradict our intuition. The success of these heuristics leads to a reasonable conjecture that they must have a more fundamental justification. In this paper, we provide such a justification for two simple (and successful) prediction heuristics: the intuitive exponential smoothing, which provides a reasonable prediction for slowly changing processes, and the seemingly counter-intuitive idea of an increase in volatility as a predictor of trend reversal. As a possible application of these ideas, we consider a new explanation of the price transmission phenomenon.

Original file in pdf

Updated version in pdf

Technical Report UTEP-CS-11-45, August 2011

Revised version UTEP-CS-11-45b, January 2013

A Simple Physics-Motivated Equivalent Reformulation of P=NP that Makes This Equality (Slightly) More Plausible

Jaime Nava and Vladik Kreinovich

Published in *Cybernetics and Physics*, 2012, Vol. 1,
No. 4, pp. 287-291.

In our opinion, one of the reasons why the problem
P=NP? is so difficult is that while there are
good intuitive arguments in favor of P=/=NP, there is a lack
of intuitive arguments in favor of P=NP. In this paper, we
provide such an argument -- based on the fact that in physics,
many dependencies are scale-invariant, their expression does
not change if we simply change the unit in which we measure the
corresponding input quantity (e.g., replace meters by
centimeters). It is reasonable to imagine similar behavior for
time complexity t_{A}(n) of algorithms A: that the form of
this dependence does not change if we change the unit in
which we measure the input length (e.g., from bits to bytes).
One can then easily prove that the existence of such
scale-invariant algorithms for solving, e.g., propositional
satisfiability is equivalent to P=NP. This equivalent
reformulation of the formula P=NP is, in our opinion, much
more intuitively reasonable than the original formula -- at
least to those who are familiar with the importance of
scale-invariance in physics.
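The role of scale-invariance can be made explicit with a standard functional-equation argument (a sketch under the simplifying assumption that t_A extends to a monotone function of a real argument n > 0):

```latex
t_A(\lambda n) = c(\lambda)\, t_A(n) \quad \forall\, \lambda, n > 0
\;\Longrightarrow\;
c(\lambda\mu) = c(\lambda)\, c(\mu)
\;\Longrightarrow\;
c(\lambda) = \lambda^{\alpha}
\;\Longrightarrow\;
t_A(n) = t_A(1)\cdot n^{\alpha},
```

i.e., a scale-invariant running time is automatically bounded by a polynomial in n, which is what links scale-invariant algorithms to P=NP.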

Original file in pdf

Revised version in pdf

Technical Report UTEP-CS-11-44, August 2011

High-Concentration Chemical Computing Techniques for Solving Hard-To-Solve Problems, and Their Relation to Numerical Optimization, Neural Computing, Reasoning Under Uncertainty, and Freedom Of Choice

Vladik Kreinovich and Olac Fuentes

Published in: Evgeny Katz (ed.), *Molecular and Supramolecular
Information
Processing: From Molecular Switches to Logical Systems*,
Wiley-VCH, Weinheim, Germany, 2012, pp. 209-235.

Chemical computing -- using chemical reactions to perform computations -- is a promising way to solve computationally intensive problems. Chemical computing is promising because it has the potential of using up to 10^{23} molecules as processors working in parallel -- and thus, has the potential of an enormous speedup. Unfortunately, for hard-to-solve (NP-complete) problems, the natural chemical computing approach is exponentially slow. In this chapter, we show that the corresponding computations can become significantly faster if we use very-high-concentration chemical reactions, concentrations at which the usual equations of chemical kinetics no longer apply. We also show that the resulting method is related to numerical optimization, neural computing, reasoning under uncertainty, and freedom of choice.

Technical Report UTEP-CS-11-43, August 2011

Updated version UTEP-CS-11-43a, August 2011

All Kinds of Behavior are Possible in Chemical Kinetics: A Theorem and Its Potential Applications to Chemical Computing

Vladik Kreinovich

Published in: Evgeny Katz (ed.), *Molecular and Supramolecular
Information
Processing: From Molecular Switches to Logical Systems*,
Wiley-VCH, Weinheim, Germany, 2012, pp. 237-258.

Until the late 1950s, it was believed that the processes described by the equations of chemical kinetics are simple: in the course of each chemical reaction, concentrations of some chemical substances decrease while concentrations of other substances increase. This belief was shattered when the first periodic reaction -- the famous Belousov-Zhabotinsky reaction -- was discovered. Since then, it has been shown that many other types of unusual behavior are possible for chemical systems. This discovery led to the possibility of finding chemical reactions that emulate non-trivial transformations that occur during computations -- and thus, of performing computations "in vitro", by actually running the corresponding chemical reactions. The potential advantages of such chemical computing are numerous; the main advantage is that with 10^{23} molecules performing computations in parallel, we have the potential for an unheard-of parallelization -- and thus, an unheard-of speed-up. The possibility of computing "in vitro" was at first only theoretically conjectured, but then, in 1994, L. Adleman actually performed successful chemical computations. This started the current boom in chemical computing, with many new ideas and devices appearing all the time.

From both practical and theoretical viewpoints, chemical
computing has been a clear success story. However, one open
problem remained in this area: while *many* types of
behavior have been shown to occur in chemical kinetics, it was
not known whether *all* types of behavior are possible.
In this paper, we prove that every possible behavior can indeed
be implemented in an appropriate chemical kinetics system. This
result has the following direct implication for chemical
computing: no matter what computational device one invents,
with whatever weird behavior, it is, in principle, possible to
emulate this device by appropriate chemical reactions. In this
sense, chemical computing is truly ubiquitous.
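The classical example of such non-trivial behavior can be reproduced numerically. Below is a sketch (ours) that integrates the Brusselator, a textbook chemical-kinetics model that oscillates instead of settling to equilibrium, qualitatively like the Belousov-Zhabotinsky reaction:

```python
def brusselator(a=1.0, b=3.0, x=1.0, y=1.0, dt=0.001, steps=30000):
    """Forward-Euler integration of the Brusselator model:
        dx/dt = a + x^2*y - (b + 1)*x,   dy/dt = b*x - x^2*y.
    For b > 1 + a^2 the fixed point is unstable and the concentrations
    oscillate on a limit cycle.  Returns the trajectory of x."""
    xs = []
    for _ in range(steps):
        dx = a + x * x * y - (b + 1.0) * x
        dy = b * x - x * x * y
        x, y = x + dt * dx, y + dt * dy
        xs.append(x)
    return xs
```

With the default parameters (a = 1, b = 3, so b > 1 + a^2), the concentration x repeatedly rises and falls rather than converging to a single value.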

Original file in pdf

Updated version in pdf

Technical Report UTEP-CS-11-42, August 2011

Linear Neural Networks Revisited: From PageRank to Family Happiness

Vladik Kreinovich

The study of Artificial Neural Networks started with the analysis of linear neurons. It was then discovered that networks consisting only of linear neurons cannot describe non-linear phenomena. As a result, most currently used neural networks consist of non-linear neurons. In this paper, we show that in many cases, linear neurons can still be successfully applied. This idea is illustrated by two examples: the PageRank algorithm underlying the successful Google search engine and the analysis of family happiness.
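The PageRank example can be made concrete: each iteration of the power method is a linear map, i.e., exactly what a layer of linear neurons computes. A minimal sketch (ours; the graphs are toy examples, and dangling nodes are not handled):

```python
def pagerank(links, d=0.85, iters=100):
    """Power iteration for PageRank on a dict page -> list of out-links.
    Each update is the *linear* map r <- (1 - d)/n + d * M r, which is
    what a network of linear neurons computes."""
    pages = list(links)
    n = len(pages)
    r = {p: 1.0 / n for p in pages}
    for _ in range(iters):
        new = {p: (1.0 - d) / n for p in pages}
        for p, outs in links.items():
            share = d * r[p] / len(outs) if outs else 0.0
            for q in outs:
                new[q] += share
        r = new
    return r
```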

Technical Report UTEP-CS-11-41, August 2011

Updated version UTEP-CS-11-41b, August 2013

Density-Based Fuzzy Clustering as a First Step to Learning Rules: Challenges and Possible Solutions

Gozde Ulutagay and Vladik Kreinovich

Published in: Mo Jamshidi, Vladik Kreinovich, and Janusz Kacprzyk
(eds.), *Proceedings of 3rd World Conference on Soft
Computing*, San Antonio, December 15-18, 2013, pp. 357-372.

In many practical situations, it is necessary to cluster given situations, i.e., to divide them into groups so that situations within each group are similar to each other. This is how we humans usually make decisions: instead of taking into account all the tiny details of a situation, we classify the situation into one of the few groups, and then make a decision depending on the group containing a given situation. When we have many situations, we can describe the probability density of different situations. In terms of this density, clusters are connected sets with higher density separated by sets of smaller density. It is therefore reasonable to define clusters as connected components of the set of all the situations in which the density exceeds a certain threshold t. This idea indeed leads to reasonable clustering. It turns out that the resulting clustering works best if we use a Gaussian function for smoothing when estimating the density, and we select a threshold in a certain way. In this paper, we provide a theoretical explanation for this empirical optimality. We also show how the above clustering algorithm can be modified so that it takes into account that we are not absolutely sure whether each observed situation is of the type in which we are interested, and takes into account that some situations "almost" belong to a cluster.
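The density-threshold definition of clusters can be sketched in one dimension (our illustration, with hypothetical parameter values): estimate the density with a Gaussian kernel, keep the region where the density exceeds the threshold t, and report its connected components.

```python
from math import exp, pi, sqrt

def density_clusters(points, bandwidth=0.3, threshold=0.4, grid_step=0.01):
    """1-D density-based clustering: Gaussian kernel density estimate,
    thresholded at `threshold`; clusters are the maximal runs of grid
    points where the density exceeds the threshold."""
    lo, hi = min(points) - 1.0, max(points) + 1.0
    n = len(points)
    def density(x):
        return sum(exp(-(x - p) ** 2 / (2 * bandwidth ** 2))
                   for p in points) / (n * bandwidth * sqrt(2 * pi))
    clusters, current = [], None
    x = lo
    while x <= hi:
        if density(x) > threshold:
            current = (current[0], x) if current else (x, x)
        elif current:
            clusters.append(current)
            current = None
        x += grid_step
    if current:
        clusters.append(current)
    return clusters
```

Raising the threshold splits or shrinks clusters; lowering it merges them, which is why the choice of t (and of the smoothing kernel) matters.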

Original file in pdf

Updated version in pdf

Technical Report UTEP-CS-11-40, July 2011

Updated version UTEP-CS-11-40a, August 2013

Why A Model Produced by Training a Neural Network Is Often More Computationally Efficient than a Nonlinear Regression Model: A Theoretical Explanation

Jaime Nava and Vladik Kreinovich

Published in *Journal of Uncertain Systems*, 2014, Vol. 8,
No. 3, pp. 193-204.

Many real-life dependencies can be reasonably accurately described by linear functions. If we want a more accurate description, we need to take non-linear terms into account: we can either explicitly add quadratic terms to the regression equation or, alternatively, train a neural network with a non-linear activation function. At first glance, regression algorithms lead to simpler expressions, but in practice, a trained neural network often turns out to be a more computationally efficient way of predicting the corresponding dependence. In this paper, we provide a reasonable explanation for this empirical fact.

Original file in pdf

Updated version in pdf

Technical Report UTEP-CS-11-39, July 2011

Maximum Likelihood Approach to Pointwise Estimation in Statistical Data Processing under Interval Uncertainty

Nitaya Buntao, Sa-aat Niwitpong, and Vladik Kreinovich

Traditional statistical estimates
C(x_{1}, ..., x_{n}) for different statistical
characteristics (such as mean, variance, etc.) implicitly assume
that we know
the sample values x_{1}, ..., x_{n} exactly. In
practice, the sample values
X_{i} come from measurements and are, therefore, in general, different from the
actual (unknown) values x_{i} of the corresponding quantities.
Sometimes, we
know the probabilities of different values of the measurement error
ΔX_{i} = X_{i} - x_{i}, but often,
the only information that we have
about the measurement error is the upper bound Δ_{i}
on its absolute value --
provided by the manufacturer of the corresponding measuring instrument. In this case,
the only information that we have about the actual values x_{i} is that
they belong to the intervals
[X_{i} - Δ_{i}, X_{i} + Δ_{i}].

In general, different
values x_{i} from the corresponding interval
[X_{i} - Δ_{i}, X_{i} + Δ_{i}]
lead to
different values of the corresponding statistical characteristic
C(x_{1}, ..., x_{n}). In this case, it is desirable to
find the set of all possible values of this characteristic. For continuous estimates
C(x_{1}, ..., x_{n}), this range is an interval.

The values of C are used, e.g., in decision making -- e.g., in a control problem, to select an appropriate control value. In this case, we need to select a single value from the corresponding interval. It is reasonable to select a value which is, in some sense, the most probable. In this paper, we show how the Maximum Likelihood approach can provide such a value, i.e., how it can produce pointwise estimates in statistical data processing under interval uncertainty.

Technical Report UTEP-CS-11-38, July 2011

Updated version UTEP-CS-11-38a, January 2012

Towards Symmetry-Based Explanation of (Approximate) Shapes of Alpha-Helices and Beta-Sheets (and Beta-Barrels) in Protein Structure

Jaime Nava and Vladik Kreinovich

Published in *Symmetry*, 2012, Vol. 4, No. 1, pp. 15-25.

Protein structure is invariably connected to protein function. There are two important secondary structure elements: alpha-helices and beta-sheets (which sometimes come in the shape of beta-barrels). The actual shapes of these structures can be complicated, but in the first approximation, they are usually approximated by, correspondingly, cylindrical spirals and planes (and cylinders, for beta-barrels). In this paper, following the ideas pioneered by the renowned mathematician M. Gromov, we use natural symmetries to show that, under reasonable assumptions, these geometric shapes are indeed the best approximating families for secondary structures.

Original file in pdf

Updated version in pdf

Technical Report UTEP-CS-11-37, July 2011

Theoretical Explanation of Bernstein Polynomials' Efficiency: They Are Optimal Combination of Optimal Endpoint-Related Functions

Jaime Nava and Vladik Kreinovich

In many applications of interval computations, it turned out to
be beneficial to represent polynomials on a given interval
[x_{-}, x_{+}] as linear combinations of
*Bernstein* polynomials
(x - x_{-})^{k} * (x_{+} - x)^{n-k}.
In this paper, we provide a theoretical
explanation for this empirical success: namely, we show that
under reasonable optimality criteria, Bernstein polynomials can
be uniquely determined from the requirement that they are
optimal combinations of optimal polynomials corresponding to
the interval's endpoints.
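One practical reason the Bernstein representation is beneficial in interval computations is the enclosure property of Bernstein coefficients: on [0,1], the range of a polynomial lies between its smallest and largest Bernstein coefficients. The conversion formula below is the standard power-to-Bernstein basis change; the example polynomial is our own illustration, not from the report.

```python
from math import comb

def bernstein_enclosure(a):
    """Enclosure of p(x) = sum a[j] * x^j over [0,1] via its Bernstein
    coefficients b_i = sum_{j<=i} C(i,j)/C(n,j) * a[j]: the range of p on
    [0,1] is contained in [min b_i, max b_i]."""
    n = len(a) - 1
    b = [sum(comb(i, j) / comb(n, j) * a[j] for j in range(i + 1))
         for i in range(n + 1)]
    return min(b), max(b)

# p(x) = x - x^2; its true range on [0,1] is [0, 0.25]
lo, hi = bernstein_enclosure([0.0, 1.0, -1.0])
print(lo, hi)   # (0.0, 0.5) -- a valid enclosure of [0, 0.25]
```

The enclosure is not tight here, but it is guaranteed, and it tightens quickly under degree elevation or subdivision.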

Technical Report UTEP-CS-11-36, July 2011

Towards Fast and Accurate Algorithms for Processing Fuzzy Data: Interval Computations Revisited

Gang Xiang and Vladik Kreinovich

Published in *International Journal of General Systems*,
2013, Vol. 42, No. 2, pp. 197-223.

In many practical applications, we need to process data -- e.g., to predict the future values of different quantities based on their current values. Often, the only information that we have about the current values comes from experts, and is described in informal ("fuzzy") terms like "small". To process such data, it is natural to use fuzzy techniques, techniques specifically designed by Lotfi Zadeh to handle such informal information.

In this survey, we start by revisiting the motivation behind
Zadeh's formulas for processing fuzzy data and explain how the
algorithmic problem of processing fuzzy data can be described
in terms of interval computations (alpha-cuts). Many fuzzy
practitioners claim "I tried interval computations, they did
not work" -- meaning that they got estimates which are much
wider than the desired alpha-cuts. We show that such
statements are usually based on a widespread misunderstanding
-- that interval computations simply mean replacing each
arithmetic operation with the corresponding operation on
intervals. We show that while such
*straightforward* interval techniques indeed often lead to
over-wide estimates, the current *advanced* interval
computations techniques result in estimates which are much more
accurate.

We overview such advanced interval computations techniques, and show that by using them, we can efficiently and accurately process fuzzy data.
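Both the complaint and its remedy are easy to reproduce in a few lines. Below, plain interval subtraction and multiplication give a wildly over-wide enclosure for x - x^2 on [0,1]; simple uniform subdivision (used here only as a stand-in for the more sophisticated advanced techniques the survey covers) already recovers a near-tight range. This is an illustrative sketch, not an algorithm from the survey.

```python
def mul(a, b):
    ps = [a[0] * b[0], a[0] * b[1], a[1] * b[0], a[1] * b[1]]
    return (min(ps), max(ps))

def sub(a, b):
    return (a[0] - b[1], a[1] - b[0])

# f(x) = x - x*x on [0,1]; the true range is [0, 0.25]
x = (0.0, 1.0)
print(sub(x, mul(x, x)))      # straightforward: (-1.0, 1.0), far too wide

def f_enclosure(lo, hi, pieces=100):
    """Evaluate f on subintervals and take the union of the results."""
    w = (hi - lo) / pieces
    results = []
    for k in range(pieces):
        xi = (lo + k * w, lo + (k + 1) * w)
        results.append(sub(xi, mul(xi, xi)))
    return (min(r[0] for r in results), max(r[1] for r in results))

print(f_enclosure(0.0, 1.0))  # close to the true range [0, 0.25]
```

The overestimation of the straightforward approach comes from the dependency problem: the two occurrences of x are treated as independent intervals.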

We wrote this survey with three audiences in mind. First, we want fuzzy researchers and practitioners to understand the current advanced interval computations techniques and to use them to come up with faster and more accurate algorithms for processing fuzzy data. For this "fuzzy" audience, we explain these current techniques in detail. Second, we also want interval researchers to better understand this important application area for their techniques. For this "interval" audience, we explain where fuzzy techniques come from, what the possible variants of these techniques are, and to which problems interval techniques can be applied. These readers need to avoid their own frequent misunderstanding -- that fuzzy techniques are "magical" heuristic tools that are only justified by intuition and that have no mathematical justification. Readers of both types can skip the parts they already know.

Finally, we also want to reach people who solve practical data processing problems -- and who may not be familiar with either fuzzy or interval techniques. We want these readers to get an understanding both of the problem of processing fuzzy data and of the interval techniques for solving this problem.

Technical Report UTEP-CS-11-35, June 2011

First revised version UTEP-CS-11-35b, August 2012

Second version UTEP-CS-11-35c, September 2012

Final version UTEP-CS-11-35d, January 2013

Is It Possible to Have a Feasible Enclosure-Computing Method Which Is Independent of the Equivalent Form?

Marcin Michalak and Vladik Kreinovich

Published in *Reliable Computing*, 2013, Vol. 18, pp. 1-8.

The problem of computing the range [y] of a given function f(x1, ..., xn) over given intervals [xi] -- often called the main problem of interval computations -- is, in general, NP-hard. This means that unless P = NP, it is not possible to have a feasible (= polynomial time) algorithm that always computes the desired range. Instead, interval computations algorithms compute an enclosure [Y] for the desired range. For all known feasible enclosure-computing methods -- starting with straightforward interval computations -- there exist two expressions f(x1, ..., xn) and g(x1, ..., xn) for computing the same function that lead to different enclosures. We prove that, unless P = NP, this is inevitable: it is not possible to have a feasible enclosure-computing method which is independent of the equivalent form.
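The form-dependence that the theorem shows to be unavoidable is easy to exhibit for straightforward interval computations (here used only as the simplest example of an enclosure-computing method): two equivalent expressions for the same function yield different enclosures. The helpers below are our own minimal interval arithmetic, not code from the paper.

```python
def mul(a, b):
    ps = [a[0] * b[0], a[0] * b[1], a[1] * b[0], a[1] * b[1]]
    return (min(ps), max(ps))

def sub(a, b):
    return (a[0] - b[1], a[1] - b[0])

x = (0.0, 1.0)
# two equivalent expressions for the same function (true range: [0, 0.25])
print(sub(x, mul(x, x)))            # x - x*x    ->  (-1.0, 1.0)
print(mul(x, sub((1.0, 1.0), x)))   # x*(1 - x)  ->  (0.0, 1.0)
```

Both enclosures contain the true range [0, 0.25], but they differ -- exactly the phenomenon the report proves is inevitable for any feasible method.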

Original file in pdf

First revised version in pdf

Second revised version in pdf

Final version in pdf

Technical Report UTEP-CS-11-34, June 2011

Orthogonal Bases Are the Best: A Theorem Justifying Bruno Apolloni's Heuristic Neural Network Idea

Jaime Nava and Vladik Kreinovich

Published in *Journal of Uncertain Systems*, 2012, Vol. 6,
No. 2, pp. 122-127.

One of the main problems with neural networks is that they are often very slow in learning the desired dependence. To speed up neural networks, Bruno Apolloni proposed to orthogonalize neurons during training, i.e., to select neurons whose output functions are orthogonal to each other. In this paper, we use symmetries to provide a theoretical explanation for this heuristic idea.

Technical Report UTEP-CS-11-33, June 2011

Tropical (Idempotent) Algebras as a Way to Optimize Fuzzy Control

Jaime Nava

Published in *International Journal of Innovative Management,
Information & Production (IJIMIP)*, 2013, Vol. 4, No. 1, pp.
72-87.

Fuzzy control is a methodology that
transforms control rules (described by an expert in the words of a natural
language) into a precise control strategy. There exist several versions
of this transformation. The main difference between these versions is in
how they interpret the logical connectives "and" and "or", i.e., in
other words, in what *reasoning method* a version uses. Which of these
versions should we choose? It turns out that at different stages of
control, different reasoning methods lead to better control results.
In this paper, we describe the choice of reasoning methods that optimize
control results in terms of smoothness and stability. It turns out that the reasoning methods
which are optimal at each stage correspond to *tropical algebras* -- algebras isomorphic to
the set of real numbers with the operations of addition and maximum.

Technical Report UTEP-CS-11-32, June 2011

Towards a "Generic" Notion of Genericity: From "Typical" and "Random" to Meager, Shy, etc.

Ali Jalal-Kamali, Ondrej Nebesky, Michael H. Durcholz, Vladik Kreinovich, and Luc Longpre

Published in *Journal of Uncertain Systems*, 2012, Vol. 6,
No. 2, pp. 104-113.

In many application areas, it is important to study "generic" properties, i.e., properties which hold for "typical" examples. For example, if we know the probabilities of different events, we can consider a "random" object -- i.e., an object that, crudely speaking, does not belong to any class of "unusual" events (i.e., to any class with a small probability). In other cases, "typical" may mean not belonging to an "unusual" subset which is small in some other sense -- e.g., a subset of a smaller dimension. The corresponding notion of "typicalness" has been formalized for several cases, including the case of random events. In this case, the well-known Kolmogorov-Martin-Löf definition of randomness captures the idea that properties with probability 0 are impossible. In our previous papers, we modified this definition to take into account that, from a practical viewpoint, properties with very small probabilities are often considered impossible as well. In this paper, we extend this definition to a general notion of "generic".

Technical Report UTEP-CS-11-31, June 2011

Revised version UTEP-CS-11-31b, September 2011

No-Free-Lunch Result for Interval and Fuzzy Computing: When Bounds Are Unusually Good, Their Computation is Unusually Slow

Martine Ceberio and Vladik Kreinovich

Published in: I. Batyrshin and G. Sidorov (eds.),
*Proceedings of the 10th Mexican International Conference on
Artificial Intelligence MICAI'2011*, Puebla, Mexico, November 26 -
December 4, 2011, Springer Lecture Notes in Artificial Intelligence,
Vol. 7095, pp. 13-23.

On several examples from interval and fuzzy computations and from related areas, we show that when the results of data processing are unusually good, their computation is unusually complex. This makes us think that there should be an analog of Heisenberg's uncertainty principle, well known in quantum mechanics: when we have an unusually beneficial situation in terms of results, it is not as perfect in terms of the computations leading to these results. In short, nothing is perfect.

Original file in pdf

updated version in pdf

Technical Report UTEP-CS-11-30c, May 2012

Updated version UTEP-CS-11-30d, May 2012

Failure Analysis of a Complex System Based on Partial Information about Subsystems, with Potential Applications to Aircraft Maintenance

Vladik Kreinovich, Christelle Jacob, Didier Dubois, Janette Cardoso, and Martine Ceberio

Published in *Applied and Computational Mathematics*, 2012,
Vol. 11, No. 2, pp. 165-179.

In many real-life applications (e.g., in aircraft maintenance), we need to estimate the probability of failure of a complex system (such as an aircraft as a whole or one of its subsystems). Complex systems are usually built with redundancy allowing them to withstand the failure of a small number of components. In this paper, we assume that we know the structure of the system, and, as a result, for each possible set of failed components, we can tell whether this set will lead to a system failure. For each component A, we know the probability P(A) of its failure with some uncertainty: e.g., we know the lower and upper bounds for this probability. Usually, it is assumed that failures of different components are independent events. Our objective is to use all this information to estimate the probability of failure of the entire complex system. In this paper, we describe several methods for solving this problem, including a new efficient method for such estimation based on Cauchy deviates.
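The core idea of the Cauchy-deviate technique mentioned above can be sketched for a smooth function f of inputs known with accuracies Delta_i: if each input deviation is drawn from a Cauchy distribution scaled by Delta_i, then f(X + d) - f(X) is (approximately, for near-linear f) Cauchy-distributed with half-width equal to the linear-approximation range half-width sum |df/dx_i| * Delta_i. The sketch below is a simplification: it ignores the rescaling needed to keep nonlinear f well-approximated on the box, and it uses the median of absolute values as an estimator in place of the maximum-likelihood estimate used in the literature.

```python
import math
import random
import statistics

def cauchy_halfwidth(f, X, Delta, samples=2000, seed=1):
    """Sketch of the Cauchy-deviate estimate of the half-width of the range
    of f when each input X[i] is known with accuracy Delta[i]."""
    rng = random.Random(seed)
    fX = f(X)
    cs = []
    for _ in range(samples):
        # d[i] ~ Delta[i] * standard Cauchy (inverse-CDF sampling)
        d = [Di * math.tan(math.pi * (rng.random() - 0.5)) for Di in Delta]
        cs.append(abs(f([xi + di for xi, di in zip(X, d)]) - fX))
    # for a Cauchy variable with half-width h, the median of |c| equals h
    return statistics.median(cs)

# linear test case: f = x1 + 2*x2, so the exact half-width is D1 + 2*D2 = 0.3
f = lambda x: x[0] + 2 * x[1]
print(cauchy_halfwidth(f, X=[1.0, 2.0], Delta=[0.1, 0.1]))  # close to 0.3
```

The key advantage over Monte Carlo with uniform deviates is that the number of calls to f needed for a given accuracy does not grow with the number of inputs.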

Original file in pdf

Updated version in pdf

Technical Report UTEP-CS-11-30, June 2011

Updated version UTEP-CS-11-30b, August 2011

Estimating Probability of Failure of a Complex System Based on Inexact Information about Subsystems and Components, with Potential Applications to Aircraft Maintenance

Vladik Kreinovich, Christelle Jacob, Didier Dubois, Janette Cardoso, Martine Ceberio, and Ildar Batyrshin

Published in: I. Batyrshin and G. Sidorov (eds.),
*Proceedings of the 10th Mexican International Conference on
Artificial Intelligence MICAI'2011*, Puebla, Mexico, November 26 -
December 4, 2011, Springer Lecture Notes in Artificial Intelligence,
Vol. 7095, pp. 70-81.

In many real-life applications (e.g., in aircraft maintenance), we need to estimate the probability of failure of a complex system (such as an aircraft as a whole or one of its subsystems). Complex systems are usually built with redundancy allowing them to withstand the failure of a small number of components. In this paper, we assume that we know the structure of the system, and, as a result, for each possible set of failed components, we can tell whether this set will lead to a system failure. For each component A, we know the probability P(A) of its failure with some uncertainty: e.g., we know the lower and upper bounds for this probability. Usually, it is assumed that failures of different components are independent events. Our objective is to use all this information to estimate the probability of failure of the entire complex system. In this paper, we describe a new efficient method for such estimation based on Cauchy deviates.

Original file in pdf

updated version in pdf

Technical Report UTEP-CS-11-29, June 2011

Estimating Mean and Variance under Interval Uncertainty: Dynamic Case

Rafik Aliev and Vladik Kreinovich

Published in *Proceedings of the 2011 Sixth
International Conference on Soft Computing, Computing with
Words and Perceptions in System Analysis, Decision and Control
ICSCCW'2011*, Antalya, Turkey, September 1-2, 2011,
pp. 85-93.

In many practical situations, it is important to estimate the
mean E and the variance V from the sample values
x_{1}, ..., x_{n}. Usually, in statistics,
we consider the case
when the parameters like E and V do not change with time
and when the sample values x_{i} are known exactly. In
practice, the values x_{i} come from measurements, and
measurements are never 100% accurate. In many cases, we only
know the upper bound D_{i} on the measurement error. In
this case, once we know the measured value X_{i}, we
can conclude that the actual (unknown) value x_{i} belongs to
the interval [X_{i} - D_{i}, X_{i} +
D_{i}]. Different values x_{i} from these intervals
lead, in general, to different values of E and V. It is
therefore desirable to find the ranges [E] and [V]
of all possible values of E and V. While this problem is,
in general, NP-hard, in many practical situations, there exist
efficient algorithms for computing such ranges.

In practice, processes are dynamic. As a result, reasonable estimates for E and V assign more weight to more recent measurements and less weight to the past ones. In this paper, we extend known algorithms for computing the ranges [E] and [V] to such dynamic estimates.
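For the weighted mean E, the range computation described above is straightforward, since a weighted mean with non-negative weights is increasing in every x_i (the variance range is the hard part, NP-hard in general). A minimal sketch of the easy case, with illustrative weights of our own choosing:

```python
def weighted_mean_range(X, D, w):
    """Range [E] of the weighted mean sum w_i * x_i when each actual value
    x_i can be anywhere in [X_i - D_i, X_i + D_i]: since the mean is
    increasing in every x_i (for non-negative weights), its bounds are
    attained at the interval endpoints."""
    lo = sum(wi * (Xi - Di) for wi, Xi, Di in zip(w, X, D))
    hi = sum(wi * (Xi + Di) for wi, Xi, Di in zip(w, X, D))
    return lo, hi

# dynamic case: more recent measurements get larger weights (summing to 1)
X = [1.0, 2.0, 3.0]          # measured values, oldest first
D = [0.1, 0.1, 0.1]          # accuracy bounds
w = [0.2, 0.3, 0.5]
print(weighted_mean_range(X, D, w))   # bounds 2.2 and 2.4, up to rounding
```

With equal weights this reduces to the ordinary (static) range of the sample mean.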

Technical Report UTEP-CS-11-28, June 2011

Product of Partially Ordered Sets (Posets), with Potential Applications to Uncertainty Logic and Space-Time Geometry

Francisco Zapata, Olga Kosheleva, and Karen Villaverde

Published in *International Journal of Innovative Management,
Information & Production (IJIMIP)*, 2011, Vol. 2, No. 4, pp.
10-28.

One of the main objectives of science and engineering is to help people select the most beneficial decisions. To make these decisions,

- we must know people's preferences,
- we must have the information about different events -- possible consequences of different decisions, and
- since information is never absolutely accurate and precise, we must also have information about the degree of certainty.

In all three cases, the basic relation between objects is an order:

- For preferences, a < b means that b is preferable to a. This relation is used in decision theory.
- For events, a < b means that a can influence b. This causality relation is used in space-time physics.
- For uncertain statements, a < b means that a is less certain than b. This relation is used in logics describing uncertainty such as fuzzy logic.

In this paper, we consider the general problem of how to combine two ordered spaces A and B into one. We also analyze which properties of orders are preserved under the resulting products.

Technical Report UTEP-CS-11-27a, June 2011

Revised version UTEP-CS-11-27c, February 2015

Dynamic Fuzzy Logic Leads to More Adequate "And" and "Or" Operations

Vladik Kreinovich

Short version published in *Proceedings of the 2011 Sixth
International Conference on Soft Computing, Computing with
Words and Perceptions in System Analysis, Decision and Control
ICSCCW'2011*, Antalya, Turkey, September 1-2, 2011,
pp. 21-29; extended version published in *International Journal
of Knowledge Engineering and Soft Data Paradigms*, 2014,
Vol. 4, No. 4, pp. 327-338.

In the traditional (static) fuzzy logic approach, we select an "and"-operation (t-norm) and an "or"-operation (t-conorm). The result of applying these selected operations may be somewhat different from the actual expert's degrees of belief in the corresponding logical combinations A & B and A \/ B of the original statements -- since these degrees depend not only on the expert's degrees of belief in the statements A and B, but also on the extent to which the statements A and B are dependent. We show that dynamic fuzzy logic enables us to automatically take this dependence into account -- and thus leads to more adequate "and"- and "or"-operations.

Original file in pdf

Revised version in pdf

Technical Report UTEP-CS-11-26, May 2011

Functional Verification of Class Invariants in CleanJava

Carmen Avila and Yoonsik Cheon

In a Cleanroom-style functional program verification, a program is viewed as a mathematical function from one program state to another, and verification is done by comparing two functions, the implemented and the expected behaviors of a program. The technique requires a minimal mathematical background and supports forward reasoning, but it does not support assertions such as class invariants. However, class invariants are not only a practical programming tool; they also play a key role in the correctness proof of a program by specifying conditions and constraints that an object has to satisfy and thus defining the valid states of an object. We suggest a way to integrate the notion of a class invariant into a functional program verification by using CleanJava as a specification notation and a verification framework. CleanJava is a formal annotation language for Java that supports a Cleanroom-style functional program verification. We propose a small extension to CleanJava for specifying class invariants, along with a matching extension to its proof logic for verifying them. Our extension closely reflects the way programmers informally specify and reason about the correctness of a program. It allows one to use a class invariant in the framework of a Cleanroom-style functional specification and verification.

Technical Report UTEP-CS-11-25, May 2011

Updated version UTEP-CS-11-25a, October 2011

Towards Interval Techniques for Model Validation

Jaime Nava and Vladik Kreinovich

Published in *Computing*, 2012, Vol. 94,
No. 2, pp. 257-269.

Most physical models are approximate. It is therefore important to
find out how accurate the predictions of a given model are. This can be done by
*validating* the model, i.e., by comparing its predictions with the experimental data.
In some practical situations, it is difficult to directly compare the predictions
with the experimental data, since models usually contain (physically meaningful) parameters, and the
exact values of these parameters are often not known. One way to overcome this difficulty is to
get a statistical distribution of the corresponding parameters. Once we substitute these distributions
into a model, we get statistical predictions -- and we can compare the resulting probability distribution
with the actual distribution of measurement results. In this approach, we combine all the measurement results,
and thus, we are ignoring the information that some of these results correspond to the same
values of the parameters -- e.g., they come from measuring the same specimen under different conditions.
In this paper, we propose an interval approach that takes into account this important information.
This approach is illustrated on the example of a benchmark thermal problem presented at the Sandia Validation
Challenge Workshop (Albuquerque, New Mexico, May 2006).

Original file in pdf

Updated version in pdf

Technical Report UTEP-CS-11-24, May 2011

Updated version UTEP-CS-11-24a, April 2012

When Is Busemann Product a Lattice? A Relation Between Metric Spaces and Corresponding Space-Time Models

Hans-Peter Kuenzi, Francisco Zapata, and Vladik Kreinovich

Published in *Applied Mathematical Sciences*, 2012, Vol. 6,
No. 66, pp. 3267-3276.

The causality relation of special relativity is based on the assumption that the speed of all physical processes is limited by the speed of light. As a result, an event (t,x) occurring at moment t at location x can influence an event (s,y) if and only if s >= t + d(x,y)/c. We can simplify this formula if we use units of time and distance in which c=1 (e.g., by using a light second as a unit of distance). In this case, the above causality relation takes the form s >= t + d(x,y). Since the actual space can be non-Euclidean, H. Busemann generalized this ordering relation to the case when points x, y, etc. are taken from an arbitrary metric space X. A natural question is: when is the resulting ordered space -- called a Busemann product -- a lattice? In this paper, we provide a necessary and sufficient condition for it being a lattice: it is a lattice if and only if X is a real tree, i.e., a metric space in which every two points are connected by exactly one arc, and this arc is geodesic (i.e., metrically isomorphic to an interval on a real line).
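The ordering relation itself is one line of code; the sketch below (our own illustration, with the real line as the metric space -- itself the simplest example of a real tree) just checks the relation on a few events, with transitivity following from the triangle inequality.

```python
def can_influence(e1, e2, d):
    """Causality order of the Busemann product: event e1 = (t1, x1) can
    influence e2 = (t2, x2) iff t2 >= t1 + d(x1, x2), in units with c = 1."""
    (t1, x1), (t2, x2) = e1, e2
    return t2 >= t1 + d(x1, x2)

dist = lambda x, y: abs(x - y)   # metric on the real line

a, b, c = (0.0, 0.0), (1.0, 0.5), (3.0, 2.0)
print(can_influence(a, b, dist))   # True:  1.0 >= 0.0 + 0.5
print(can_influence(b, c, dist))   # True:  3.0 >= 1.0 + 1.5
print(can_influence(a, c, dist))   # True, as transitivity requires
print(can_influence(b, a, dist))   # False: causality is antisymmetric
```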

Original file in pdf

Updated version in pdf

Technical Report UTEP-CS-11-23, May 2011

Estimating Probability of Failure of a Complex System Based on Partial Information about Subsystems and Components, with Potential Applications to Aircraft Maintenance

Christelle Jacob, Didier Dubois, Janette Cardoso, Martine Ceberio, and Vladik Kreinovich

Published in *Proceedings of the
International Workshop on Soft Computing Applications and
Knowledge Discovery SCAKD'2011*, Moscow, Russia, June 25, 2011,
pp. 30-41.

In many real-life applications (e.g., in aircraft maintenance), we need to estimate the probability of failure of a complex system (such as an aircraft as a whole or one of its subsystems). Complex systems are usually built with redundancy allowing them to withstand the failure of a small number of components. In this paper, we assume that we know the structure of the system, and, as a result, for each possible set of failed components, we can tell whether this set will lead to a system failure. In some cases, for each component A, we know the probability P(A) of its failure; in other cases, however, we only know the lower and upper bounds for this probability. Sometimes, we only have expert estimates for these probabilities, estimates that can be described as fuzzy numbers.

Usually, it is assumed that failures of different components are independent events, but sometimes, we know that they are dependent -- in which case we usually do not have any specific information about their correlation. Our objective is to use all this information to estimate the probability of failure of the entire complex system. In this paper, we describe methods for such estimation.

Technical Report UTEP-CS-11-22, May 2011

Updated version UTEP-CS-11-22a, August 2011

How to Tell When a Product of Two Partially Ordered Spaces Has a Certain Property?

Francisco Zapata, Olga Kosheleva, and Karen Villaverde

Published in *Journal of Uncertain Systems*, 2012, Vol. 6,
No. 2, pp. 152-160.

In this paper, we describe how checking whether a given
property F is true for a product A_{1} X A_{2} of partially
ordered spaces can be reduced to checking several related
properties of the original spaces A_{i}.

This result can be useful in the analysis of properties
of intervals [a,b] = {x: a <= x <= b}
over general partially ordered spaces -- such as the space
of all vectors with component-wise order or the set of all
functions with component-wise ordering f <= g <-->
for all x (f(x) <= g(x)). When we consider sets of pairs of
such objects A_{1} X A_{2}, it is natural to define the order
on this set in terms of orders in A_{1} and A_{2} -- this is, e.g.,
how ordering and intervals are defined on the set R^{2} of all
2-D vectors.

This result can also be useful in the analysis of ordered spaces describing different degrees of certainty in expert knowledge.

Original file in pdf

Revised version in pdf

Technical Report UTEP-CS-11-21, May 2011

Revised version UTEP-CS-11-21a, June 2011

Why Fuzzy Transform Is Efficient in Large-Scale Prediction Problems: A Theoretical Explanation

Irina Perfilieva and Vladik Kreinovich

Published in *Advances in Fuzzy Systems*, Vol. 2011,
Article ID 985839, doi:10.1155/2011/985839.

In many practical situations like weather prediction, we are interested in large-scale (averaged) value of the predicted quantities. For example, it is impossible to predict the exact future temperature at different spatial locations, but we can reasonably well predict average temperature over a region. Traditionally, to obtain such large-scale predictions, we first perform a detailed integration of the corresponding differential equation, and then average the resulting detailed solution. This procedure is often very time-consuming, since we need to process all the details of the original data.

In our previous papers, we have shown that similar quality large-scale prediction results can be obtained if instead, we apply a much faster procedure -- first average the inputs (by applying an appropriate fuzzy transform), and then use these averaged inputs to solve the corresponding (discretization of the) differential equation.

In this paper, we provide a general theoretical explanation of why our semi-heuristic method works, i.e., why fuzzy transforms are efficient in large-scale predictions.
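The averaging step referred to above is the direct fuzzy (F-)transform: each component is a weighted average of the input values, with weights given by a fuzzy partition. A minimal sketch with a uniform triangular partition (the signal, the noise level, and the number of nodes below are our own illustrative choices):

```python
import numpy as np

def f_transform(xs, ys, nodes):
    """Direct F-transform with triangular basis functions centered at the
    given (uniformly spaced) nodes: each component F_k is the A_k-weighted
    average of the values ys."""
    h = nodes[1] - nodes[0]                         # uniform node spacing
    comps = []
    for c in nodes:
        A = np.clip(1 - np.abs(xs - c) / h, 0, 1)   # triangular membership
        comps.append((A * ys).sum() / A.sum())
    return comps

xs = np.linspace(0, 1, 101)
ys = np.sin(2 * np.pi * xs) + 0.1 * np.random.default_rng(0).normal(size=xs.size)
comps = f_transform(xs, ys, nodes=np.linspace(0, 1, 6))
print(comps)   # 6 smoothed, large-scale components of the noisy signal
```

The 101-point noisy signal is compressed into 6 components that track its large-scale behavior -- these few averages, rather than the detailed data, are what gets fed into the (discretized) differential equation.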

Original file in pdf

Updated version in pdf

Technical Report UTEP-CS-11-20, May 2011

Towards Optimal Knowledge Processing: From Centralization Through Cyberinfrastructure To Cloud Computing

Octavio Lerma, Eric Gutierrez, Chris Kiekintveld, and Vladik Kreinovich

Published in *International Journal of Innovative Management,
Information & Production (IJIMIP)*, 2011, Vol. 2, No. 2,
pp. 67-72.

One of the most efficient ways to store and process data is
*cloud computing*, in which we store the data so as to
minimize the expenses and increase the efficiency. In this
paper, we provide an analytical solution to the corresponding
optimization problem.

Technical Report UTEP-CS-11-19, May 2011

Joggler: Data Harvest And Analysis Tool

Ondrej Nebesky

The increase in the use of stand-alone systems for recording data has made data harvesting across several fields easier and faster. However, raw data collected from such systems need to be manipulated and processed to enable meaningful analysis. Although data are readily available, one major issue facing analysts and scientists is the collation of data from various sources. Without a standard data format, scientists and analysts have to spend resources to bring data from multiple sources together. The problem is aggravated when a particular data source changes its data format.

This project introduces the Joggler framework, a collection of services and interfaces that enable analysts to quickly process data and bring them to other applications with a minimum of coding. The Joggler framework has been demonstrated on various projects, including weather data visualization and data visualization from the Jornada experimental site.

Technical Report UTEP-CS-11-18, May 2011

Uniqueness of Reconstruction for Yager's t-Norm Combination of Probabilistic and Possibilistic Knowledge

Nitaya Buntao and Vladik Kreinovich

Published in *International Journal of Intelligent Systems*,
2012, Vol. 27, No. 1, pp. 16-22.

Often, about the same real-life system, we have both measurement-related probabilistic information expressed by a probability measure P(S) and expert-related possibilistic information expressed by a possibility measure M(S). To get the most adequate idea about the system, we must combine these two pieces of information. For this combination, R. Yager -- borrowing an idea from fuzzy logic -- proposed to use a t-norm f(a,b) such as the product f(a,b) = a*b, i.e., to consider a set function F(S) = f(P(S), M(S)). A natural question is: can we uniquely reconstruct the two parts of knowledge from this function F(S)? In our previous paper, we showed that such a unique reconstruction is possible for the product t-norm; in this paper, we extend this result to a general class of t-norms.

Technical Report UTEP-CS-11-17, April 2011

Revised version UTEP-CS-11-17a, May 2011

Linear-Time Resource Allocation in Security Games with Identical Fully Protective Resources

Octavio Lerma, Vladik Kreinovich, and Christopher Kiekintveld

Published in *Proceedings of the AAAI
Workshop on Applied Adversarial Reasoning and Risk Modeling
AARM'11*, San Francisco, California, August 7, 2011

Game theory has become an important tool for making resource allocation decisions in security domains, including critical infrastructure protection. Many of these games are formulated as Stackelberg security games. We present new analysis and algorithms for a class of Stackelberg security games with identical, fully protective defender resources. The first algorithm has worst-case complexity linear in the number of possible targets, but works only for a restricted case. The second algorithm can find an optimal resource allocation for the general case in time O(n log(n)).

Original file in pdf

updated version in pdf

Technical Report UTEP-CS-11-16, April 2011

Revised version UTEP-CS-11-16a, July 2011

Final version UTEP-CS-11-16b, August 2011

Reconstructing an Open Order from Its Closure, with Applications to Space-Time Physics and to Logic

Francisco Zapata and Vladik Kreinovich

Published in *Studia Logica*, 2012, Vol. 100,
No. 1-2, pp. 419-435.

In his logical papers, Leo Esakia studied corresponding ordered topological spaces and order-preserving mappings. Similar spaces and mappings appear in many other application areas such as the analysis of causality in space-time. It is known that under reasonable conditions, both the topology and the original order relation can be uniquely reconstructed if we know the "interior" < of the order relation. It is also known that in some cases, we can uniquely reconstruct < (and hence, topology) from the original order relation. In this paper, we show that, in general, under reasonable conditions, the open order < (and hence, the corresponding topology) can be uniquely determined from its closure.

Original file in pdf

Revised version in pdf

Final version in pdf

Technical Report UTEP-CS-11-15, March 2011

Knowledge Annotations in Scientific Workflows: An Implementation in Kepler

Aida Gandara, George Chin Jr., Paulo Pinheiro da Silva, Signe White, Chandrika Sivaramakrishnan, and Terence Critchlow

Scientific research products are the result of long-term collaborations between teams. Scientific workflows are capable of helping scientists in many ways including collecting information about how research was conducted (e.g., scientific workflow tools often collect and manage information about datasets used and data transformations). However, knowledge about why data was collected is rarely documented in scientific workflows. In this paper we describe a prototype system built to support the collection of scientific expertise that influences scientific analysis. Through evaluating a scientific research effort underway at the Pacific Northwest National Laboratory, we identified features that would most benefit PNNL scientists in documenting how and why they conduct their research, making this information available to the entire team. The prototype system was built by enhancing the Kepler Scientific Workflow System to create knowledge-annotated scientific workflows and to publish them as semantic annotations.

Technical Report UTEP-CS-11-14, March 2011

Updated version UTEP-CS-11-14a, June 2011

Processing Interval Sensor Data in the Presence of Outliers, with Potential Applications to Localizing Underwater Robots

Jan Sliwka, Luc Jaulin, Martine Ceberio, and Vladik Kreinovich

Published in *Proceedings of the 2011 IEEE International Conference on
Systems, Man, and Cybernetics SMC'2011*, Anchorage, Alaska,
October 9-12, 2011, pp. 2330-2337.

Measurements are never absolutely accurate: the measurement result X is, in general, different from the actual (unknown) value x of the corresponding quantity. In many practical problems, we only know upper bounds D on the measurement errors d = X - x. In such situations, once we know the measurement result, the only conclusion that we can make about the actual value x is that this value belongs to the interval [X - D, X + D]. There exist many efficient algorithms for processing such interval data. However, these algorithms usually assume that all the measurement results are valid. In reality, due to factors such as sensor malfunction, some measurement results may be way off (outliers), for which the difference between X and x is much larger than the upper bound D on the measurement error. In this paper, we overview the algorithmic problems related to processing interval sensor data in the presence of outliers. Our case study -- for which we develop and analyze these algorithms -- is localization of underwater robots, a problem in which a significant number of measurement results are outliers.
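One standard interval-methods tool for this kind of outlier-robust fusion is the q-relaxed intersection: the set of values consistent with all but at most q of the measurements. The following 1-D endpoint-sweep sketch is our own illustration, not code from the paper:

```python
def q_relaxed_intersection(intervals, q):
    """Sub-intervals of values consistent with at least n - q of the
    n given closed intervals [lo, hi] (1-D endpoint sweep)."""
    need = len(intervals) - q
    events = []
    for lo, hi in intervals:
        events.append((lo, 1))   # interval opens
        events.append((hi, -1))  # interval closes
    events.sort(key=lambda e: (e[0], -e[1]))  # at ties, open before close
    pieces, count, start = [], 0, None
    for x, d in events:
        count += d
        if count >= need and start is None:
            start = x
        elif count < need and start is not None:
            pieces.append((start, x))
            start = None
    return pieces
```

For example, with four measured intervals of which one (an outlier) is far from the rest, `q_relaxed_intersection([(0, 4), (1, 5), (2, 6), (10, 11)], 1)` keeps the values consistent with any three of the four measurements, namely `[(2, 4)]`.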

Original file in pdf

Updated version in pdf

Technical Report UTEP-CS-11-13, March 2011

How to Combine Probabilistic and Possibilistic (Expert) Knowledge: Uniqueness of Reconstruction in Yager's (Product) Approach

Nitaya Buntao and Vladik Kreinovich

Published in *International Journal of Innovative Management,
Information & Production (IJIMIP)*, 2011, Vol. 2, No. 1, pp. 1-8.

Often, about the same real-life system, we have both measurement-related probabilistic information expressed by a probability measure P(S) and expert-related possibilistic information expressed by a possibility measure M(S). To get the most adequate idea about the system, we must combine these two pieces of information. For this combination, R. Yager -- borrowing an idea from fuzzy logic -- proposed to use the simple product t-norm, i.e., to consider a set function f(S) = P(S) * M(S). A natural question is: can we uniquely reconstruct the two parts of knowledge from this function f(S)? In this paper, we prove that while in the discrete case, the reconstruction is often not unique, in the continuous case, we can always uniquely reconstruct both components P(S) and M(S) from the combined function f(S). In this sense, Yager's combination is indeed an adequate way to combine the two parts of knowledge.

Technical Report UTEP-CS-11-12, March 2011

Estimating Risk of Extreme and Catastrophic Events under Interval Uncertainty

Nitaya Buntao and Vladik Kreinovich

In many application areas, we encounter heavy-tailed
distributions -- for example, such distributions are ubiquitous
in financial applications. These distributions are often
described by the Pareto law. There exist techniques for estimating
the parameters of the corresponding Pareto distributions
based on the sample x_{1}, ..., x_{n}. In practice, we often
only know the values x_{i} with interval uncertainty. In this
paper, we show how to estimate the parameters of the Pareto
distribution under such uncertainty and how to describe deviation
and dependence for general heavy-tailed distributions.
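For context, the standard maximum-likelihood estimate of the Pareto shape parameter is alpha = n / sum(ln(x_i / x_m)) with x_m = min x_i. A brute-force way to see how interval uncertainty propagates into this estimate is to evaluate it at every endpoint combination; this corner enumeration is exponential in n and is only a heuristic sketch, not the paper's method:

```python
import itertools
import math

def pareto_mle_alpha(sample):
    """Standard ML estimate of the Pareto shape parameter
    (assumes not all sample values coincide)."""
    xm = min(sample)
    return len(sample) / sum(math.log(x / xm) for x in sample)

def alpha_range_corners(boxes):
    """Heuristic range of alpha over interval data [lo, hi]:
    evaluate the point estimator at every endpoint combination.
    Exponential in n -- a sketch only, not an efficient algorithm."""
    vals = [pareto_mle_alpha(c) for c in itertools.product(*boxes)]
    return min(vals), max(vals)
```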

Technical Report UTEP-CS-11-11, March 2011

Estimating Covariance for Privacy Case under Interval (and Fuzzy) Uncertainty

Ali Jalal-Kamali, Vladik Kreinovich, and Luc Longpre

Published in: Ronald R. Yager, Marek Z. Reformat, Shahnaz N. Shahbazova, and
Sergei Ovchinnikov (eds.), *Proceedings of the World Conference on Soft
Computing*, San Francisco, CA,
May 23-26, 2011.

One of the main objectives of collecting data in statistical databases (medical databases, census databases) is to find important correlations between different quantities. To enable researchers to look for such correlations, we should allow them to ask queries testing different combinations of such quantities. However, when we receive answers to many such questions, we may inadvertently disclose information about individual patients, information that should be private.

One way to preserve privacy in statistical databases is to store *ranges* instead of the original values. For example, instead of an exact age of a patient in a medical database, we only store the information that this age is, e.g., between 60 and 70. This idea solves the privacy problem, but it makes statistical analysis more complex. Different possible values from the corresponding ranges lead, in general, to different values of the corresponding statistical characteristic; it is therefore desirable to find the range of all such values.

It is known that for mean and variance, there exist feasible algorithms for computing such ranges. In this paper, we show that similar algorithms are possible for another important statistical characteristic -- covariance, whose value is important in computing correlations.
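To see why the mean is the easy case, note that it is monotonically increasing in each data point, so its exact range over interval data is attained at the interval endpoints. A minimal sketch:

```python
def mean_range(boxes):
    """Exact range of the sample mean over interval data [lo, hi]:
    the mean is monotone in each x_i, so endpoints suffice."""
    n = len(boxes)
    return (sum(lo for lo, hi in boxes) / n,
            sum(hi for lo, hi in boxes) / n)

mean_range([(60, 70), (30, 40)])  # (45.0, 55.0)
```

Variance and covariance are not monotone in the data, so their range computation is genuinely harder; that is the problem the paper's algorithms address.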

Technical Report UTEP-CS-11-10, February 2011

From Processing Interval-Valued Fuzzy Data to General Type-2: Towards Fast Algorithms

Vladik Kreinovich

Published in *Proceedings of the
IEEE Symposium on Advances in Type-2 Fuzzy Logic Systems
T2FUZZ'2011*, part of the IEEE Symposium Series on Computational
Intelligence, Paris, France, April 11-15, 2011, pp. ix-xii.

It is known that processing of data under general type-1 fuzzy uncertainty can be reduced to the simplest case -- of interval uncertainty: namely, Zadeh's extension principle is equivalent to level-by-level interval computations applied to alpha-cuts of the corresponding fuzzy numbers.
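A minimal sketch of this level-by-level reduction, for the product of two triangular fuzzy numbers (our own illustration):

```python
def interval_mul(a, b):
    """Product of two intervals."""
    p = [a[0] * b[0], a[0] * b[1], a[1] * b[0], a[1] * b[1]]
    return (min(p), max(p))

def alpha_cut(tri, alpha):
    """Alpha-cut of a triangular fuzzy number (a, b, c)."""
    a, b, c = tri
    return (a + alpha * (b - a), c - alpha * (c - b))

# Zadeh's extension principle for x * y, level by level:
x, y = (1, 2, 3), (4, 5, 6)
cuts = [interval_mul(alpha_cut(x, k / 4), alpha_cut(y, k / 4))
        for k in range(5)]  # alpha = 0, 0.25, 0.5, 0.75, 1
```

At alpha = 1 the cuts are points, giving the single product 2 * 5 = 10; at alpha = 0 the cut is the widest interval, (4, 18).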

However, type-1 fuzzy numbers may not be the most adequate way of describing uncertainty, because they require that an expert can describe his or her degree of confidence in a statement by an exact value. In practice, it is more reasonable to expect that the expert estimates his or her degree by using imprecise words from natural language -- which can be naturally formalized as fuzzy sets. The resulting type-2 fuzzy numbers more adequately represent the expert's opinions, but their practical use is limited by the seeming computational complexity of their use. It turns out that for the practically important case of interval-valued fuzzy sets, processing such sets can also be reduced to interval computations -- and that this idea can be naturally extended to arbitrary type-2 fuzzy numbers.

Technical Report UTEP-CS-11-09, February 2011

Revised version UTEP-CS-11-09b, March 2011

Towards Faster Estimation of Statistics and ODEs Under Interval, P-Box, and Fuzzy Uncertainty: From Interval Computations to Rough Set-Related Computations

Vladik Kreinovich

Published in: Sergey O. Kuznetsov et al. (Eds.),
*Proceedings of the Thirteenth International Conference on
Rough Sets, Fuzzy Sets and Granular Computing RSFDGrC'2011
(Moscow, Russia, June 25-27, 2011),* Springer Lecture Notes on
Artificial Intelligence LNAI, Springer-Verlag, Berlin,
Heidelberg, 2011, Vol. 6743, pp. 3-10.

Interval computations estimate the uncertainty of the result of data processing in situations in which we only know the upper bounds $\Delta$ on the measurement errors. In interval computations, at each intermediate stage of the computation, we have intervals of possible values of the corresponding quantities. As a result, we often have bounds with excess width. One way to remedy this problem is to extend interval techniques to rough-set computations, where at each stage, in addition to intervals of possible values of the quantities, we also keep rough sets representing possible values of pairs (triples, etc.).
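A classic example of such excess width is the expression x - x: naive interval arithmetic cannot see that both operands are the same quantity, which is exactly the dependency information that tracking pairs of values is meant to preserve. A minimal sketch:

```python
def interval_sub(a, b):
    """[a] - [b] = [a_lo - b_hi, a_hi - b_lo]."""
    return (a[0] - b[1], a[1] - b[0])

x = (0.0, 1.0)
naive = interval_sub(x, x)  # (-1.0, 1.0), although x - x is always 0
```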

Original file in pdf

Revised version in pdf

Technical Report UTEP-CS-11-08, January 2011

Towards a General Description of Translation-Invariant and Translation-Covariant Linear Transformations: A Natural Justification of Fourier Transforms and Fuzzy Transforms

Irina Perfilieva and Vladik Kreinovich

Published in
*Proceedings of the World Congress of the International Fuzzy
Systems Association IFSA'2011*, Surabaya and Bali Island,
Indonesia, June 21-25, 2011.

In many practical situations, we are interested in the dependencies that do not change with time, i.e., that do not change when we change the origin of the time axis. The corresponding translation-invariant transformations are easy to describe: they correspond to convolutions, or, equivalently, to fuzzy transforms.

It turns out that if we relax the invariance condition and
require only that the transformation be
translation-*covariant* (i.e., that it appropriately changes under
translation), we get exactly two classes of transformations:
Fourier transforms and fuzzy transforms. This result explains
why both transforms have been successfully used in data
processing.

Technical Report UTEP-CS-11-07, January 2011

Measures of Deviation (and Dependence) for Heavy-Tailed Distributions and Their Estimation under Interval and Fuzzy Uncertainty

Nitaya Buntao and Vladik Kreinovich

Published in: Ronald R. Yager, Marek Z. Reformat, Shahnaz N. Shahbazova, and
Sergei Ovchinnikov (eds.), *Proceedings of the World Conference on Soft
Computing*, San Francisco, CA,
May 23-26, 2011.

Traditionally, in science and engineering, most statistical techniques are
based on the assumption that the random variables are normally distributed.
For such distributions, a natural characteristic of the "average" value is the mean,
and a natural characteristic of the deviation from the average is the variance.
However,
in many practical situations, e.g., in economics and finance, we encounter
probability distributions for which the variance is infinite; such distributions are
called *heavy-tailed*. For such distributions, we describe which characteristics
can be used to describe the average and the deviation from the average, and
how to estimate these characteristics under interval and fuzzy uncertainty.
We also discuss what are the reasonable analogues of correlation for such
heavy-tailed distributions.

Technical Report UTEP-CS-11-06, January 2011

PWiseGen: Generating Test Cases for Pairwise Testing Using Genetic Algorithms

Pedro Flores and Yoonsik Cheon

Pairwise testing is a combinatorial testing technique that tests all possible pairs of input values. Although finding a smallest set of test cases for pairwise testing is NP-complete, pairwise testing is regarded as a reasonable cost-benefit compromise among combinatorial testing methods. In this paper we formulate the problem of finding a pairwise test set as a search problem and apply a genetic algorithm to solve it. We also describe an open-source tool called PWiseGen for generating pairwise test sets. According to our experiments, PWiseGen produces competitive results compared with existing pairwise testing tools. In addition, it provides a framework and a research platform for generating pairwise test sets using genetic algorithms; it is configurable, extensible, and reusable.
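As a sketch of the search formulation (our own illustration; PWiseGen's actual fitness function may differ), a natural GA fitness is the fraction of required value pairs that a candidate test set covers:

```python
import itertools

def covered_pairs(test_set):
    """All (col_i, val_i, col_j, val_j) pairs covered by a test set
    (each test is a tuple of parameter values)."""
    pairs = set()
    for test in test_set:
        for (i, vi), (j, vj) in itertools.combinations(enumerate(test), 2):
            pairs.add((i, vi, j, vj))
    return pairs

def fitness(test_set, required_pairs):
    """Fraction of required pairs covered -- a candidate GA fitness."""
    return len(covered_pairs(test_set) & required_pairs) / len(required_pairs)
```

A GA would then evolve fixed-size test sets toward fitness 1.0, i.e., toward full pairwise coverage.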

File in PDF and in Compressed Postscript

Technical Report UTEP-CS-11-05, January 2011

Towards Optimal Few-Parametric Representation of Spatial Variation: Geometric Approach and Environmental Applications

Misha Koshelev, Octavio Lerma, and Craig Tweedie

Published in *Geombinatorics*, 2011, Vol. 21, No. 1, pp. 15-24.

In this paper, we use a geometric approach to show
that under reasonable assumptions, the spatial
variability of a field f(x), i.e., the expected
value F(z)=E[(f(x+z)-f(x))^{2}], has the form
F(z)=|Σ g_{ij}*z_{i}*z_{j}|^{α}.
We explain how to find g_{ij} and α
from the observations, and how to optimally place sensors
in view of this spatial variability.

Technical Report UTEP-CS-11-04, January 2011

Updated version UTEP-CS-11-04a, March 2011

Modified Fourier-Motzkin Elimination Algorithm for Reducing Systems of Linear Inequalities with Unconstrained Parameters

Mario Bencomo, Luis Gutierrez, and Martine Ceberio

The need for eliminating redundancies in systems of linear inequalities arises in many applications. In linear programming (LP), one seeks a solution that optimizes a given linear objective function subject to a set of linear constraints, sometimes posed as linear inequalities. Linear inequalities also arise in the context of tensor decomposition. Due to the lack of uniqueness in higher-order tensor decomposition, non-negativity constraints are imposed on the decomposition factors, yielding systems of linear inequalities. Eliminating redundancies in such systems can reduce the number of computations, and hence improve computation times in applications.

Current techniques for eliminating redundant inequalities are not viable in higher dimensions. As an alternative we propose a modified version of the Fourier-Motzkin Elimination Algorithm (ModFMEA), implemented in MATLAB, to reduce redundancies in a given system of linear constraints over the reals posed as linear inequalities. Reduction is obtained, at each orthant containing the solution set, by taking the lower and upper bounds of x_i-normalized inequalities x_i >= l and u >= x_i, respectively, where l and u are linear terms with no occurrence of x_i, for i = 1,2,...,N. The reduced system over the whole solution set can be obtained by taking the union of the reduced system at each orthant.

This method eliminates redundancies by retaining a system of linear inequalities that define the set of feasible solutions. It works under the assumption that all of the variables are unconstrained, i.e., variables may take on negative and positive values. Experimental results demonstrate reduction of systems in higher dimensions for both bounded and unbounded solution sets with feasible computational times, and provide important insight into its workings, allowing us to design an extension of ModFMEA (ExModFMEA) that yields even more reduction.
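For readers unfamiliar with the underlying algorithm, one step of classical Fourier-Motzkin elimination combines each inequality in which x_k has a positive coefficient with each one in which it is negative, scaled so that x_k cancels. A minimal sketch of the classical step (not of ModFMEA itself):

```python
def fm_eliminate(ineqs, k):
    """One Fourier-Motzkin step: eliminate variable k from a system of
    inequalities, each given as (coeffs, b) meaning coeffs . x <= b."""
    pos, neg, rest = [], [], []
    for c, b in ineqs:
        (pos if c[k] > 0 else neg if c[k] < 0 else rest).append((c, b))
    out = list(rest)
    for cp, bp in pos:
        for cn, bn in neg:
            lam, mu = -cn[k], cp[k]  # positive scalars that cancel x_k
            out.append(([lam * a + mu * d for a, d in zip(cp, cn)],
                        lam * bp + mu * bn))
    return out

# x <= 3 and -x <= -1 (i.e., x >= 1) combine into the trivially true 0 <= 2:
fm_eliminate([([1], 3), ([-1], -1)], 0)  # [([0], 2)]
```

The quadratic blow-up of this pairing step is one reason redundancy elimination matters in higher dimensions.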

Original file in pdf

updated version in pdf

Technical Report UTEP-CS-11-03, January 2011

Updated version UTEP-CS-11-03a, March 2011

Optimizing Trajectories for Unmanned Aerial Vehicles (UAVs) Patrolling the Border

Chris Kiekintveld, Vladik Kreinovich, and Octavio Lerma

Published in: Ronald R. Yager, Marek Z. Reformat, Shahnaz N. Shahbazova, and
Sergei Ovchinnikov (eds.), *Proceedings of the World Conference on Soft
Computing*, San Francisco, CA,
May 23-26, 2011.

At first glance, most aspects of border protection activity look like classical examples of zero-sum games, in which the interests of the two sides are exactly opposite. This is how such situations are planned now: this is how border patrol agents are assigned to different segments of the border, this is how routes of coast guard ships are planned, etc. However, there is a big difference between such situations and the traditional zero-sum games: in the traditional zero-sum games, it is assumed that we know the exact objective function of each participant; in contrast, in border protection planning (e.g., in counter-terrorism planning), the adversary's objective function is rarely known in precise terms; at best, we have a description of this objective function in terms of words from natural language. In this paper, on the example of a UAV patrolling the border, we show how fuzzy techniques can help in planning border protection strategies under such uncertainty.

Original file in pdf

updated version in pdf

Technical Report UTEP-CS-11-02, January 2011

Designing, understanding, and analyzing unconventional computation: the important role of logic and constructive mathematics

Vladik Kreinovich

Published in *Applied Mathematical Sciences*, 2012, Vol.
6, No. 13, pp. 629-644.

In this paper, we explain why, in our opinion, logic and constructive mathematics are playing -- and should play -- an important role in the design, understanding, and analysis of unconventional computation.

Technical Report UTEP-CS-11-01, January 2011

Revised version UTEP-CS-11-01a, August 2011

Computation in quantum space-time could lead to a super-polynomial speedup

Michael Zakharevich and Vladik Kreinovich

Published in *Journal of Uncertain Systems*, 2012, Vol. 6,
No. 2, pp. 146-151.

In theoretical computer science, researchers usually distinguish between feasible problems (that can be solved in polynomial time) and problems that require more computation time. A natural question is: can we use new physical processes, processes that have not been used in modern computers, to make computations drastically faster -- e.g., to make intractable problems feasible? Such a possibility would occur if a physical process provides a super-polynomial (= faster than polynomial) speed-up.

In this direction, the most active research is undertaken in quantum computing. It is well known that quantum processes can drastically speed up computations; however, there are no proven super-polynomial quantum speedups of the overall computation time.

Parallelization is another potential source of speedup. In Euclidean space, parallelization only leads to a polynomial speedup. We show that in quantum space-time, parallelization could potentially lead to (provably) super-polynomial speedup of computations.

Original file in pdf

Revised version in pdf