## University of Texas at El Paso Computer Science Department Abstracts of 2011 Reports

Technical Report UTEP-CS-11-65, December 2011
Efficient Approximation for Security Games with Interval Uncertainty

Published in Proceedings of the AAAI Spring Symposium on Game Theory for Security, Sustainability, and Health GTSSH'2012, Stanford, March 26-28, 2012.

There is an increasing number of applications of security games. One of the key challenges for this field going forward is to address the problem of model uncertainty and the robustness of game-theoretic solutions. Most existing methods for dealing with payoff uncertainty are Bayesian methods, which are NP-hard and have difficulty scaling to very large problems. In this work we consider an alternative approach based on interval uncertainty. For a variant of security games with interval uncertainty, we introduce a polynomial-time approximation algorithm that can compute very accurate solutions within a given error bound.

Technical Report UTEP-CS-11-64, December 2011
Updated version UTEP-CS-11-64a, March 2012
Assessment of Functional Impairment in Human Locomotion: A Fuzzy-Motivated Approach

Published in Proceedings of the 2012 Annual Conference of the North American Fuzzy Information Processing Society NAFIPS'2012, Berkeley, California, August 6-8, 2012.

Many neurological disorders result in disordered motion. The effects of a disorder can be decreased by appropriate rehabilitation. To make rehabilitation efficient, we need to monitor the patient and check how well he or she improves. In our previous papers, we proposed a fuzzy-based semi-heuristic method of gauging how well a patient improved. Surprisingly, this semi-heuristic method turned out to be more efficient than we expected. In this paper, we provide a justification for this efficiency.

In the future, it is desirable to combine this fuzzy-assessment approach with results by Alvarez-Alvarez, Trivino, and Cordon, who use fuzzy techniques for modeling human gait.

Technical Report UTEP-CS-11-63, December 2011
Constraint Optimization: From Efficient Computation of What Can Be Achieved to Efficient Computation of a Way to Achieve The Corresponding Optimum
Ali Jalal-Kamali, Martine Ceberio, and Vladik Kreinovich

To appear in Proceedings of the Fifth International Workshop on Constraint Programming and Decision Making CoProD'12, Novosibirsk, Russia, September 23, 2012.

In many practically useful cases, we know how to efficiently compute the exact range of a function over given intervals (and, possibly, under additional constraints). In other words, we know how to efficiently compute the minimum and maximum of a given function f(x1, ..., xn) on any box. From the practical viewpoint, it is important not only to find the value of the corresponding maximum or minimum, but also to know for what values of the parameters xi this optimum is attained. We prove a general result: that if we can efficiently compute the optimum, then we can also efficiently find the values at which this optimum is attained.
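
The argument behind this result can be illustrated by a simple sketch (a hypothetical Python illustration of the bisection idea, not necessarily the authors' construction): if an oracle returns the exact maximum of f over any box, we can localize a point where this maximum is attained by repeatedly halving each coordinate and keeping a half on which the oracle still reports the global maximum.

```python
def locate_maximizer(max_on_box, box, eps=1e-6):
    """Localize a point at which the maximum is attained, given only an
    oracle max_on_box(box) returning the exact maximum of f over a box
    (a list of [lo, hi] intervals)."""
    target = max_on_box(box)
    box = [list(iv) for iv in box]
    while max(hi - lo for lo, hi in box) > eps:
        for i, (lo, hi) in enumerate(box):
            mid = (lo + hi) / 2
            trial = [iv[:] for iv in box]
            trial[i] = [lo, mid]
            # keep whichever half still attains the global maximum
            if max_on_box(trial) >= target - 1e-12:
                box[i] = [lo, mid]
            else:
                box[i] = [mid, hi]
    return [(lo + hi) / 2 for lo, hi in box]

# hypothetical oracle: exact maximum of f(x, y) = -(x-0.3)^2 - (y-0.7)^2
def oracle(box):
    clamp = lambda v, lo, hi: min(max(v, lo), hi)
    x, y = clamp(0.3, *box[0]), clamp(0.7, *box[1])
    return -(x - 0.3) ** 2 - (y - 0.7) ** 2

print(locate_maximizer(oracle, [[0, 1], [0, 1]]))  # approximately [0.3, 0.7]
```

Each oracle call is efficient by assumption, and the number of calls grows only logarithmically with the required accuracy, which matches the efficiency claim above.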

Technical Report UTEP-CS-11-62, December 2011
Updated version UTEP-CS-11-62a, April 2012
Why Bernstein Polynomials Are Better: Fuzzy-Inspired Justification
Jaime Nava, Olga Kosheleva, and Vladik Kreinovich

Published in Proceedings of the 2012 IEEE World Congress on Computational Intelligence WCCI'2012, Brisbane, Australia, June 10-15, 2012, pp. 1986-1991.

It is well known that an arbitrary continuous function on a bounded set -- e.g., on an interval [a,b] -- can be, with any given accuracy, approximated by a polynomial. Usually, polynomials are described as linear combinations of monomials. It turns out that in many computational problems, it is more efficient to represent a polynomial in Bernstein form -- e.g., for functions of one variable, as a linear combination of the terms (x - a)^k * (b - x)^(n-k). In this paper, we provide a simple fuzzy-based explanation of why Bernstein polynomials are often more efficient, and we show how this informal explanation can be transformed into a precise mathematical explanation.
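
One standard reason why the Bernstein form is computationally convenient can be shown in a few lines (the coefficients below are made up for illustration): on [a, b], the Bernstein basis functions are nonnegative and sum to 1, so the coefficients of a polynomial written in this form immediately bracket its range.

```python
from math import comb

def bernstein_basis(n, k, a, b, x):
    """B_{n,k}(x) = C(n,k) * (x-a)^k * (b-x)^(n-k) / (b-a)^n on [a, b]."""
    return comb(n, k) * (x - a) ** k * (b - x) ** (n - k) / (b - a) ** n

def eval_bernstein(coeffs, a, b, x):
    """Evaluate a polynomial given by its Bernstein coefficients."""
    n = len(coeffs) - 1
    return sum(c * bernstein_basis(n, k, a, b, x) for k, c in enumerate(coeffs))

# the basis is nonnegative and sums to 1 on [a, b], so the coefficients
# bracket the polynomial there -- a cheap range enclosure
coeffs, a, b = [0.0, 2.0, -1.0, 3.0], 0.0, 1.0
for x in [0.1 * i for i in range(11)]:
    p = eval_bernstein(coeffs, a, b, x)
    assert min(coeffs) - 1e-12 <= p <= max(coeffs) + 1e-12
```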

Technical Report UTEP-CS-11-61, December 2011
Updated version UTEP-CS-11-61a, April 2012
Semi-Heuristic Poverty Measures Used by Economists: Justification Motivated by Fuzzy Techniques
Karen Villaverde, Nagwa Albehery, Tonghui Wang, and Vladik Kreinovich

Published in Proceedings of the 2012 IEEE World Congress on Computational Intelligence WCCI'2012, Brisbane, Australia, June 10-15, 2012, pp. 1603-1609.

To properly gauge the extent of poverty in a country or in a region, economists use semi-heuristic poverty measures such as the Foster-Greer-Thorbecke (FGT) metric. These measures are used because it was empirically shown that they capture the commonsense meaning of the extent of poverty better than previously proposed measures. However, the fact that these measures are better than a few earlier proposed ones does not necessarily mean that these measures are the best possible; so, it is desirable to look for optimal poverty measures. In this paper, we first use fuzzy techniques to provide a commonsense interpretation of FGT poverty measures, and then show that under certain reasonable conditions, these measures are indeed optimal.

Technical Report UTEP-CS-11-60, December 2011
Revised version UTEP-CS-11-60a, March 2012
Towards Formalizing Non-Monotonic Reasoning in Physics: Logical Approach Based on Physical Induction and Its Relation to Kolmogorov Complexity

Published in: Esra Erdem, Joohyung Lee, Yuliya Lierler, and David Pearce (eds.), Correct Reasoning: Essays on Logic-Based AI in Honor of Vladimir Lifschitz, Springer Lecture Notes on Computer Science, 2012, Vol. 7265, pp. 390-404.

To formalize some types of non-monotonic reasoning in physics, researchers have proposed an approach based on Kolmogorov complexity. Inspired by Vladimir Lifschitz's belief that many features of reasoning can be described on a purely logical level, we show that an equivalent formalization can be described in purely logical terms: namely, in terms of physical induction.

One of the consequences of this formalization is that the set of not-abnormal states is (pre-)compact. We can therefore use Lifschitz's result that when there is only one state that satisfies a given equation (or system of equations), then we can algorithmically find this state. In this paper, we show that this result can be extended to the case of approximate uniqueness.

Technical Report UTEP-CS-11-59, December 2011
In the Beginning Was the Word, and the Word was Fuzzy

Published in: Rudolf Seising, Enric Trillas, Claudio Moraga, and Settimo Termini (eds.), On Fuzziness. A Homage to Lotfi A. Zadeh, Springer Verlag, Berlin, New York, 2013, pp. 337-341.

Technical Report UTEP-CS-11-58, November 2011
VisKo: Semantic Web Support for Information and Science Visualization
Nicholas Del Rio and Paulo Pinheiro da Silva

Specialized knowledge in visualization software packages such as the Visualization Toolkit (VTK), Generic Mapping Tools (GMT), and the NCAR Command Language (NCL) is almost always a requisite for writing applications that visualize datasets or information. Technical understanding of visualization packages, including rendering capabilities and data format ingestion, is needed before it can be determined whether a package can satisfy a given set of visualization requirements. Even after identification of satisfactory visualization packages, an application must still be built on the packages that generate the required visualizations. The Visualization Knowledge (VisKo) modularized ontology set encodes knowledge about visualization software packages using Semantic Web technologies (e.g., RDF, OWL, and Pellet) in order to facilitate the scientist's task of visualizing data and information. With VisKo, users may move away from writing visualization applications and towards an environment where they can request visualizations declaratively, without being fully aware of a wide range of visualization package implementation details.

The use of the VisKo ontologies is supported by a fully implemented environment composed of a query language, a visualization knowledge base, and a set of visualization services. The VisKo ontologies and the supporting environment have been in use by many scientific initiatives, including the visualization of artifacts associated with seismic tomography, gravity anomalies, and brightness temperature.

Technical Report UTEP-CS-11-57, November 2011
Revised version UTEP-CS-11-57b, November 2012
Estimating Correlation under Interval Uncertainty

Published in Mechanical Systems and Signal Processing, 2013, Vol. 37, pp. 43-53.

In many engineering situations, we are interested in finding the correlation ρ between different quantities x and y based on the values xi and yi of these quantities measured in different situations i. Measurements are never absolutely accurate; it is therefore necessary to take this inaccuracy into account when estimating the correlation ρ. Sometimes, we know the probabilities of different values of measurement errors, but in many cases, we only know the upper bounds Δxi and Δyi on the corresponding measurement errors. In such situations, after we get the measurement results Xi and Yi, the only information that we have about the actual (unknown) values xi and yi is that they belong to the corresponding intervals [Xi - Δxi, Xi + Δxi] and [Yi - Δyi, Yi + Δyi]. Different values from these intervals lead, in general, to different values of the correlation ρ. It is therefore desirable to find the range [ρ-, ρ+] of possible values of the correlation when xi and yi take values from the corresponding intervals. In general, the problem of computing this range is NP-hard. In this paper, we provide a feasible (= polynomial-time) algorithm for computing at least one of the endpoints of this interval: for computing ρ+ when ρ+ > 0 and for computing ρ- when ρ- < 0.
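
To see why computing this range is hard, it helps to contrast it with naive brute force: enumerating all endpoint combinations already takes exponential time in the number of measurements, and even then it only yields an inner approximation of [ρ-, ρ+], since the extremes of ρ need not, in general, be attained at interval endpoints. A small made-up example in Python:

```python
from itertools import product
from statistics import mean

def corr(xs, ys):
    """Sample correlation of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

# measurement results and error bounds -> intervals [Xi - dXi, Xi + dXi]
X, dX = [1.0, 2.0, 3.0, 4.0], [0.2, 0.2, 0.2, 0.2]
Y, dY = [1.1, 1.9, 3.2, 3.8], [0.3, 0.3, 0.3, 0.3]

# exponential brute force over all 2^(2n) endpoint combinations
lo, hi = 1.0, -1.0
for xs in product(*[(x - d, x + d) for x, d in zip(X, dX)]):
    for ys in product(*[(y - d, y + d) for y, d in zip(Y, dY)]):
        r = corr(xs, ys)
        lo, hi = min(lo, r), max(hi, r)
print(lo, hi)  # different selections from the intervals give different rho
```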

Technical Report UTEP-CS-11-56, November 2011
Constraint Problems: Computability Is Equivalent to Continuity

To appear in International Journal of Intelligent Technologies and Applied Statistics (IJITAS).

In many practical situations, we would like to compute the set of all possible values that satisfy given constraints. It is known that even for computable (constructive) constraints, computing such set is not always algorithmically possible. One reason for this algorithmic impossibility is that sometimes, the dependence of the desired set on the parameters of the problem is not continuous, while all computable functions of real variables are continuous. In this paper, we show that this discontinuity is the only case when the desired set cannot be computed. Specifically, we provide an algorithm that computes such a set for all the cases when the dependence is continuous.

Technical Report UTEP-CS-11-55, October 2011
Propagating Range (Uncertainty) and Continuity Information Through Computations: From Real-Valued Intervals to General Sets

One of the main problems of interval computations is to find an enclosure Y that contains the range f(X1, ..., Xn) of a given function f(x1, ..., xn) over given intervals X1, ..., Xn. Most of the techniques for estimating this range are based on propagating the range through computations. Specifically, we follow the computations of f(x1, ..., xn) step-by-step: we start with ranges X1, ..., Xn of the inputs, and then we sequentially compute the enclosures for the ranges of all intermediate results, until, on the last computation step, we get the desired enclosure Y.

A similar propagation of "decorations" -- information about continuity -- enables us to make conclusions about the continuity of the resulting function f(x1, ..., xn). In this paper, we show that the interval propagation results can be naturally extended to the general case of arbitrary sets. For this general case, we provide necessary and sufficient conditions for such a propagation.
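
A minimal sketch of this step-by-step propagation for real-valued intervals (the function and input ranges are made up for the example):

```python
# basic interval-arithmetic operations on pairs (lo, hi)
def iadd(a, b): return (a[0] + b[0], a[1] + b[1])
def isub(a, b): return (a[0] - b[1], a[1] - b[0])
def imul(a, b):
    p = [a[0] * b[0], a[0] * b[1], a[1] * b[0], a[1] * b[1]]
    return (min(p), max(p))

# enclose the range of f(x1, x2) = x1*x2 - x1 over X1 = [1, 2], X2 = [0, 1]
X1, X2 = (1.0, 2.0), (0.0, 1.0)
t1 = imul(X1, X2)  # step 1: x1 * x2 -> (0.0, 2.0)
Y = isub(t1, X1)   # step 2: t1 - x1 -> (-2.0, 1.0)
# the exact range of x1*(x2 - 1) here is [-2, 0]; Y encloses it, with some
# overestimation caused by treating the two occurrences of x1 as independent
```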

Technical Report UTEP-CS-11-54, September 2011
Updated version UTEP-CS-11-54b, February 2012
Towards Optimizing Cloud Computing: An Example of Optimization under Uncertainty

Published in: Samee U. Khan, Albert Y. Zomaya, and Lizhe Wang (eds.), Scalable Computing and Communications: Theory and Practice, John Wiley & Sons and IEEE Computer Society Press, 2013, pp. 613-627.

One of the most efficient ways to store and process data is cloud computing. In cloud computing, instead of storing the data at the user-defined location (e.g., at the user's computer or at the centralized server), the computer system ("cloud") selects the location of the data storage that speeds up computations -- by minimizing the (average) communication time. In this chapter, we provide an analytical solution to the corresponding optimization problem.

The demand for cloud computing is growing fast, and we expect that this demand -- and thus, the size of the resulting cloud -- will continue to grow. To avoid expensive frequent redesigns of the cloud, it is therefore desirable to make sure that the resulting servers will have enough capacity to satisfy future demand -- and at the same time that we do not build in expensive extra capacity that will not be used in the predictable future. It is thus important to be able to predict the future demand for cloud computing -- i.e., predict how the cloud will grow. In this chapter, we describe how to optimally predict the cloud growth.

Technical Report UTEP-CS-11-53, September 2011
A Heuristic for Selecting the Cycle Length for FAIRIO
S. Arunagiri, Y. Kwok, P. J. Teller, R. Portillo, and S. R. Seelam

FAIRIO is a cycle-based I/O scheduling algorithm that provides differentiated service to workloads concurrently accessing a consolidated RAID storage system. FAIRIO enforces proportional sharing of I/O service through fair scheduling of disk time. During each cycle of the algorithm, I/O requests are scheduled according to workload weights and disk-time utilization history. There are several parameters in the FAIRIO scheduler that can affect its behavior. One parameter in particular, the length of a scheduling cycle, can affect the scheduler's ability to react to and compensate for changes in workload access characteristics. Unfortunately, there is no cycle length that is optimal for all workloads and, thus, it must be chosen according to the workload environment. This technical report describes a heuristic that can be used to choose a favorable cycle length that promotes performance differentiation given a priori knowledge of workload access behavior. The heuristic is validated using simulations driven by several real and synthetic workload scenarios that represent a broad range of request types, sizes, and access characteristics. We show that when workload weights properly map to workload requirements, cycle lengths deduced by our heuristic promote differentiated disk-time sharing within 3.9% of perfect proportionality to workload weights.

Technical Report UTEP-CS-11-52, August 2011
The CleanJava Language for Functional Program Verification
Yoonsik Cheon, Cesar Yeep, and Melisa Vela

Unlike Hoare-style program verification, functional program verification supports forward reasoning by viewing a program as a mathematical function from one program state to another and proving its correctness by essentially comparing two mathematical functions: the function computed by the program and its specification. Since it requires a minimal mathematical background and reflects the way that programmers reason informally about the correctness of a program, it can be taught and practiced effectively. However, there has been no formal notation supporting functional program verification. In this article, we describe a formal notation for writing functional program specifications for Java programs. The notation, called CleanJava, is based on the Java expression syntax and is extended with a mathematical toolkit consisting of sets and sequences. The vocabulary of CleanJava can also be enriched by introducing user-specified definitions such as user-defined mathematical functions and specification-only methods. We believe that CleanJava is a good notation for writing functional specifications and expect it to promote the use of functional program verification by making it possible to specify a wide range of Java programs.

Technical Report UTEP-CS-11-51, August 2011
How to Encourage Imperfect Individuals to Care More about Society in General: a Utility-Theory Approach

Published in Applied Mathematical Sciences, 2012, Vol. 6, No. 13, pp. 645-649.

For a society to function efficiently, it is desirable that all members of this society care not only about themselves, but also about the society as a whole, i.e., about all the other individuals from the society. In practice, most people are only capable of caring about a few other individuals. We analyze this problem from the viewpoint of decision theory and show that even with such imperfect individuals, it is possible to make sure that everyone's decisions are affected by the society as a whole: namely, it is sufficient to make sure that people have emotional attachment to those few individuals who are capable of caring about the society as a whole.

As a side effect, our result provides a possible explanation of why the Biblical commandment to love your God encourages ethical behavior.

Technical Report UTEP-CS-11-50, August 2011
Updated version UTEP-CS-11-50a, May 2012
I-Complexity and Discrete Derivative of Logarithms: A Symmetry-Based Explanation

Published in Journal of Uncertain Systems, 2012, Vol. 6, No. 2, pp. 118-121.

In many practical applications, it is useful to consider the Kolmogorov complexity K(s) of a given string s, i.e., the shortest length of a program that generates this string. Since Kolmogorov complexity is, in general, not computable, it is necessary to use computable approximations K~(s) to K(s). Usually, to describe such an approximation, we take a compression algorithm and use the length of the compressed string as K~(s). This approximation, however, is not perfect: e.g., for most compression algorithms, adding a single bit to the string s can drastically change the value K~(s) -- while the actual Kolmogorov complexity only changes slightly. To avoid this problem, V. Becher and P. A. Heiber proposed a new approximation called I-complexity. The formulas for this approximation depend on selecting an appropriate function F(x). Empirically, the function F(x) = log(x) works the best. In this paper, we show that this empirical fact can be explained if we take into account the corresponding symmetries.
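
The compression-based approximation K~(s) mentioned above is easy to sketch (here with zlib's DEFLATE standing in for the compression algorithm; the example strings are made up):

```python
import random
import zlib

def K_approx(s: bytes) -> int:
    """Computable stand-in for Kolmogorov complexity K(s):
    the length of the compressed string."""
    return len(zlib.compress(s, 9))

random.seed(0)
repetitive = b"ab" * 500  # has a short description: "repeat 'ab' 500 times"
noisy = bytes(random.randrange(256) for _ in range(1000))  # no short description
assert K_approx(repetitive) < K_approx(noisy)
```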

Technical Report UTEP-CS-11-49, August 2011
How Accurately Should We Write on the Board? When Marking Comments on Student Papers?
Martine Ceberio and Olga Kosheleva

Published in Journal of Uncertain Systems, 2012, Vol. 6, No. 2, pp. 89-91.

Writing on the board is an important part of a lecture. Lecturers' handwriting is not always perfect. Usually, a lecturer can write more slowly and more legibly; this increases understandability but slows down the lecture. In this paper, we analyze the optimal trade-off between speed and legibility.

Technical Report UTEP-CS-11-48, August 2011
Efficient Geophysical Technique of Vertical Line Elements as a Natural Consequence of General Constraints Techniques
Rolando Cardenas and Martine Ceberio

Published in Journal of Uncertain Systems, 2012, Vol. 6, No. 2, pp. 86-88.

One of the main objectives of geophysics is to find how density d and other physical characteristics depend on a 3-D location (x,y,z). In general, in numerical methods, a way to find the dependence d(x,y,z) is to discretize the space, and to consider, as unknown, e.g., values d(x,y,z) on a 3-D rectangular grid. In this case, the desired density distribution is represented as a combination of point-wise density distributions. In geophysics, it turns out that a more efficient way to find the desired distribution is to represent it as a combination of thin vertical line elements that start at some depth and go indefinitely down. In this paper, we show that the empirical success of such vertical line element techniques can be naturally explained if we recall that, in addition to the equations which relate the observations and the unknown density, we also take into account geophysics-motivated constraints.

Technical Report UTEP-CS-11-47, August 2011
A New Justification for Weighted Average Aggregation in Fuzzy Techniques
Jaime Nava

To appear in Journal of Uncertain Systems, 2012, Vol. 6, No. 2.

In many practical situations, we need to decide whether a given solution is good enough, based on the degrees ai to which different criteria are satisfied. In this paper, we show that natural requirements lead to the weighted average decision, according to which a solution is acceptable if w1 * a1 + ... + wn * an > t for some weights wi and threshold t.
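
The resulting decision rule is a one-liner (the degrees, weights, and threshold below are hypothetical):

```python
def acceptable(degrees, weights, threshold):
    """Weighted-average aggregation: accept a solution iff
    w1*a1 + ... + wn*an > t."""
    return sum(w * a for w, a in zip(weights, degrees)) > threshold

# three criteria satisfied to degrees 0.9, 0.6, 0.8, with weights 0.5, 0.2, 0.3
assert acceptable([0.9, 0.6, 0.8], [0.5, 0.2, 0.3], 0.7)       # 0.81 > 0.7
assert not acceptable([0.9, 0.6, 0.8], [0.5, 0.2, 0.3], 0.85)  # 0.81 < 0.85
```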

Technical Report UTEP-CS-11-46, August 2011
Updated version UTEP-CS-11-46a, March 2012
Prediction in Econometrics: Towards Mathematical Justification of Simple (and Successful) Heuristics
Vladik Kreinovich, Hung T. Nguyen, and Songsak Sriboonchitta

Preliminary version published in Proceedings of the Fifth International Conference of the Thailand Econometric Society, Chiang Mai, Thailand, January 12-13, 2012; full version published in International Journal of Intelligent Technologies and Applied Statistics (IJITAS), 2012, Vol. 5, No. 4, pp. 443-460.

Many heuristic and semi-heuristic methods have been proposed to predict economic and financial processes. Some of these heuristics are intuitively reasonable; others seemingly contradict our intuition. The success of these heuristics leads to a reasonable conjecture that these heuristic methods must have a more fundamental justification. In this paper, we provide such a justification for two simple (and successful) prediction heuristics: the intuitive exponential smoothing that provides a reasonable prediction for slowly changing processes, and the seemingly counter-intuitive idea of an increase in volatility as a predictor of trend reversal. As a possible application of these ideas, we consider a new explanation of the price transmission phenomenon.
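
The first of these heuristics is simple enough to state directly; a minimal sketch (the smoothing parameter alpha and the series are illustrative):

```python
def exp_smooth(series, alpha=0.3):
    """Simple exponential smoothing: s_t = alpha * x_t + (1 - alpha) * s_{t-1};
    the final smoothed value serves as the one-step-ahead prediction."""
    s = series[0]
    for x in series[1:]:
        s = alpha * x + (1 - alpha) * s
    return s

# a slowly changing process: the prediction stays inside the observed range
prediction = exp_smooth([10.0, 10.4, 10.2, 10.6, 10.5])
assert 10.0 < prediction < 10.6
```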

Technical Report UTEP-CS-11-45, August 2011
Revised version UTEP-CS-11-45b, January 2013
A Simple Physics-Motivated Equivalent Reformulation of P=NP that Makes This Equality (Slightly) More Plausible

Published in Cybernetics and Physics, 2012, Vol. 1, No. 4, pp. 287-291.

In our opinion, one of the reasons why the problem P=NP? is so difficult is that while there are good intuitive arguments in favor of P=/=NP, there is a lack of intuitive arguments in favor of P=NP. In this paper, we provide such an argument -- based on the fact that in physics, many dependencies are scale-invariant: their expression does not change if we simply change the unit in which we measure the corresponding input quantity (e.g., replace meters by centimeters). It is reasonable to imagine similar behavior for the time complexity tA(n) of algorithms A: that the form of this dependence does not change if we change the unit in which we measure the input length (e.g., from bits to bytes). One can then easily prove that the existence of such scale-invariant algorithms for solving, e.g., propositional satisfiability is equivalent to P=NP. This equivalent reformulation of the formula P=NP is, in our opinion, much more intuitively reasonable than the original formula -- at least to those who are familiar with the importance of scale-invariance in physics.

Technical Report UTEP-CS-11-44, August 2011
High-Concentration Chemical Computing Techniques for Solving Hard-To-Solve Problems, and Their Relation to Numerical Optimization, Neural Computing, Reasoning Under Uncertainty, and Freedom Of Choice

Published in: Evgeny Katz (ed.), Molecular and Supramolecular Information Processing: From Molecular Switches to Logical Systems, Wiley-VCH, Weinheim, Germany, 2012, pp. 209-235.

Chemical computing -- using chemical reactions to perform computations -- is a promising way to solve computationally intensive problems. Chemical computing is promising because it has the potential of using up to 10^{23} molecules as processors working in parallel -- and thus, has a potential of an enormous speedup. Unfortunately, for hard-to-solve (NP-complete) problems a natural chemical computing approach for solving them is exponentially slow. In this chapter, we show that the corresponding computations can become significantly faster if we use very-high-concentration chemical reactions, concentrations at which the usual equations of chemical kinetics no longer apply. We also show that the resulting method is related to numerical optimization, neural computing, reasoning under uncertainty, and freedom of choice.

Technical Report UTEP-CS-11-43, August 2011
Updated version UTEP-CS-11-43a, August 2011
All Kinds of Behavior are Possible in Chemical Kinetics: A Theorem and Its Potential Applications to Chemical Computing

Published in: Evgeny Katz (ed.), Molecular and Supramolecular Information Processing: From Molecular Switches to Logical Systems, Wiley-VCH, Weinheim, Germany, 2012, pp. 237-258.

Until the late 1950s, it was believed that the processes described by the equations of chemical kinetics are simple: in the course of each chemical reaction, concentrations of some chemical substances decrease while concentrations of other substances increase. This belief was shattered when the first periodic reaction -- the famous Belousov-Zhabotinsky reaction -- was discovered. Since then, it has been shown that many other types of unusual behavior are possible for chemical systems. This discovery led to the possibility of finding chemical reactions that emulate non-trivial transformations that occur during computations -- and thus, of performing computations "in vitro", by actually performing the corresponding chemical reactions. The potential advantages of such chemical computing are numerous; the main advantage is that with 10^{23} molecules performing computations in parallel, we have a potential for an unheard-of parallelization -- and thus, an unheard-of speed-up. The possibility of computing "in vitro" was at first only theoretically conjectured, but then, in 1994, L. Adleman actually performed successful chemical computations. This started the current boom in chemical computing, with many new ideas and devices appearing all the time.

From both practical and theoretical viewpoints, chemical computing has been a clear success story. However, one open problem remained in this area: while many types of behavior have been shown to occur in chemical kinetics, it has not been known whether all types of behavior are possible. In this paper, we prove that every possible behavior can indeed be implemented in an appropriate chemical kinetics system. This result has the following direct implication for chemical computing: no matter what computational device one invents, with whatever weird behavior, it is, in principle, possible to emulate this device by appropriate chemical reactions. In this sense, chemical computing is truly ubiquitous.

Technical Report UTEP-CS-11-42, August 2011
Linear Neural Networks Revisited: From PageRank to Family Happiness

The study of Artificial Neural Networks started with the analysis of linear neurons. It was then discovered that networks consisting only of linear neurons cannot describe non-linear phenomena. As a result, most currently used neural networks consist of non-linear neurons. In this paper, we show that in many cases, linear neurons can still be successfully applied. This idea is illustrated by two examples: the PageRank algorithm underlying the successful Google search engine and the analysis of family happiness.
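
A toy sketch of the linear update behind PageRank (the damping factor 0.85 is the commonly cited value; the three-node link graph is made up, and dangling nodes are not handled): each node's new rank is a fixed linear combination of the ranks of the nodes linking to it, i.e., exactly a network of linear neurons iterated to a fixed point.

```python
def pagerank(links, d=0.85, iters=50):
    """Power iteration for PageRank; links maps each node to the list of
    nodes it links to (every node is assumed to have at least one outlink)."""
    nodes = list(links)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iters):
        rank = {v: (1 - d) / n
                   + d * sum(rank[u] / len(links[u]) for u in nodes if v in links[u])
                for v in nodes}
    return rank

# hypothetical graph: A and B both link to C, C links back to A
r = pagerank({"A": ["C"], "B": ["C"], "C": ["A"]})
assert r["C"] > r["A"] > r["B"]  # C collects rank from two nodes, B from none
```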

Technical Report UTEP-CS-11-41, August 2011
Updated version UTEP-CS-11-41b, August 2013
Density-Based Fuzzy Clustering as a First Step to Learning Rules: Challenges and Possible Solutions

Published in: Mo Jamshidi, Vladik Kreinovich, and Janusz Kacprzyk (eds.), Proceedings of 3rd World Conference on Soft Computing, San Antonio, December 15-18, 2013, pp. 357-372.

In many practical situations, it is necessary to cluster given situations, i.e., to divide them into groups so that situations within each group are similar to each other. This is how we humans usually make decisions: instead of taking into account all the tiny details of a situation, we classify the situation into one of the few groups, and then make a decision depending on the group containing a given situation. When we have many situations, we can describe the probability density of different situations. In terms of this density, clusters are connected sets with higher density separated by sets of smaller density. It is therefore reasonable to define clusters as connected components of the set of all the situations in which the density exceeds a certain threshold t. This idea indeed leads to reasonable clustering. It turns out that the resulting clustering works best if we use a Gaussian function for smoothing when estimating the density, and we select a threshold in a certain way. In this paper, we provide a theoretical explanation for this empirical optimality. We also show how the above clustering algorithm can be modified so that it takes into account that we are not absolutely sure whether each observed situation is of the type in which we are interested, and takes into account that some situations "almost" belong to a cluster.
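
The threshold-a-density idea can be sketched in one dimension (the data, bandwidth h, and threshold t below are made up; a full implementation would also address the optimal choices discussed in the paper):

```python
from math import exp, pi, sqrt

def kde(points, h, x):
    """Gaussian-kernel estimate of the density at x."""
    return sum(exp(-((x - p) / h) ** 2 / 2) for p in points) / (len(points) * h * sqrt(2 * pi))

def density_clusters(points, h=0.3, t=0.2, step=0.01):
    """Clusters = connected components of {x : density(x) > t},
    found by scanning a 1-D grid."""
    x, hi = min(points) - 3 * h, max(points) + 3 * h
    clusters, current = [], None
    while x <= hi:
        if kde(points, h, x) > t:
            current = (current[0], x) if current else (x, x)
        elif current:
            clusters.append(current)
            current = None
        x += step
    if current:
        clusters.append(current)
    return clusters

data = [0.0, 0.1, 0.2, 2.0, 2.1, 2.2]  # two well-separated groups
assert len(density_clusters(data)) == 2
```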

Technical Report UTEP-CS-11-40, July 2011
Updated version UTEP-CS-11-40a, August 2013
Why A Model Produced by Training a Neural Network Is Often More Computationally Efficient than a Nonlinear Regression Model: A Theoretical Explanation

Published in Journal of Uncertain Systems, 2014, Vol. 8, No. 3, pp. 193-204.

Many real-life dependencies can be reasonably accurately described by linear functions. If we want a more accurate description, we need to take non-linear terms into account. To take nonlinear terms into account, we can either explicitly add quadratic terms to the regression equation, or, alternatively, we can train a neural network with a non-linear activation function. At first glance, regression algorithms lead to simpler expressions, but in practice, often, a trained neural network turns out to be a more computationally efficient way of predicting the corresponding dependence. In this paper, we provide a reasonable explanation for this empirical fact.

Technical Report UTEP-CS-11-39, July 2011
Maximum Likelihood Approach to Pointwise Estimation in Statistical Data Processing under Interval Uncertainty
Nitaya Buntao, Sa-aat Niwitpong, and Vladik Kreinovich

Traditional statistical estimates C(x1, ..., xn) for different statistical characteristics (such as mean, variance, etc.) implicitly assume that we know the sample values x1, ..., xn exactly. In practice, the sample values Xi come from measurements and are, therefore, in general, different from the actual (unknown) values xi of the corresponding quantities. Sometimes, we know the probabilities of different values of the measurement error ΔXi = Xi - xi, but often, the only information that we have about the measurement error is the upper bound Δi on its absolute value -- provided by the manufacturer of the corresponding measuring instrument. In this case, the only information that we have about the actual values xi is that they belong to the intervals [Xi - Δi, Xi + Δi].

In general, different values xi from the corresponding interval [Xi - Δi, Xi + Δi] lead to different values of the corresponding statistical characteristic C(x1, ..., xn). In this case, it is desirable to find the set of all possible values of this characteristic. For continuous estimates C(x1, ..., xn), this range is an interval.

The values of C are used, e.g., in decision making -- e.g., in a control problem, to select an appropriate control value. In this case, we need to select a single value from the corresponding interval. It is reasonable to select a value which is, in some sense, the most probable. In this paper, we show how the Maximum Likelihood approach can provide such a value, i.e., how it can produce pointwise estimates in statistical data processing under interval uncertainty.
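As a minimal sketch of the interval setting described above (not of the Maximum Likelihood method itself), the following code computes the range of the sample mean when each sample value is only known to lie in [Xi - Δi, Xi + Δi]; since the mean is increasing in each xi, the interval endpoints give the exact range.

```python
def mean_range(measured, deltas):
    """Range of the sample mean when each actual value x_i is only
    known to lie in [X_i - D_i, X_i + D_i].  The mean is increasing
    in every x_i, so its extreme values are attained at the endpoints."""
    n = len(measured)
    lo = sum(X - D for X, D in zip(measured, deltas)) / n
    hi = sum(X + D for X, D in zip(measured, deltas)) / n
    return lo, hi

print(mean_range([1.0, 2.0, 3.0], [0.1, 0.1, 0.1]))  # about (1.9, 2.1)
```

Any single value selected from this range -- e.g., by the Maximum Likelihood approach described in the paper -- is then a pointwise estimate.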

Technical Report UTEP-CS-11-38, July 2011
Updated version UTEP-CS-11-38a, January 2012
Towards Symmetry-Based Explanation of (Approximate) Shapes of Alpha-Helices and Beta-Sheets (and Beta-Barrels) in Protein Structure

Published in Symmetry, 2012, Vol. 4, No. 1, pp. 15-25.

Protein structure is invariably connected to protein function. There are two important secondary structure elements: alpha-helices and beta-sheets (which sometimes come in the shape of beta-barrels). The actual shapes of these structures can be complicated, but in the first approximation, they are usually approximated by, correspondingly, cylindrical spirals and planes (and cylinders, for beta-barrels). In this paper, following the ideas pioneered by the renowned mathematician M. Gromov, we use natural symmetries to show that, under reasonable assumptions, these geometric shapes are indeed the best approximating families for secondary structures.

Original file in pdf
Updated version in pdf

Technical Report UTEP-CS-11-37, July 2011
Theoretical Explanation of Bernstein Polynomials' Efficiency: They Are Optimal Combination of Optimal Endpoint-Related Functions

In many applications of interval computations, it turned out to be beneficial to represent polynomials on a given interval [x-, x+] as linear combinations of Bernstein polynomials (x - x-)^k * (x+ - x)^(n-k). In this paper, we provide a theoretical explanation for this empirical success: namely, we show that under reasonable optimality criteria, Bernstein polynomials can be uniquely determined from the requirement that they are optimal combinations of optimal polynomials corresponding to the interval's endpoints.
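As a quick illustration of the Bernstein basis on an interval [x-, x+] -- a hedged sketch using the standard binomial normalization, which the abstract leaves implicit -- the following code evaluates the basis polynomials and checks that they form a partition of unity:

```python
from math import comb

def bernstein_basis(k, n, x, x_lo, x_hi):
    """k-th Bernstein basis polynomial of degree n on [x_lo, x_hi],
    proportional to (x - x_lo)^k * (x_hi - x)^(n - k)."""
    w = x_hi - x_lo
    return comb(n, k) * (x - x_lo)**k * (x_hi - x)**(n - k) / w**n

# By the binomial theorem, the n+1 basis polynomials sum to 1 at any x:
x_lo, x_hi, n = -1.0, 2.0, 4
total = sum(bernstein_basis(k, n, 0.5, x_lo, x_hi) for k in range(n + 1))
print(round(total, 10))  # 1.0
```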

Technical Report UTEP-CS-11-36, July 2011
Towards Fast and Accurate Algorithms for Processing Fuzzy Data: Interval Computations Revisited

Published in International Journal of General Systems, 2013, Vol. 42, No. 2, pp. 197-223.

In many practical applications, we need to process data -- e.g., to predict the future values of different quantities based on their current values. Often, the only information that we have about the current values comes from experts, and is described in informal ("fuzzy") terms like "small". To process such data, it is natural to use fuzzy techniques, techniques specifically designed by Lotfi Zadeh to handle such informal information.

In this survey, we start by revisiting the motivation behind Zadeh's formulas for processing fuzzy data, and explain how the algorithmic problem of processing fuzzy data can be described in terms of interval computations (alpha-cuts). Many fuzzy practitioners claim "I tried interval computations, they did not work" -- meaning that they got estimates which are much wider than the desired alpha-cuts. We show that such statements are usually based on a (widespread) misunderstanding -- that interval computations simply mean replacing each arithmetic operation with the corresponding operation with intervals. We show that while such straightforward interval techniques indeed often lead to over-wide estimates, the current advanced interval computations techniques result in estimates which are much more accurate.

We overview such advanced interval computations techniques, and show that by using them, we can efficiently and accurately process fuzzy data.
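The contrast between straightforward interval evaluation and a more advanced technique can be sketched as follows; the function f(x) = x(1 - x) and the interval [0.4, 0.6] are our own illustrative choices, and the mean-value form is just one of the advanced techniques surveys of this kind cover:

```python
class Interval:
    """Minimal interval arithmetic: [lo, hi] with +, -, *."""
    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi
    def __add__(self, o):
        o = _coerce(o)
        return Interval(self.lo + o.lo, self.hi + o.hi)
    __radd__ = __add__
    def __sub__(self, o):
        o = _coerce(o)
        return Interval(self.lo - o.hi, self.hi - o.lo)
    def __rsub__(self, o):
        return _coerce(o) - self
    def __mul__(self, o):
        o = _coerce(o)
        ps = [a * b for a in (self.lo, self.hi) for b in (o.lo, o.hi)]
        return Interval(min(ps), max(ps))
    __rmul__ = __mul__
    def __repr__(self):
        return f"[{self.lo:g}, {self.hi:g}]"

def _coerce(v):
    return v if isinstance(v, Interval) else Interval(v, v)

X = Interval(0.4, 0.6)

# Straightforward ("naive") interval evaluation of f(x) = x * (1 - x):
naive = X * (1 - X)                                # about [0.16, 0.36]

# Mean-value form: f(m) + f'(X) * (X - m), with m the midpoint and the
# derivative f'(x) = 1 - 2x evaluated over the whole interval:
m = 0.5
mean_value = m * (1 - m) + (1 - 2 * X) * (X - m)   # about [0.23, 0.27]

print(naive, mean_value)  # true range is [0.24, 0.25]
```

The naive enclosure has width 0.2, while the mean-value form shrinks it to 0.04: a small example of how advanced techniques tighten the estimates.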

We wrote this survey with three audiences in mind. First, we want fuzzy researchers and practitioners to understand the current advanced interval computations techniques and to use them to come up with faster and more accurate algorithms for processing fuzzy data. For this "fuzzy" audience, we explain these current techniques in detail. Second, we also want interval researchers to better understand this important application area for their techniques. For this "interval" audience, we want to explain where fuzzy techniques come from, what the possible variants of these techniques are, and to which problems interval techniques can be applied. These readers need to avoid their own frequent misunderstanding -- that fuzzy techniques are "magical" heuristic tools that are only justified by intuition and have no mathematical justification. Readers of both types can skip the parts they already know.

Finally, we also want to target people who solve practical data processing problems -- and who may not be familiar with either fuzzy or interval techniques. We want these readers to get an understanding of both the problem of processing fuzzy data and the interval techniques for solving this problem.

Technical Report UTEP-CS-11-35, June 2011
First revised version UTEP-CS-11-35b, August 2012
Second version UTEP-CS-11-35c, September 2012
Final version UTEP-CS-11-35d, January 2013
Is It Possible to Have a Feasible Enclosure-Computing Method Which Is Independent of the Equivalent Form?

Published in Reliable Computing, 2013, Vol. 18, pp. 1-8.

The problem of computing the range [y] of a given function f(x1, ..., xn) over given intervals [xi] -- often called the main problem of interval computations -- is, in general, NP-hard. This means that unless P = NP, it is not possible to have a feasible (= polynomial time) algorithm that always computes the desired range. Instead, interval computations algorithms compute an enclosure [Y] for the desired range. For all known feasible enclosure-computing methods -- starting with straightforward interval computations -- there exist two expressions f(x1, ..., xn) and g(x1, ..., xn) for computing the same function that lead to different enclosures. We prove that, unless P = NP, this is inevitable: it is not possible to have a feasible enclosure-computing method which is independent of the equivalent form.
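The dependence on the equivalent form is easy to demonstrate for straightforward interval computations (a toy illustration, not the paper's proof): the expressions f(x) = x - x and g(x) = 0 define the same function, yet yield different enclosures:

```python
def sub(a, b):
    """Straightforward interval subtraction: [a] - [b]."""
    return (a[0] - b[1], a[1] - b[0])

x = (0.0, 1.0)
# f(x) = x - x and g(x) = 0 compute the same function, but naive
# interval evaluation of the two forms gives different enclosures:
print(sub(x, x))   # (-1.0, 1.0): over-wide enclosure from the form x - x
print((0.0, 0.0))  # exact range from the equivalent form g(x) = 0
```

The paper shows that, unless P = NP, no feasible enclosure method can avoid this form-dependence entirely.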

Original file in pdf
First revised version in pdf
Second revised version in pdf
Final version in pdf

Technical Report UTEP-CS-11-34, June 2011
Orthogonal Bases Are the Best: A Theorem Justifying Bruno Apolloni's Heuristic Neural Network Idea

Published in Journal of Uncertain Systems, 2012, Vol. 6, No. 2, pp. 122-127.

One of the main problems with neural networks is that they are often very slow in learning the desired dependence. To speed up neural networks, Bruno Apolloni proposed to orthogonalize neurons during training, i.e., to select neurons whose output functions are orthogonal to each other. In this paper, we use symmetries to provide a theoretical explanation for this heuristic idea.
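A minimal sketch of orthogonalizing neuron outputs (our own illustration via classical Gram-Schmidt on output functions sampled at a few points; the actual training procedure in Apolloni's work may differ):

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(vectors):
    """Orthogonalize a list of vectors (here: neuron output functions
    sampled on a grid) by classical Gram-Schmidt projection."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            coef = dot(w, b) / dot(b, b)
            w = [wi - coef * bi for wi, bi in zip(w, b)]
        if any(abs(wi) > 1e-12 for wi in w):  # drop (near-)dependent outputs
            basis.append(w)
    return basis

# Two sampled neuron outputs that overlap in the first coordinate:
outputs = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0]]
b1, b2 = gram_schmidt(outputs)
print(dot(b1, b2))  # ~0: the orthogonalized outputs no longer overlap
```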

Technical Report UTEP-CS-11-33, June 2011
Tropical (Idempotent) Algebras as a Way to Optimize Fuzzy Control
Jaime Nava

Published in International Journal of Innovative Management, Information & Production (IJIMIP), 2013, Vol. 4, No. 1, pp. 72-87.

Fuzzy control is a methodology that transforms control rules (described by an expert in words of a natural language) into a precise control strategy. There exist several versions of this transformation. The main difference between these versions is in how they interpret logical connectives "and" and "or", i.e., in other words, what reasoning method a version uses. Which of these versions should we choose? It turns out that on different stages of control, different reasoning methods lead to better control results. In this paper, we describe the choice of reasoning methods that optimize control results in terms of smoothness and stability. It turns out that reasoning methods which are optimal on each stage correspond to tropical algebras -- algebras isomorphic to the set of real numbers with operations plus and maximum.

Technical Report UTEP-CS-11-32, June 2011
Towards a "Generic" Notion of Genericity: From "Typical" and "Random" to Meager, Shy, etc.
Ali Jalal-Kamali, Ondrej Nebesky, Michael H. Durcholz, Vladik Kreinovich, and Luc Longpre

Published in Journal of Uncertain Systems, 2012, Vol. 6, No. 2, pp. 104-113.

In many application areas, it is important to study "generic" properties, i.e., properties which hold for "typical" examples. For example, if we know the probabilities of different events, we can consider a "random" object -- i.e., an object that, crudely speaking, does not belong to any class of "unusual" events (i.e., to any class with a small probability). In other cases, "typical" may mean not belonging to an "unusual" subset which is small in some other sense -- e.g., a subset of a smaller dimension. The corresponding notion of "typicalness" has been formalized for several cases, including the case of random events. In this case, the known Kolmogorov-Martin-Lof definition of randomness captures the idea that properties with probability 0 are impossible. In our previous papers, we modified this definition to take into account that from a practical viewpoint, properties with very small probabilities are often considered impossible as well. In this paper, we extend this definition to a general notion of "generic".

Technical Report UTEP-CS-11-31, June 2011
Revised version UTEP-CS-11-31b, September 2011
No-Free-Lunch Result for Interval and Fuzzy Computing: When Bounds Are Unusually Good, Their Computation is Unusually Slow

Published in: I. Batyrshin and G. Sidorov (eds.), Proceedings of the 10th Mexican International Conference on Artificial Intelligence MICAI'2011, Puebla, Mexico, November 26 - December 4, 2011, Springer Lecture Notes in Artificial Intelligence, Vol. 7095, pp. 13-23.

On several examples from interval and fuzzy computations and from related areas, we show that when the results of data processing are unusually good, their computation is unusually complex. This makes us think that there should be an analog of Heisenberg's uncertainty principle, well known in quantum mechanics: when we have an unusually beneficial situation in terms of results, it is not as perfect in terms of the computations leading to these results. In short, nothing is perfect.

Original file in pdf
Updated version in pdf

Technical Report UTEP-CS-11-30c, May 2012
Updated version UTEP-CS-11-30d, May 2012
Failure Analysis of a Complex System Based on Partial Information about Subsystems, with Potential Applications to Aircraft Maintenance
Vladik Kreinovich, Christelle Jacob, Didier Dubois, Janette Cardoso, and Martine Ceberio

Published in Applied and Computational Mathematics, 2012, Vol. 11, No. 2, pp. 165-179.

In many real-life applications (e.g., in aircraft maintenance), we need to estimate the probability of failure of a complex system (such as an aircraft as a whole or one of its subsystems). Complex systems are usually built with redundancy allowing them to withstand the failure of a small number of components. In this paper, we assume that we know the structure of the system, and, as a result, for each possible set of failed components, we can tell whether this set will lead to a system failure. For each component A, we know the probability P(A) of its failure with some uncertainty: e.g., we know the lower and upper bounds for this probability. Usually, it is assumed that failures of different components are independent events. Our objective is to use all this information to estimate the probability of failure of the entire complex system. In this paper, we describe several methods for solving this problem, including a new efficient method for such estimation based on Cauchy deviates.

Original file in pdf
Updated version in pdf

Technical Report UTEP-CS-11-30, June 2011
Updated version UTEP-CS-11-30b, August 2011
Estimating Probability of Failure of a Complex System Based on Inexact Information about Subsystems and Components, with Potential Applications to Aircraft Maintenance
Vladik Kreinovich, Christelle Jacob, Didier Dubois, Janette Cardoso, Martine Ceberio, and Ildar Batyrshin

Published in: I. Batyrshin and G. Sidorov (eds.), Proceedings of the 10th Mexican International Conference on Artificial Intelligence MICAI'2011, Puebla, Mexico, November 26 - December 4, 2011, Springer Lecture Notes in Artificial Intelligence, Vol. 7095, pp. 70-81.

In many real-life applications (e.g., in aircraft maintenance), we need to estimate the probability of failure of a complex system (such as an aircraft as a whole or one of its subsystems). Complex systems are usually built with redundancy allowing them to withstand the failure of a small number of components. In this paper, we assume that we know the structure of the system, and, as a result, for each possible set of failed components, we can tell whether this set will lead to a system failure. For each component A, we know the probability P(A) of its failure with some uncertainty: e.g., we know the lower and upper bounds for this probability. Usually, it is assumed that failures of different components are independent events. Our objective is to use all this information to estimate the probability of failure of the entire complex system. In this paper, we describe a new efficient method for such estimation based on Cauchy deviates.
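For monotone systems, interval bounds on the system failure probability can be obtained by evaluating at the interval endpoints; the 2-out-of-3 example below is our own illustration of this endpoint idea, not of the Cauchy-deviates method itself:

```python
def p_system_fail(p1, p2, p3):
    """Failure probability of a 2-out-of-3 redundant system (it fails
    when at least two of three independent components fail)."""
    return p1*p2 + p1*p3 + p2*p3 - 2*p1*p2*p3

# Interval bounds on each component's failure probability (made-up numbers):
bounds = [(0.01, 0.02), (0.01, 0.03), (0.02, 0.04)]

# P_fail is nondecreasing in each p_i for such a monotone system, so its
# range is attained at the all-lower and all-upper endpoint combinations:
lo = p_system_fail(*(b[0] for b in bounds))
hi = p_system_fail(*(b[1] for b in bounds))
print(lo, hi)
```

For non-monotone dependence, or many components, endpoint enumeration becomes expensive, which is one motivation for faster estimation methods such as the one the paper describes.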

Original file in pdf
Updated version in pdf

Technical Report UTEP-CS-11-29, June 2011
Estimating Mean and Variance under Interval Uncertainty: Dynamic Case

Published in Proceedings of the 2011 Sixth International Conference on Soft Computing, Computing with Words and Perceptions in System Analysis, Decision and Control ICSCCW'2011, Antalya, Turkey, September 1-2, 2011, pp. 85-93.

In many practical situations, it is important to estimate the mean E and the variance V from the sample values x1, ..., xn. Usually, in statistics, we consider the case when the parameters like E and V do not change with time and when the sample values xi are known exactly. In practice, the values xi come from measurements, and measurements are never 100% accurate. In many cases, we only know the upper bound Di on the measurement error. In this case, once we know the measured value Xi, we can conclude that the actual (unknown) value xi belongs to the interval [Xi - Di, Xi + Di]. Different values xi from these intervals lead, in general, to different values of E and V. It is therefore desirable to find the ranges [E] and [V] of all possible values of E and V. While this problem is, in general, NP-hard, in many practical situations, there exist efficient algorithms for computing such ranges.

In practice, processes are dynamic. As a result, reasonable estimates for E and V assign more weight to more recent measurements and less weight to the past ones. In this paper, we extend known algorithms for computing the ranges [E] and [V] to such dynamic estimates.
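A hedged sketch of the dynamic case: the exponential-forgetting weights alpha^(n-1-i) below are an illustrative assumption (the paper's weighting scheme may differ), and, as with the ordinary mean, monotonicity in each xi makes endpoint evaluation exact:

```python
def weighted_mean_range(measured, deltas, alpha=0.9):
    """Range of an exponentially-discounted mean: measurement i gets
    weight alpha^(n-1-i), so more recent samples weigh more.  The
    weighted mean is increasing in each x_i, so the interval endpoints
    give the exact range."""
    n = len(measured)
    weights = [alpha**(n - 1 - i) for i in range(n)]
    total = sum(weights)
    lo = sum(w * (X - D) for w, X, D in zip(weights, measured, deltas)) / total
    hi = sum(w * (X + D) for w, X, D in zip(weights, measured, deltas)) / total
    return lo, hi

print(weighted_mean_range([1.0, 2.0, 3.0], [0.1, 0.1, 0.1]))
```

Computing the range of the weighted variance [V] is harder, since the variance is not monotone in each xi; that is where the extended algorithms of the paper come in.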

Technical Report UTEP-CS-11-28, June 2011
Product of Partially Ordered Sets (Posets), with Potential Applications to Uncertainty Logic and Space-Time Geometry
Francisco Zapata, Olga Kosheleva, and Karen Villaverde

Published in International Journal of Innovative Management, Information & Production (IJIMIP), 2011, Vol. 2, No. 4, pp. 10-28.

One of the main objectives of science and engineering is to help people select the most beneficial decisions. To make these decisions,

• we must know people's preferences,
• we must have the information about different events -- possible consequences of different decisions, and
• since information is never absolutely accurate and precise, we must also have information about the degree of certainty.
All these types of information naturally lead to partial orders:
• For preferences, a < b means that b is preferable to a. This relation is used in decision theory.
• For events, a < b means that a can influence b. This causality relation is used in space-time physics.
• For uncertain statements, a < b means that a is less certain than b. This relation is used in logics describing uncertainty such as fuzzy logic.
In many practical situations, we are analyzing a complex system that consists of several subsystems. Each subsystem can be described as a separate ordered space. To get a description of the system as a whole, we must therefore combine these ordered spaces into a single space that describes the whole system.

In this paper, we consider the general problem of how to combine two ordered spaces A and B into one. We also analyze which properties of orders are preserved under the resulting products.

Technical Report UTEP-CS-11-27a, June 2011
Revised version UTEP-CS-11-27c, February 2015

Short version published in Proceedings of the 2011 Sixth International Conference on Soft Computing, Computing with Words and Perceptions in System Analysis, Decision and Control ICSCCW'2011, Antalya, Turkey, September 1-2, 2011, pp. 21-29; extended version published in International Journal of Knowledge Engineering and Soft Data Paradigms, 2014, Vol. 4, No. 4, pp. 327-338.

In the traditional (static) fuzzy logic approach, we select an "and"-operation (t-norm) and an "or"-operation (t-conorm). The result of applying these selected operations may be somewhat different from the actual expert's degrees of belief in the corresponding logical combinations A & B and A \/ B of the original statements -- since these degrees depend not only on the expert's degrees of belief in statements A and B, but also in the extent to which the statements A and B are dependent. We show that dynamic fuzzy logic enables us to automatically take this dependence into account -- and thus, leads to more adequate "and"- and "or"-operations.

Original file in pdf
Revised version in pdf

Technical Report UTEP-CS-11-26, May 2011
Functional Verification of Class Invariants in CleanJava
Carmen Avila and Yoonsik Cheon

In a Cleanroom-style functional program verification, a program is viewed as a mathematical function from one program state to another, and verification is done by comparing two functions: the implemented and the expected behaviors of the program. The technique requires a minimal mathematical background and supports forward reasoning, but it does not support assertions such as class invariants. However, assertions such as class invariants are not only a practical programming tool; they also play a key role in the correctness proof of a program by specifying conditions and constraints that an object has to satisfy and thus defining the valid states of an object. We suggest a way to integrate the notion of a class invariant into functional program verification by using CleanJava as a specification notation and a verification framework. CleanJava is a formal annotation language for Java that supports Cleanroom-style functional program verification. We propose a small extension to CleanJava for specifying class invariants, along with a corresponding extension to its proof logic for verifying them. Our extension closely reflects the way programmers informally specify and reason about the correctness of a program. It allows one to use class invariants in the framework of a Cleanroom-style functional specification and verification.

Technical Report UTEP-CS-11-25, May 2011
Updated version UTEP-CS-11-25a, October 2011
Towards Interval Techniques for Model Validation

Published in Computing, 2012, Vol. 94, No. 2, pp. 257-269.

Most physical models are approximate. It is therefore important to find out how accurate are the predictions of a given model. This can be done by validating the model, i.e., by comparing its predictions with the experimental data. In some practical situations, it is difficult to directly compare the predictions with the experimental data, since models usually contain (physically meaningful) parameters, and the exact values of these parameters are often not known. One way to overcome this difficulty is to get a statistical distribution of the corresponding parameters. Once we substitute these distributions into a model, we get statistical predictions -- and we can compare the resulting probability distribution with the actual distribution of measurement results. In this approach, we combine all the measurement results, and thus, we are ignoring the information that some of these results correspond to the same values of the parameters -- e.g., they come from measuring the same specimen under different conditions. In this paper, we propose an interval approach that takes into account this important information. This approach is illustrated on the example of a benchmark thermal problem presented at the Sandia Validation Challenge Workshop (Albuquerque, New Mexico, May 2006).

Original file in pdf
Updated version in pdf

Technical Report UTEP-CS-11-24, May 2011
Updated version UTEP-CS-11-24a, April 2012
When Is Busemann Product a Lattice? A Relation Between Metric Spaces and Corresponding Space-Time Models
Hans-Peter Kuenzi, Francisco Zapata, and Vladik Kreinovich

Published in Applied Mathematical Sciences, 2012, Vol. 6, No. 66, pp. 3267-3276.

The causality relation of special relativity is based on the assumption that the speed of all physical processes is limited by the speed of light. As a result, an event (t,x) occurring at moment t at location x can influence an event (y,s) if and only if s >= t+d(x,y)/c. We can simplify this formula if we use units of time and distance in which c=1 (e.g., by using a light second as a unit of distance). In this case, the above causality relation takes the form s >= t+d(x,y). Since the actual space can be non-Euclidean, H. Busemann generalized this ordering relation to the case when points x, y, etc. are taken from an arbitrary metric space X. A natural question is: when is the resulting ordered space -- called a Busemann product -- a lattice? In this paper, we provide a necessary and sufficient condition for it being a lattice: it is a lattice if and only if X is a real tree, i.e., a metric space in which every two points are connected by exactly one arc, and this arc is geodesic (i.e., metrically isomorphic to an interval on a real line).
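The causality relation s >= t + d(x,y) (with c = 1) can be checked directly; the Euclidean plane below is just an illustrative choice of metric space, not a real tree:

```python
from math import dist

def can_influence(event_a, event_b):
    """Busemann-style causality with c = 1: event (t, x) can influence
    event (s, y) iff s >= t + d(x, y)."""
    t, x = event_a
    s, y = event_b
    return s >= t + dist(x, y)

# An event can influence a later event only if there is enough time for
# a signal (at speed <= 1) to cover the spatial distance d((0,0),(3,4)) = 5:
print(can_influence((0.0, (0.0, 0.0)), (5.0, (3.0, 4.0))))  # True
print(can_influence((0.0, (0.0, 0.0)), (4.0, (3.0, 4.0))))  # False
```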

Original file in pdf
Updated version in pdf

Technical Report UTEP-CS-11-23, May 2011
Estimating Probability of Failure of a Complex System Based on Partial Information about Subsystems and Components, with Potential Applications to Aircraft Maintenance
Christelle Jacob, Didier Dubois, Janette Cardoso, Martine Ceberio, and Vladik Kreinovich

Published in Proceedings of the International Workshop on Soft Computing Applications and Knowledge Discovery SCAKD'2011, Moscow, Russia, June 25, 2011, pp. 30-41.

In many real-life applications (e.g., in aircraft maintenance), we need to estimate the probability of failure of a complex system (such as an aircraft as a whole or one of its subsystems). Complex systems are usually built with redundancy allowing them to withstand the failure of a small number of components. In this paper, we assume that we know the structure of the system, and, as a result, for each possible set of failed components, we can tell whether this set will lead to a system failure. In some cases, for each component A, we know the probability P(A) of its failure; in other cases, however, we only know the lower and upper bounds for this probability. Sometimes, we only have expert estimates for these probabilities, estimates that can be described as fuzzy numbers.

Usually, it is assumed that failures of different components are independent events, but sometimes, we know that they are dependent -- in which case we usually do not have any specific information about their correlation. Our objective is to use all this information to estimate the probability of failure of the entire complex system. In this paper, we describe methods for such estimation.

Technical Report UTEP-CS-11-22, May 2011
Updated version UTEP-CS-11-22a, August 2011
How to Tell When a Product of Two Partially Ordered Spaces Has a Certain Property?
Francisco Zapata, Olga Kosheleva, and Karen Villaverde

Published in Journal of Uncertain Systems, 2012, Vol. 6, No. 2, pp. 152-160.

In this paper, we describe how checking whether a given property F is true for a product A1 X A2 of partially ordered spaces can be reduced to checking several related properties of the original spaces Ai.

This result can be useful in the analysis of properties of intervals [a,b] = {x: a <= x <= b} over general partially ordered spaces -- such as the space of all vectors with component-wise order or the set of all functions with component-wise ordering f <= g <--> for all x (f(x) <= g(x)). When we consider sets of pairs of such objects A1 X A2, it is natural to define the order on this set in terms of orders in A1 and A2 -- this is, e.g., how ordering and intervals are defined on the set R2 of all 2-D vectors.

This result can also be useful in the analysis of ordered spaces describing different degrees of certainty in expert knowledge.
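A small sketch of the component-wise order on a product (illustrated for R x R; the paper treats general partially ordered spaces):

```python
def leq_product(a, b):
    """Component-wise order on a product A1 x A2 (here R x R):
    (a1, a2) <= (b1, b2) iff a1 <= b1 and a2 <= b2."""
    return a[0] <= b[0] and a[1] <= b[1]

def in_interval(x, lo, hi):
    """x lies in the interval [lo, hi] = {x : lo <= x <= hi}."""
    return leq_product(lo, x) and leq_product(x, hi)

# In R^2 with component-wise order, intervals are axis-aligned boxes;
# note that the order is only partial: some pairs are incomparable.
print(in_interval((1, 2), (0, 0), (3, 3)))  # True
print(leq_product((1, 2), (2, 1)))          # False: incomparable pair
```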

Original file in pdf
Revised version in pdf

Technical Report UTEP-CS-11-21, May 2011
Revised version UTEP-CS-11-21a, June 2011
Why Fuzzy Transform Is Efficient in Large-Scale Prediction Problems: A Theoretical Explanation

Published in Advances in Fuzzy Systems, Vol. 2011, Article ID 985839, doi:10.1155/2011/985839.

In many practical situations like weather prediction, we are interested in large-scale (averaged) value of the predicted quantities. For example, it is impossible to predict the exact future temperature at different spatial locations, but we can reasonably well predict average temperature over a region. Traditionally, to obtain such large-scale predictions, we first perform a detailed integration of the corresponding differential equation, and then average the resulting detailed solution. This procedure is often very time-consuming, since we need to process all the details of the original data.

In our previous papers, we have shown that similar quality large-scale prediction results can be obtained if instead, we apply a much faster procedure -- first average the inputs (by applying an appropriate fuzzy transform), and then use these averaged inputs to solve the corresponding (discretization of the) differential equation.

In this paper, we provide a general theoretical explanation of why our semi-heuristic method works, i.e., why fuzzy transforms are efficient in large-scale predictions.

Original file in pdf
Updated version in pdf

Technical Report UTEP-CS-11-20, May 2011
Towards Optimal Knowledge Processing: From Centralization Through Cyberinfrastructure To Cloud Computing
Octavio Lerma, Eric Gutierrez, Chris Kiekintveld, and Vladik Kreinovich

Published in International Journal of Innovative Management, Information & Production (IJIMIP), 2011, Vol. 2, No. 2, pp. 67-72.

One of the most efficient ways to store and process data is cloud computing, in which we store the data so as to minimize expenses and increase efficiency. In this paper, we provide an analytical solution to the corresponding optimization problem.

Technical Report UTEP-CS-11-19, May 2011
Joggler: Data Harvest And Analysis Tool
Ondrej Nebesky

The increasing use of stand-alone systems for recording data has made data harvesting across many fields easier and faster. However, raw data collected from such systems need to be manipulated and processed to enable meaningful analysis. Although data are readily available, one major issue facing analysts and scientists is the collation of data from various sources. Without a standard data format, scientists and analysts must spend resources to bring data from multiple sources together. The problem is aggravated when a particular data source changes its data format.

This project introduces Joggler, a framework providing a collection of services and interfaces that enable analysts to quickly process data and bring them into other applications with a minimum of coding. The Joggler framework has been demonstrated on various projects, including weather data visualization and data visualization from the Jornada experimental site.

Technical Report UTEP-CS-11-18, May 2011
Uniqueness of Reconstruction for Yager's t-Norm Combination of Probabilistic and Possibilistic Knowledge

Published in International Journal of Intelligent Systems, 2012, Vol. 27, No. 1, pp. 16-22.

Often, about the same real-life system, we have both measurement-related probabilistic information expressed by a probability measure P(S) and expert-related possibilistic information expressed by a possibility measure M(S). To get the most adequate idea about the system, we must combine these two pieces of information. For this combination, R. Yager -- borrowing an idea from fuzzy logic -- proposed to use a t-norm f(a,b), such as the product f(a,b)=a*b, i.e., to consider a set function f(S)=f(P(S),M(S)). A natural question is: can we uniquely reconstruct the two parts of knowledge from this function f(S)? In our previous paper, we showed that such a unique reconstruction is possible for the product t-norm; in this paper, we extend this result to a general class of t-norms.

Technical Report UTEP-CS-11-17, April 2011
Revised version UTEP-CS-11-17a, May 2011
Linear-Time Resource Allocation in Security Games with Identical Fully Protective Resources
Octavio Lerma, Vladik Kreinovich, and Christopher Kiekintveld

Published in Proceedings of the AAAI Workshop on Applied Adversarial Reasoning and Risk Modeling AARM'11, San Francisco, California, August 7, 2011.

Game theory has become an important tool for making resource allocation decisions in security domains, including critical infrastructure protection. Many of these games are formulated as Stackelberg security games. We present new analysis and algorithms for a class of Stackelberg security games with identical, fully protective defender resources. The first algorithm has worst-case complexity linear in the number of possible targets, but works only for a restricted case. The second algorithm can find an optimal resource allocation for the general case in time O(n log(n)).
Original file in pdf
Updated version in pdf

Technical Report UTEP-CS-11-16, April 2011
Revised version UTEP-CS-11-16a, July 2011
Final version UTEP-CS-11-16b, August 2011
Reconstructing an Open Order from Its Closure, with Applications to Space-Time Physics and to Logic
Francisco Zapata and Vladik Kreinovich

Published in Studia Logica, 2012, Vol. 100, No. 1-2, pp. 419-435.

In his logical papers, Leo Esakia studied corresponding ordered topological spaces and order-preserving mappings. Similar spaces and mappings appear in many other application areas, such as the analysis of causality in space-time. It is known that under reasonable conditions, both the topology and the original order relation can be uniquely reconstructed if we know the "interior" < of the order relation. It is also known that in some cases, we can uniquely reconstruct < (and hence, the topology) from the original order relation. In this paper, we show that, in general, under reasonable conditions, the open order < (and hence, the corresponding topology) can be uniquely determined from its closure.

Original file in pdf
Revised version in pdf
Final version in pdf

Technical Report UTEP-CS-11-15, March 2011
Knowledge Annotations in Scientific Workflows: An Implementation in Kepler
Aida Gandara, George Chin Jr., Paulo Pinheiro da Silva, Signe White, Chandrika Sivaramakrishnan, and Terence Critchlow

Scientific research products are the result of long-term collaborations between teams. Scientific workflows are capable of helping scientists in many ways, including collecting information about how research was conducted (e.g., scientific workflow tools often collect and manage information about the datasets used and the data transformations applied). However, knowledge about why data was collected is rarely documented in scientific workflows. In this paper we describe a prototype system built to support the collection of scientific expertise that influences scientific analysis.
Through evaluating a scientific research effort underway at the Pacific Northwest National Laboratory, we identified features that would most benefit PNNL scientists in documenting how and why they conduct their research, making this information available to the entire team. The prototype system was built by enhancing the Kepler Scientific Workflow System to create knowledge-annotated scientific workflows and to publish them as semantic annotations.

Technical Report UTEP-CS-11-14, March 2011
Updated version UTEP-CS-11-14a, June 2011
Processing Interval Sensor Data in the Presence of Outliers, with Potential Applications to Localizing Underwater Robots
Jan Sliwka, Luc Jaulin, Martine Ceberio, and Vladik Kreinovich

Published in Proceedings of the 2011 IEEE International Conference on Systems, Man, and Cybernetics SMC'2011, Anchorage, Alaska, October 9-12, 2011, pp. 2330-2337.

Measurements are never absolutely accurate: the measurement result X is, in general, different from the actual (unknown) value x of the corresponding quantity. In many practical problems, we only know an upper bound D on the measurement error d = X - x. In such situations, once we know the measurement result, the only conclusion that we can make about the actual value x is that this value belongs to the interval [X - D, X + D]. There exist many efficient algorithms for processing such interval data. However, these algorithms usually assume that all the measurement results are valid. In reality, due to factors such as sensor malfunction, some measurement results may be way off (outliers): for these results, the difference between X and x is much larger than the upper bound D on the measurement error. In this paper, we overview the algorithmic problems related to processing interval sensor data in the presence of outliers. Our case study -- for which we develop and analyze these algorithms -- is localization of underwater robots, a problem in which a significant number of measurement results are outliers.

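A standard robust tool for this setting in interval-based robotics (shown here as a generic sketch, not necessarily in the exact form the paper uses) is the q-relaxed intersection: the set of values consistent with all but at most q of the measurement intervals, computed by a sweep over interval endpoints.

```python
def q_relaxed_intersection(intervals, q):
    """Set of points contained in at least (n - q) of the given
    intervals -- robust to up to q outlier measurements.  Returns a
    list of (lo, hi) pieces; runs in O(n log n) via an endpoint
    sweep."""
    assert 0 <= q < len(intervals)
    events = []
    for lo, hi in intervals:
        events.append((lo, +1))   # interval opens
        events.append((hi, -1))   # interval closes
    events.sort(key=lambda e: (e[0], -e[1]))  # opens before closes at ties
    need = len(intervals) - q
    depth, start, result = 0, None, []
    for x, d in events:
        prev = depth
        depth += d
        if prev < need <= depth:       # coverage rises to the threshold
            start = x
        elif depth < need <= prev:     # coverage drops below it
            result.append((start, x))
    return result
```

For example, with three range measurements [0, 4], [1, 5], [10, 12] and one allowed outlier, the consistent set is [1, 4]: the third interval is discarded as the outlier.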
Original file in pdf
Updated version in pdf

Technical Report UTEP-CS-11-13, March 2011
How to Combine Probabilistic and Possibilistic (Expert) Knowledge: Uniqueness of Reconstruction in Yager's (Product) Approach
Nitaya Buntao and Vladik Kreinovich

Published in International Journal of Innovative Management, Information & Production (IJIMIP), 2011, Vol. 2, No. 1, pp. 1-8.

Often, about the same real-life system, we have both measurement-related probabilistic information expressed by a probability measure P(S) and expert-related possibilistic information expressed by a possibility measure M(S). To get the most adequate idea about the system, we must combine these two pieces of information. For this combination, R. Yager -- borrowing an idea from fuzzy logic -- proposed to use the simple product t-norm, i.e., to consider a set function f(S) = P(S) * M(S). A natural question is: can we uniquely reconstruct the two parts of knowledge from this function f(S)? In this paper, we prove that while in the discrete case, the reconstruction is often not unique, in the continuous case, we can always uniquely reconstruct both components P(S) and M(S) from the combined function f(S). In this sense, Yager's combination is indeed an adequate way to combine the two parts of knowledge.

Technical Report UTEP-CS-11-12, March 2011
Estimating Risk of Extreme and Catastrophic Events under Interval Uncertainty
Nitaya Buntao and Vladik Kreinovich

In many application areas, we encounter heavy-tailed distributions -- for example, such distributions are ubiquitous in financial applications. These distributions are often described by the Pareto law. There exist techniques for estimating the parameters of the corresponding Pareto distributions based on a sample x1, ..., xn. In practice, however, we often only know the values xi with interval uncertainty.
In this paper, we show how to estimate the parameters of the Pareto distribution under such uncertainty, and how to describe deviation and dependence for general heavy-tailed distributions.

Technical Report UTEP-CS-11-11, March 2011
Estimating Covariance for Privacy Case under Interval (and Fuzzy) Uncertainty
Ali Jalal-Kamali, Vladik Kreinovich, and Luc Longpre

Published in: Ronald R. Yager, Marek Z. Reformat, Shahnaz N. Shahbazova, and Sergei Ovchinnikov (eds.), Proceedings of the World Conference on Soft Computing, San Francisco, CA, May 23-26, 2011.

One of the main objectives of collecting data in statistical databases (medical databases, census databases) is to find important correlations between different quantities. To enable researchers to look for such correlations, we should allow them to ask queries testing different combinations of such quantities. However, when we answer many such queries, we may inadvertently disclose information about individual patients -- information that should be private. One way to preserve privacy in statistical databases is to store ranges instead of the original values. For example, instead of the exact age of a patient in a medical database, we only store the information that this age is, e.g., between 60 and 70. This idea solves the privacy problem, but it makes statistical analysis more complex. Different possible values from the corresponding ranges lead, in general, to different values of the corresponding statistical characteristic; it is therefore desirable to find the range of all such values. It is known that for the mean and the variance, there exist feasible algorithms for computing such ranges. In this paper, we show that similar algorithms are possible for another important statistical characteristic -- covariance, whose value is important in computing correlations.

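Why feasible algorithms are needed at all can be seen from the naive approach: the sample covariance is linear in each individual input, so its extrema over interval data are attained at interval endpoints, but checking all endpoint combinations takes exponential time. The brute-force sketch below (illustration only; the paper's algorithms are feasible, i.e., efficient) makes the problem concrete.

```python
from itertools import product

def cov(xs, ys):
    """Sample covariance (1/n convention)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n

def cov_range(x_intervals, y_intervals):
    """Exact range of the covariance when each data point is only known
    to lie in an interval.  Covariance is linear in each x_i and each
    y_i separately, so its extrema over the box of possible values are
    attained at interval endpoints; this checks all 2^(2n) endpoint
    combinations -- exponential, hence only an illustration."""
    vals = [cov(xs, ys)
            for xs in product(*x_intervals)
            for ys in product(*y_intervals)]
    return min(vals), max(vals)
```

For instance, with x-values known only to within [0, 1] and [2, 3] and exact y-values 0 and 1, the covariance can range from 0.25 to 0.75 depending on which actual x-values are realized.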
Technical Report UTEP-CS-11-10, February 2011
From Processing Interval-Valued Fuzzy Data to General Type-2: Towards Fast Algorithms
Vladik Kreinovich

Published in Proceedings of the IEEE Symposium on Advances in Type-2 Fuzzy Logic Systems T2FUZZ'2011, part of the IEEE Symposium Series on Computational Intelligence, Paris, France, April 11-15, 2011, pp. ix-xii.

It is known that processing of data under general type-1 fuzzy uncertainty can be reduced to the simplest case -- that of interval uncertainty: namely, Zadeh's extension principle is equivalent to level-by-level interval computations applied to alpha-cuts of the corresponding fuzzy numbers. However, type-1 fuzzy numbers may not be the most adequate way of describing uncertainty, because they require that an expert describe his or her degree of confidence in a statement by an exact value. In practice, it is more reasonable to expect that the expert estimates his or her degree by using imprecise words from natural language -- which can be naturally formalized as fuzzy sets. The resulting type-2 fuzzy numbers represent the expert's opinions more adequately, but their practical use is limited by their seeming computational complexity. It turns out that for the practically important case of interval-valued fuzzy sets, processing such sets can also be reduced to interval computations -- and this idea can be naturally extended to arbitrary type-2 fuzzy numbers.

Technical Report UTEP-CS-11-09, February 2011
Revised version UTEP-CS-11-09b, March 2011
Towards Faster Estimation of Statistics and ODEs Under Interval, P-Box, and Fuzzy Uncertainty: From Interval Computations to Rough Set-Related Computations
Vladik Kreinovich

Published in: Sergey O. Kuznetsov et al. (eds.),
Proceedings of the Thirteenth International Conference on Rough Sets, Fuzzy Sets and Granular Computing RSFDGrC'2011, Moscow, Russia, June 25-27, 2011, Springer Lecture Notes in Artificial Intelligence LNAI, Springer-Verlag, Berlin, Heidelberg, 2011, Vol. 6743, pp. 3-10.

Interval computations estimate the uncertainty of the result of data processing in situations in which we only know the upper bounds Delta on the measurement errors. In interval computations, at each intermediate stage of the computation, we have intervals of possible values of the corresponding quantities. As a result, we often have bounds with excess width. One way to remedy this problem is to extend the interval technique to rough-set computations, where at each stage, in addition to intervals of possible values of the quantities, we also keep rough sets representing possible values of pairs (triples, etc.).
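The excess width mentioned above comes from the well-known "dependency problem" of naive interval arithmetic: the computation forgets that two occurrences of the same variable are correlated. A minimal sketch makes this concrete.

```python
class Interval:
    """Minimal interval arithmetic, enough to exhibit excess width."""
    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi

    def __add__(self, other):
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __sub__(self, other):
        # worst-case bounds: ignores any dependency between operands
        return Interval(self.lo - other.hi, self.hi - other.lo)

    def __repr__(self):
        return f"[{self.lo}, {self.hi}]"

x = Interval(1, 2)
print(x - x)  # prints [-1, 1], although x - x is always exactly 0
```

Keeping joint information about pairs of quantities -- as in the rough-set-related computations described above -- is one way to recover part of this lost dependency.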

Original file in pdf
Revised version in pdf

Technical Report UTEP-CS-11-08, January 2011
Towards a General Description of Translation-Invariant and Translation-Covariant Linear Transformations: A Natural Justification of Fourier Transforms and Fuzzy Transforms

Published in Proceedings of the World Congress of the International Fuzzy Systems Association IFSA'2011, Surabaya and Bali Island, Indonesia, June 21-25, 2011.

In many practical situations, we are interested in the dependencies that do not change with time, i.e., that do not change when we change the origin of the time axis. The corresponding translation-invariant transformations are easy to describe: they correspond to convolutions, or, equivalently, to fuzzy transforms.

It turns out that if we relax the invariance condition and require only that the transformation be translation-covariant (i.e., that it appropriately changes under translation), we get exactly two classes of transformations: Fourier transforms and fuzzy transforms. This result explains why both transforms have been successfully used in data processing.
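For readers unfamiliar with fuzzy transforms: in Perfilieva's sense, the direct F-transform replaces a function by weighted local averages over a fuzzy partition (e.g., overlapping triangular membership functions), and the inverse transform gives a smooth reconstruction. A minimal numeric sketch of this standard construction (not tied to this report's derivation):

```python
import numpy as np

def triangular_basis(nodes, x):
    """Triangular membership functions centered at equally spaced
    `nodes`, forming a uniform fuzzy partition of the grid x
    (the memberships sum to 1 everywhere between the end nodes)."""
    h = nodes[1] - nodes[0]
    A = np.zeros((len(nodes), len(x)))
    for k, c in enumerate(nodes):
        A[k] = np.clip(1 - np.abs(x - c) / h, 0, None)
    return A

def f_transform(f_vals, A):
    """Direct F-transform: one weighted local average per basis
    function."""
    return (A @ f_vals) / A.sum(axis=1)

x = np.linspace(0, 1, 101)
nodes = np.linspace(0, 1, 6)
A = triangular_basis(nodes, x)
F = f_transform(x**2, A)   # components: local averages of f(x) = x^2
f_hat = F @ A              # inverse F-transform: smooth reconstruction
```

Away from the boundary, the reconstruction f_hat closely tracks the original function; the components F act as a low-pass, convolution-like summary, which is the sense in which fuzzy transforms appear alongside convolutions above.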

Technical Report UTEP-CS-11-07, January 2011
Measures of Deviation (and Dependence) for Heavy-Tailed Distributions and Their Estimation under Interval and Fuzzy Uncertainty

Published in: Ronald R. Yager, Marek Z. Reformat, Shahnaz N. Shahbazova, and Sergei Ovchinnikov (eds.), Proceedings of the World Conference on Soft Computing, San Francisco, CA, May 23-26, 2011.

Traditionally, in science and engineering, most statistical techniques are based on the assumption that the random variables are normally distributed. For such distributions, a natural characteristic of the "average" value is the mean, and a natural characteristic of the deviation from the average is the variance. However, in many practical situations, e.g., in economics and finance, we encounter probability distributions for which the variance is infinite; such distributions are called heavy-tailed. For such distributions, we describe which characteristics can be used to describe the average and the deviation from the average, and how to estimate these characteristics under interval and fuzzy uncertainty. We also discuss reasonable analogues of correlation for such heavy-tailed distributions.

Technical Report UTEP-CS-11-06, January 2011
PWiseGen: Generating Test Cases for Pairwise Testing Using Genetic Algorithms
Pedro Flores and Yoonsik Cheon

Pairwise testing is a combinatorial testing technique that tests all possible pairs of input values. Although finding a smallest set of test cases for pairwise testing is NP-complete, pairwise testing is regarded as a reasonable cost-benefit compromise among combinatorial testing methods. In this paper we formulate the problem of finding a pairwise test set as a search problem and apply a genetic algorithm to solve it. We also describe an open-source tool called PWiseGen for generating pairwise test sets. According to our experiments, PWiseGen produces competitive results compared with existing pairwise testing tools. In addition, it provides a framework and a research platform for generating pairwise test sets using genetic algorithms; it is configurable, extensible, and reusable.
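The search formulation can be sketched as follows: a candidate is a fixed-size test suite, and its fitness is the number of distinct value pairs it covers. The toy genetic algorithm below is only an illustration of that formulation (our own names and parameter choices, not PWiseGen itself).

```python
import random
from itertools import combinations

def pairs_covered(suite, k):
    """All (parameter pair, value pair) combinations hit by a suite."""
    covered = set()
    for test in suite:
        for i, j in combinations(range(k), 2):
            covered.add((i, j, test[i], test[j]))
    return covered

def pairwise_ga(domains, n_tests, generations=300, pop_size=30, seed=0):
    """Tiny GA for pairwise coverage: evolve fixed-size test suites,
    keeping the fitter half each generation (elitism) and producing
    children by one-point crossover plus a single-value mutation."""
    rng = random.Random(seed)
    k = len(domains)
    target = sum(len(a) * len(b) for a, b in combinations(domains, 2))

    def random_suite():
        return [tuple(rng.choice(d) for d in domains) for _ in range(n_tests)]

    def fitness(suite):
        return len(pairs_covered(suite, k))

    pop = [random_suite() for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        if fitness(pop[0]) == target:
            break  # every pair is covered
        parents = pop[:pop_size // 2]
        children = []
        for _ in range(pop_size - len(parents)):
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(n_tests)
            child = a[:cut] + b[cut:]           # one-point crossover
            t, p = rng.randrange(n_tests), rng.randrange(k)
            row = list(child[t])
            row[p] = rng.choice(domains[p])     # mutate one value
            child[t] = tuple(row)
            children.append(child)
        pop = parents + children
    best = max(pop, key=fitness)
    return best, fitness(best), target
```

For three binary parameters, four tests suffice to cover all 12 value pairs (e.g., 000, 011, 101, 110), so the GA searches for a 4-test suite with fitness 12.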

Technical Report UTEP-CS-11-05, January 2011
Towards Optimal Few-Parametric Representation of Spatial Variation: Geometric Approach and Environmental Applications
Misha Koshelev, Octavio Lerma, and Craig Tweedie

Published in Geombinatorics, 2011, Vol. 21, No. 1, pp. 15-24.

In this paper, we use a geometric approach to show that under reasonable assumptions, the spatial variability of a field f(x), i.e., the expected value F(z) = E[(f(x+z)-f(x))^2], has the form F(z) = |Σ g_ij z_i z_j|^α. We explain how to find the coefficients g_ij and the exponent α from observations, and how to optimally place sensors in view of this spatial variability.
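In one dimension, the form above reduces to F(z) proportional to |z|^(2α), so a generic way to estimate α from data (a simple log-log fit, not necessarily the paper's procedure) is to compute the empirical structure function at several lags and regress log F(z) on log |z|. As a check, for a Brownian-motion-like record F(z) grows linearly in |z|, i.e., α = 0.5:

```python
import numpy as np

rng = np.random.default_rng(0)
# a rough 1-D stand-in for a spatial field: cumulative sum of white
# noise, for which F(z) = E[(f(x+z)-f(x))^2] grows linearly in |z|
f = np.cumsum(rng.standard_normal(100_000))

# empirical structure function at a range of lags
lags = np.arange(1, 30)
F = np.array([np.mean((f[z:] - f[:-z]) ** 2) for z in lags])

# slope of log F against log z estimates 2 * alpha
slope, _ = np.polyfit(np.log(lags), np.log(F), 1)
alpha = slope / 2
```

With this synthetic record the fitted alpha comes out close to 0.5, as expected for a Brownian-type field.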

Technical Report UTEP-CS-11-04, January 2011
Updated version UTEP-CS-11-04a, March 2011
Modified Fourier-Motzkin Elimination Algorithm for Reducing Systems of Linear Inequalities with Unconstrained Parameters
Mario Bencomo, Luis Gutierrez, and Martine Ceberio

The need for eliminating redundancies in systems of linear inequalities arises in many applications. In linear programming (LP), one seeks a solution that optimizes a given linear objective function subject to a set of linear constraints, sometimes posed as linear inequalities. Linear inequalities also arise in the context of tensor decomposition. Due to the lack of uniqueness in higher-order tensor decomposition, non-negativity constraints are imposed on the decomposition factors, yielding systems of linear inequalities. Eliminating redundancies in such systems can reduce the number of computations, and hence improve computation times in applications.

Current techniques for eliminating redundant inequalities are not viable in higher dimensions. As an alternative we propose a modified version of the Fourier-Motzkin Elimination Algorithm (ModFMEA), implemented in MATLAB, to reduce redundancies in a given system of linear constraints over the reals posed as linear inequalities. Reduction is obtained, at each orthant containing the solution set, by taking the lower and upper bounds of the x_i-normalized inequalities x_i >= l and u >= x_i, respectively, where l and u are linear terms with no occurrence of x_i, for i = 1, 2, ..., N. The reduced system over the whole solution set can be obtained by taking the union of the reduced systems at each orthant.

This method eliminates redundancies by retaining a system of linear inequalities that defines the set of feasible solutions. It works under the assumption that all of the variables are unconstrained, i.e., variables may take on both negative and positive values. Experimental results demonstrate reduction of systems in higher dimensions for both bounded and unbounded solution sets with feasible computational times, and provide important insight into the method's workings, allowing us to design an extension of ModFMEA (ExModFMEA) that yields even more reduction.
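For reference, the classic (unmodified) Fourier-Motzkin step that ModFMEA builds on eliminates one variable by combining every inequality with a positive coefficient on that variable with every inequality with a negative one -- the pairing responsible for the combinatorial blow-up that motivates modifications. A minimal sketch of that classic step:

```python
def fm_eliminate(rows, j):
    """One classic Fourier-Motzkin step: eliminate variable j from a
    system A x <= b, where each row is a tuple (a_1, ..., a_n, b).
    Output size can grow as |pos| * |neg|, which is why unmodified
    FME does not scale to higher dimensions."""
    pos = [r for r in rows if r[j] > 0]
    neg = [r for r in rows if r[j] < 0]
    zero = [r for r in rows if r[j] == 0]
    out = list(zero)
    for p in pos:
        for q in neg:
            # nonnegative multipliers chosen so x_j cancels
            lam, mu = -q[j], p[j]
            out.append(tuple(lam * a + mu * b for a, b in zip(p, q)))
    return out
```

For example, eliminating x from the system {x + y <= 4, -x <= 0} leaves the single projected constraint y <= 4.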

Original file in pdf
updated version in pdf

Technical Report UTEP-CS-11-03, January 2011
Updated version UTEP-CS-11-03a, March 2011
Optimizing Trajectories for Unmanned Aerial Vehicles (UAVs) Patrolling the Border
Chris Kiekintveld, Vladik Kreinovich, and Octavio Lerma

Published in: Ronald R. Yager, Marek Z. Reformat, Shahnaz N. Shahbazova, and Sergei Ovchinnikov (eds.), Proceedings of the World Conference on Soft Computing, San Francisco, CA, May 23-26, 2011.

At first glance, most aspects of border protection activity look like classical examples of zero-sum games, in which the interests of the two sides are exactly opposite. This is how such situations are planned now: this is how border patrol agents are assigned to different segments of the border, this is how routes of coast guard ships are planned, etc. However, there is a big difference between such situations and the traditional zero-sum games: in the traditional zero-sum games, it is assumed that we know the exact objective function of each participant; in contrast, in border protection planning (e.g., in counter-terrorism planning), the adversary's objective function is rarely known in precise terms; at best, we have a description of this objective function in terms of words from natural language. In this paper, using the example of a UAV patrolling the border, we show how fuzzy techniques can help in planning border protection strategies under such uncertainty.

Original file in pdf
updated version in pdf

Technical Report UTEP-CS-11-02, January 2011
Designing, understanding, and analyzing unconventional computation: the important role of logic and constructive mathematics

Published in Applied Mathematical Sciences, 2012, Vol. 6, No. 13, pp. 629-644.

In this paper, we explain why, in our opinion, logic and constructive mathematics are playing -- and should play -- an important role in the design, understanding, and analysis of unconventional computation.

Technical Report UTEP-CS-11-01, January 2011
Revised version UTEP-CS-11-01a, August 2011
Computation in quantum space-time could lead to a super-polynomial speedup