Computer Science Department

Abstracts of 2018 Reports

Technical Report UTEP-CS-18-26, March 2018

Working on One Part at a Time is the Best Strategy for Software Production: A Proof

Francisco Zapata, Malileh Zargaran, and Vladik Kreinovich

When a company works on a large software project, it can often start recouping its investments by selling intermediate products with partial functionality. With this possibility in mind, it is important to schedule work on the different software parts so as to maximize the profit. There exist several algorithms for solving the corresponding optimization problem, and in all the resulting plans, at each moment of time, we work on only one part of the software. In this paper, we prove that this one-part-at-a-time property holds for all optimal plans.

Technical Report UTEP-CS-18-25, March 2018

Towards Foundations of Fuzzy Utility: Taking Fuzziness into Account Naturally Leads to Intuitionistic Fuzzy Degrees

Christian Servin and Vladik Kreinovich

The traditional utility-based decision making theory assumes that for every two alternatives, the user is either absolutely sure that the first alternative is better, or that the second alternative is better, or that the two alternatives are absolutely equivalent. In practice, when faced with alternatives of similar value, people are often not fully sure which of these alternatives is better. To describe different possible degrees of confidence, it is reasonable to use fuzzy logic techniques. In this paper, we show that, somewhat surprisingly, a reasonable fuzzy modification of the traditional utility elicitation procedure naturally leads to intuitionistic fuzzy degrees.

Technical Report UTEP-CS-18-24, March 2018

How to Gauge Repair Risk?

Francisco Zapata and Vladik Kreinovich

At present, there exist several automatic tools that, given a piece of software, find locations of possible defects. A general tool does not take into account the specifics of a given program. As a result, while many defects discovered by such a tool can be truly harmful, many of the uncovered alleged defects are, for this particular software, reasonably (or even fully) harmless. A natural reaction is to repair all the alleged defects, but the problem is that every time we correct a program, we risk introducing new faults. From this viewpoint, it is desirable to be able to gauge the repair risk. This will help us decide which part of the repaired code is most likely to fail and thus needs the most testing, and even whether repairing a probably harmless defect is worth the effort at all -- if, as a result, we increase the probability of a program malfunction. In this paper, we analyze how repair risk can be gauged.

Technical Report UTEP-CS-18-23, March 2018

How Intelligence Community Interprets Imprecise (Fuzzy) Words, and How to Justify This Empirical-Based Interpretation

Olga Kosheleva and Vladik Kreinovich

To provide a more precise meaning to imprecise (fuzzy) words like "probable" or "almost certain", researchers analyzed how often intelligence predictions hedged by each corresponding word turned out to be true. In this paper, we provide a theoretical explanation for the resulting empirical frequencies.

Technical Report UTEP-CS-18-22, March 2018

How to Explain Empirical Distribution of Software Defects by Severity

Francisco Zapata, Olga Kosheleva, and Vladik Kreinovich

In the last decades, several tools have appeared that, given a software package, mark possible defects of different potential severity. Our empirical analysis has shown that in most situations, we observe the same distribution of software defects by severity. In this paper, we present this empirical distribution, and we use interval-related ideas to provide an explanation for it.

Technical Report UTEP-CS-18-21, March 2018

How to Explain the Results of the Richard Thaler's 1997 Financial Times Contest

Olga Kosheleva and Vladik Kreinovich

In 1997, in a letter published in the Financial Times, Richard H. Thaler, the 2017 Nobel Prize winner in Economics, performed the following experiment: he asked readers to submit numbers from 0 to 100, so that the person whose number is closest to 2/3 of the average would be the winner. An intuitive answer is to submit 2/3 of the expected average of 50, i.e., 33 1/3. A logical answer, as can be explained, is to submit 0. The actual winning submission was -- depending on how we count -- 12 or 13. In this paper, we propose a possible explanation for this empirical result.
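
One standard way to look at such results is "level-k" reasoning: a level-0 player submits 50, and each further level of reasoning multiplies the previous guess by 2/3. The sketch below (our illustration, not necessarily the paper's explanation) shows that the winning value of 12-13 falls between levels 3 and 4:

```python
# "Level-k" reasoning for Thaler's guess-2/3-of-the-average game
# (illustrative sketch; the paper's actual explanation may differ).
def level_k_guess(k, start=50.0):
    """Guess of a player who applies the 2/3-of-the-average step k times."""
    guess = start
    for _ in range(k):
        guess *= 2.0 / 3.0
    return guess

for k in range(6):
    print(k, round(level_k_guess(k), 2))
# levels 0..5: 50.0, 33.33, 22.22, 14.81, 9.88, 6.58
```

The winning 12-13 lies between level_k_guess(3) ≈ 14.81 and level_k_guess(4) ≈ 9.88, i.e., between three and four rounds of iterated reasoning.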

Technical Report UTEP-CS-18-20, March 2018

Why Superforecasters Change Their Estimates on Average by 3.5%: A Possible Theoretical Explanation

Olga Kosheleva and Vladik Kreinovich

A recent large-scale study of people's forecasting ability has
shown that there is a small group of *superforecasters*, whose
forecasts are significantly more accurate than the forecasts of an
average person. Since forecasting is important in many application
areas, researchers have studied what exactly the superforecasters
do differently -- and how we can learn from them, so that we will
be able to forecast better. One empirical fact that came from this
study is that, in contrast to most people, superforecasters make
much smaller adjustments to their probability estimates: on
average, their probability change is 3.5%. In this paper,
we provide a possible theoretical explanation for this empirical
value.

Technical Report UTEP-CS-18-19, March 2018

Virtual Agent Interaction Framework (VAIF): A Tool for Rapid Development of Social Agents

Ivan Gris and David Novick

Creating an embodied virtual agent is often a complex process. It involves 3D modeling and animation skills, advanced programming knowledge, and in some cases artificial intelligence or the integration of complex interaction models. Features like lip-syncing to an audio file, recognizing the users' speech, or having the character move at certain times in certain ways are inaccessible to researchers who want to build and use these agents for education, research, or industrial uses. VAIF, the Virtual Agent Interaction Framework, is an extensively documented system that attempts to bridge that gap and provide inexperienced researchers the tools and means to develop their own agents in a centralized, lightweight platform that provides all these features through a simple interface within the Unity game engine. In this paper we present the platform, describe its features, and provide a case study where agents were developed and deployed on mobile-device, virtual-reality, and augmented-reality platforms by users with no coding experience.

Technical Report UTEP-CS-18-18, February 2018

Reverse Mathematics Is Computable for Interval Computations

Martine Ceberio, Olga Kosheleva, and Vladik Kreinovich

For systems of equations and/or inequalities under interval uncertainty, interval computations usually provide us with a box all of whose points satisfy this system. Reverse mathematics means finding necessary and sufficient conditions, i.e., in this case, describing the set of *all* the points that satisfy the given system. In this paper, we show that while we cannot always exactly describe this set, it is possible to have a general algorithm that, given ε > 0, provides an ε-approximation to the desired solution set.
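
As a minimal illustration of what such an ε-approximation can look like (our brute-force sketch; the paper's algorithm may be quite different), one can enumerate a grid of step ε inside the box and keep the points that satisfy every constraint:

```python
import itertools

def eps_approx_solution_set(constraints, box, eps):
    """Grid-based epsilon-approximation of the solution set
    {x in box : g(x) <= 0 for every g in constraints}.
    box is a list of (lo, hi) intervals; for simplicity, this sketch
    assumes eps exactly divides each side of the box."""
    axes = [[lo + i * eps for i in range(int((hi - lo) / eps) + 1)]
            for lo, hi in box]
    return [x for x in itertools.product(*axes)
            if all(g(x) <= 0 for g in constraints)]

# Toy example: the unit disk x^2 + y^2 <= 1 inside the box [-1,1] x [-1,1]
pts = eps_approx_solution_set(
    [lambda x: x[0] ** 2 + x[1] ** 2 - 1], [(-1, 1), (-1, 1)], 0.5)
```

With ε = 0.5, this keeps the 13 grid points that lie in the disk; shrinking ε refines the approximation at the usual exponential cost in the dimension.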

Technical Report UTEP-CS-18-17, February 2018

Italian Folk Multiplication Algorithm Is Indeed Better: It Is More Parallelizable

Martine Ceberio, Olga Kosheleva, and Vladik Kreinovich

Traditionally, many ethnic groups had their own versions of arithmetic algorithms. Nowadays, these algorithms are studied mostly as pedagogical curiosities, as an interesting way to make arithmetic more exciting to the kids: by appealing to their patriotic feelings -- if they are studying the algorithms traditionally used by their ethnic group -- or simply to their sense of curiosity. Somewhat surprisingly, we show that one of these algorithms -- a traditional Italian multiplication algorithm -- is actually, in some reasonable sense, better than the algorithm that we all normally use -- namely, it is easier to parallelize.
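
To illustrate the parallelizability point (our sketch, in the spirit of the lattice-style "gelosia" scheme; the paper's presentation may differ): all the one-digit products are independent of each other, so they can all be computed in parallel, and carries appear only in the final combining step:

```python
def lattice_multiply(a, b):
    """Lattice-style multiplication of two nonnegative integers."""
    da = [int(c) for c in str(a)]
    db = [int(c) for c in str(b)]
    n, m = len(da), len(db)
    # Step 1 (embarrassingly parallel): all n*m one-digit products,
    # each computed independently of the others.
    prods = [[x * y for y in db] for x in da]
    # Step 2 (sequential): sum each anti-diagonal with its place value;
    # integer arithmetic handles the carries.
    sums = [0] * (n + m - 1)
    for i in range(n):
        for j in range(m):
            sums[i + j] += prods[i][j]
    return sum(s * 10 ** (n + m - 2 - k) for k, s in enumerate(sums))
```

In the standard schoolbook algorithm, carries propagate through every partial product; here the only sequential part is the final diagonal summation.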

Technical Report UTEP-CS-18-16, February 2018

From Traditional Neural Networks to Deep Learning: Towards Mathematical Foundations of Empirical Successes

Vladik Kreinovich

How do we make computers think? To make machines that fly, it is reasonable to look at the creatures that know how to fly: the birds. To make computers think, it is reasonable to analyze how we think -- this is the main origin of neural networks. At first, one of the main motivations was speed -- since even with slow biological neurons, we often process information fast. The need for speed motivated traditional 3-layer neural networks. At present, computer speed is rarely a problem, but accuracy is -- this motivated deep learning. In this paper, we concentrate on the need to provide mathematical foundations for the empirical success of deep learning.

Technical Report UTEP-CS-18-15, February 2018

How to Monitor Possible Side Effects of Enhanced Oil Recovery Process

Jose Manuel Dominguez Esquivel, Solymar Ayala Cortez, Aaron Velasco, and Vladik Kreinovich

To extract all the oil from a well, petroleum engineers pump hot reactive chemicals into the well. This enhanced oil recovery process needs to be thoroughly monitored, since the aggressively hot liquid can seep out and, if unchecked, eventually pollute the sources of drinking water. At present, to monitor this process, engineers measure the seismic waves generated when the liquid fractures the minerals. However, the resulting seismic waves are weak in comparison with the background noise. Thus, the accuracy with which we can locate the spreading liquid based on these weak signals is low, and we get only a very crude approximate understanding of how the liquid propagates. To get a more accurate picture of the liquid propagation, we propose to use active seismic analysis: namely, we propose to generate strong seismic waves and use a large-N array of sensors to observe their propagation.

Technical Report UTEP-CS-18-14, February 2018

Optimization of Quadratic Forms and t-norm Forms on Interval Domain and Computational Complexity

Milan Hladik, Michal Cerny, and Vladik Kreinovich

To appear in *Proceedings of the World Conference on Soft
Computing*, Baku, Azerbaijan, May 29-31, 2018.

We consider the problem of maximization of a
quadratic form over a box. We identify the NP-hardness boundary
for sparse quadratic forms: the problem is polynomially
solvable for O(log n) nonzero entries, but it is NP-hard if the
number of nonzero entries is of the order n^{ε} for
an arbitrarily
small ε > 0. Then we inspect further polynomially solvable
cases. We define a sunflower graph over the quadratic form
and study efficiently solvable cases according to the shape of
this graph (e.g. the case with small sunflower leaves or the
case with a restricted number of negative entries). Finally, we
define a generalized quadratic form, called t-norm form, where
the quadratic terms are replaced by t-norms. We prove that
the optimization problem remains NP-hard with an arbitrary
Lipschitz continuous t-norm.

Technical Report UTEP-CS-18-13, February 2018

Which t-Norm Is Most Appropriate for Bellman-Zadeh Optimization

Vladik Kreinovich, Olga Kosheleva, and Shahnaz Shahbazova

To appear in *Proceedings of the World Conference on Soft
Computing*, Baku, Azerbaijan, May 29-31, 2018.

In 1970, Richard Bellman and Lotfi Zadeh proposed a method for finding the maximum of a function under fuzzy constraints. The problem with this method is that it requires the knowledge of the minimum and the maximum of the objective function over the corresponding crisp set, and minor changes in this crisp set can lead to a drastic change in the resulting maximum. It is known that if we use the product "and"-operation (t-norm), the dependence on the maximum disappears. Natural questions are: what if we use other t-norms? Can we eliminate the dependence on the minimum? What if we use a different scaling in our derivation of the Bellman-Zadeh formula? In this paper, we provide answers to all these questions. It turns out that the product is the only t-norm for which there is no dependence on the maximum, that it is impossible to eliminate the dependence on the minimum, and we also provide t-norms corresponding to the use of general scaling functions.
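
The independence from the maximum under the product t-norm is easy to check numerically. In the sketch below (our illustration, with a made-up constraint and objective), the product-based compromise point stays put when we change the assumed maximum M, while the min-based one moves:

```python
# Bellman-Zadeh compromise: maximize tnorm(mu_C(x), (f(x) - m) / (M - m)).
# The constraint mu_c and objective f below are toy examples of ours.
def bz_argmax(xs, mu_c, f, tnorm, m, M):
    return max(xs, key=lambda x: tnorm(mu_c(x), (f(x) - m) / (M - m)))

xs = [i / 100 for i in range(101)]   # candidate points in [0, 1]
mu_c = lambda x: 1 - x               # fuzzy constraint "x is small"
f = lambda x: x                      # objective: the larger, the better

prod = lambda a, b: a * b
# With the product t-norm, changing M does not move the compromise point:
assert bz_argmax(xs, mu_c, f, prod, m=0.0, M=1.0) == \
       bz_argmax(xs, mu_c, f, prod, m=0.0, M=5.0)
# With the min t-norm, it does:
assert bz_argmax(xs, mu_c, f, min, m=0.0, M=1.0) != \
       bz_argmax(xs, mu_c, f, min, m=0.0, M=5.0)
```

Under the product, M enters only through the constant factor 1/(M − m), which cannot change the location of the maximum; under min, it can and does.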

Technical Report UTEP-CS-18-12, February 2018

When Is Data Processing Under Interval and Fuzzy Uncertainty Feasible: What If Few Inputs Interact? Does Feasibility Depend on How We Describe Interaction?

Milan Hladik, Michal Cerny, and Vladik Kreinovich

It is known that, in general, data processing under interval and
fuzzy uncertainty is NP-hard -- which means that, unless P = NP,
no feasible algorithm is possible for computing the accuracy of
the result of data processing. It is also known that the
corresponding problem becomes feasible if the inputs do not
interact with each other, i.e., if the data processing algorithm
computes the sum of n functions, each depending on only one of
the n inputs. In general, inputs x_{i} and
x_{j} interact. If we
take into account all possible interactions, and we use bilinear
functions x_{i} *
x_{j} to describe this interaction, we get an
NP-hard problem. This raises two natural questions: what if only a
few inputs interact? What if the interaction is described by some
other functions? In this paper, we show that the problem remains
NP-hard if we use different formulas to describe the inputs'
interaction, and it becomes feasible if we only have O(log(n))
interacting inputs -- but remains NP-hard if the number of
interacting inputs is O(n^{ε}) for any ε > 0.

Technical Report UTEP-CS-18-11, February 2018

Why Skew Normal: A Simple Pedagogical Explanation

Jose Guadalupe Flores Muniz, Vyacheslav V. Kalashnikov, Nataliya Kalashnykova, Olga Kosheleva, and Vladik Kreinovich

In many practical situations, we only know the first few moments of a random variable, and out of all probability distributions which are consistent with this information, we need to select one. When we know the first two moments, we can use the Maximum Entropy approach and get the normal distribution. However, when we know the first three moments, the Maximum Entropy approach does not work. In such situations, a very efficient selection is the so-called skew normal distribution. However, it is not clear why this particular distribution should be selected. In this paper, we provide an explanation for this selection.
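
For reference, the standard (location-0, scale-1) skew normal density has the known closed form 2φ(x)Φ(αx), where φ and Φ are the standard normal density and distribution function; a minimal sketch (the paper may work with the general location-scale version):

```python
import math

def skew_normal_pdf(x, alpha):
    """Standard skew-normal density 2 * phi(x) * Phi(alpha * x);
    alpha = 0 recovers the ordinary standard normal density."""
    phi = math.exp(-x * x / 2) / math.sqrt(2 * math.pi)
    Phi = 0.5 * (1 + math.erf(alpha * x / math.sqrt(2)))
    return 2 * phi * Phi
```

The single extra parameter α controls the skewness, which is what lets this family match a prescribed third moment while staying close to the normal shape.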

Technical Report UTEP-CS-18-10, February 2018

A New Kalman Filter Model for Nonlinear Systems Based on Ellipsoidal Bounding

Ligang Sun, Hamza Alkhatib, Boris Kargoll, Vladik Kreinovich, and Ingo Neumann

In this paper, a new filter model called the set-membership Kalman filter was designed for nonlinear state estimation problems, where both random and unknown-but-bounded uncertainties are considered simultaneously in the discrete-time system. The main loop of this algorithm includes one prediction step and one correction step with measurement information, and the key part of each loop is to solve an optimization problem. The solution of the optimization problem produces the optimal estimate of the state, which is bounded by ellipsoids. The new filter was applied to a highly nonlinear benchmark example and to a two-dimensional simulated trajectory estimation problem, where it performed better than the extended Kalman filter. The sensitivity of the algorithm is discussed at the end.

Technical Report UTEP-CS-18-09, February 2018

Why 70/30 or 80/20 Relation Between Training and Testing Sets: A Pedagogical Explanation

Afshin Gholamy, Vladik Kreinovich, and Olga Kosheleva

When learning a dependence from data, to avoid overfitting, it is important to divide the data into the training set and the testing set. We first train our model on the training set, and then we use the data from the testing set to gauge the accuracy of the resulting model. Empirical studies show that the best results are obtained if we use 20-30% of the data for testing, and the remaining 70-80% of the data for training. In this paper, we provide a possible explanation for this empirical result.
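
For readers less familiar with the setup, here is a minimal sketch of such a split (our illustration; any standard library routine works equally well), using the 80/20 ratio that the paper explains:

```python
import random

def train_test_split(data, test_fraction=0.2, seed=0):
    """Shuffle the data, then hold out test_fraction of it for testing."""
    items = list(data)
    random.Random(seed).shuffle(items)   # avoid ordering bias
    n_test = int(len(items) * test_fraction)
    return items[n_test:], items[:n_test]   # (training set, testing set)

train, test = train_test_split(range(100))
# With 100 points: 80 go to training, 20 to testing.
```

The model is then fit on `train` only, and its accuracy is gauged on the held-out `test` points it has never seen.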

Technical Report UTEP-CS-18-08, February 2018

Why Learning Has Aha-Moments and Why We Should Also Reward Effort, Not Just Results

Gerardo Uranga, Vladik Kreinovich, and Olga Kosheleva

Traditionally, in machine learning, the quality of the result improves steadily with time (usually slowly, but still steadily). However, as we start applying reinforcement learning techniques to solve complex tasks -- such as teaching a computer to play a complex game like Go -- we often encounter a situation in which for a long time, there is no improvement, and then suddenly, the system's efficiency jumps almost to its maximum. A similar phenomenon occurs in human learning, where it is known as the aha-moment. In this paper, we provide a possible explanation for this phenomenon, and show that this explanation leads to the need to reward students for effort as well, not only for their results.

Technical Report UTEP-CS-18-07, February 2018

Why Burgers Equation: Symmetry-Based Approach

Leobardo Valera, Martine Ceberio, and Vladik Kreinovich

In many application areas, ranging from shock waves to acoustics, we encounter the same partial differential equation, known as the Burgers equation. The fact that the same equation appears in different application domains, with different physics, makes us conjecture that it can be derived from fundamental principles. Indeed, in this paper, we show that this equation can be uniquely determined by the corresponding symmetries.

Technical Report UTEP-CS-18-06, February 2018

Lotfi Zadeh: a Pioneer in AI, a Pioneer in Statistical Analysis, a Pioneer in Foundations of Mathematics, and a True Citizen of the World

Vladik Kreinovich

Everyone knows Lotfi Zadeh as the Father of Fuzzy Logic. There have been -- and will be -- many papers on this important topic. What I want to emphasize in this paper is that his ideas go way beyond fuzzy logic:

- he was a pioneer in AI;
- he was a pioneer in statistical analysis; and
- he was a pioneer in foundations of mathematics.

Technical Report UTEP-CS-18-05, January 2018

Type-2 Fuzzy Analysis Explains Ubiquity of Triangular and Trapezoid Membership Functions

Olga Kosheleva, Vladik Kreinovich, and Shahnaz Shahbazova

To appear in *Proceedings of the World Conference on Soft
Computing*, Baku, Azerbaijan, May 29-31, 2018.

In principle, we can have many different membership functions. Interestingly, however, in many practical applications, triangular and trapezoidal membership functions turn out to be the most efficient ones. In this paper, we use a fuzzy approach to explain this empirical phenomenon.
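
For reference, the two families in question have simple piecewise-linear forms (standard textbook definitions; the parameter names are ours):

```python
def triangular(x, a, b, c):
    """Triangular membership function: 0 at a, peak 1 at b, 0 at c."""
    return max(0.0, min((x - a) / (b - a), (c - x) / (c - b)))

def trapezoidal(x, a, b, c, d):
    """Trapezoidal membership function: rises on [a,b], flat 1 on [b,c],
    falls on [c,d]."""
    return max(0.0, min((x - a) / (b - a), 1.0, (d - x) / (d - c)))
```

A triangular function is the special case of a trapezoidal one with b = c.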

Technical Report UTEP-CS-18-04, January 2018

How Many Monte-Carlo Simulations Are Needed to Adequately Process Interval Uncertainty: An Explanation of the Smart Electric Grid-Related Simulation Results

Afshin Gholamy and Vladik Kreinovich

Published in *Journal of Innovative Technology and Education*,
2018, Vol. 5, No. 1, pp. 1-5.

One of the possible ways of dealing with interval uncertainty is to use Monte-Carlo simulations. A recent study of using this technique for the analysis of different smart electric grid-related algorithms shows that we need approximately 500 simulations to compute the corresponding interval range with 5% accuracy. In this paper, we provide a theoretical explanation for these empirical results.
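
A naive version of this technique (our sketch; the smart-grid study may use a more sophisticated sampling scheme) samples each interval input uniformly and takes the observed minimum and maximum as the range estimate:

```python
import random

def mc_range(f, boxes, n=500, seed=0):
    """Estimate the range of f over interval inputs by uniform sampling."""
    rng = random.Random(seed)
    vals = [f([rng.uniform(lo, hi) for lo, hi in boxes]) for _ in range(n)]
    return min(vals), max(vals)

# Toy example: f(x, y) = x + y over [0,1] x [0,1]; the true range is [0, 2].
lo, hi = mc_range(lambda x: x[0] + x[1], [(0, 1), (0, 1)])
# With n = 500, the estimate lands close to, and slightly inside, [0, 2].
```

Since the Monte-Carlo error typically decreases as 1/sqrt(N), N on the order of several hundred samples is what a few-percent accuracy requirement suggests, consistent with the ~500 simulations mentioned above.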

Technical Report UTEP-CS-18-03, January 2018

Measures of Specificity Used in the Principle of Justifiable Granularity: A Theoretical Explanation of Empirically Optimal Selections

Olga Kosheleva and Vladik Kreinovich

To process huge amounts of data, one possibility is to combine
some data points into granules, and then process the resulting
granules. For each group of data points, if we try to include all
data points into a granule, the resulting granule often becomes
too wide and thus rather useless; on the other hand, if the
granule is too narrow, it includes only a few of the corresponding
points -- and is, thus, also rather useless. The need for a
trade-off between coverage and specificity is formalized as the
*principle of justifiable granularity*. The specific form of
this principle depends on the selection of a measure of
specificity. Empirical analysis has shown that exponential and
power law measures of specificity are the most adequate. In this
paper, we show that natural symmetries explain this empirically
observed efficiency.

Technical Report UTEP-CS-18-02, January 2018

How to Efficiently Compute Ranges Over a Difference Between Boxes, With Applications to Underwater Localization

Luc Jaulin, Martine Ceberio, Olga Kosheleva, and Vladik Kreinovich

When using underwater autonomous vehicles, it is important to
localize them. Underwater localization is very approximate. As a
result, instead of a single location x, we get a set X of
possible locations of a vehicle. Based on this set of possible
locations, we need to find the range of possible values of the
corresponding objective function f(x). For missions on the ocean
floor, it is beneficial to take into account that the vehicle is
in the water, i.e., that the location of this vehicle is *not*
in the set X' describing the matter below the ocean floor. Thus,
the actual set of possible locations of the vehicle is the
difference set X − X'.
So, it is important to find the ranges of different functions over
such difference sets. In this paper, we propose an efficient
algorithm for solving this problem.
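
As a baseline against which such an algorithm can be compared (our brute-force sketch, not the paper's method): estimate the range by enumerating grid points of X and discarding those that fall inside X':

```python
import itertools

def in_box(p, box):
    return all(lo <= v <= hi for v, (lo, hi) in zip(p, box))

def range_over_difference(f, X, X_prime, step=0.25):
    """Brute-force range of f over the grid points of box X that do NOT
    lie in box X'.  Assumes step exactly divides each side of X."""
    axes = [[lo + i * step for i in range(int((hi - lo) / step) + 1)]
            for lo, hi in X]
    vals = [f(p) for p in itertools.product(*axes)
            if not in_box(p, X_prime)]
    return min(vals), max(vals)

# f(x, y) = x + y over the box [0,2] x [0,2] minus the sub-box [1,2] x [1,2]
lo, hi = range_over_difference(lambda p: p[0] + p[1],
                               [(0, 2), (0, 2)], [(1, 2), (1, 2)])
# On this 0.25 grid the estimate is [0, 2.75]; the true supremum is 3,
# approached near the boundary of the excluded box, so a finer grid
# (or a dedicated algorithm) is needed for a tight bound.
```

This exponential-in-dimension enumeration is exactly what a dedicated range-over-difference algorithm is meant to avoid.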

Technical Report UTEP-CS-18-01, January 2018

How to Detect Crisp Sets Based on Subsethood Ordering of Normalized Fuzzy Sets? How to Detect Type-1 Sets Based on Subsethood Ordering of Normalized Interval-Valued Fuzzy Sets?

Christian Servin, Olga Kosheleva, and Vladik Kreinovich

If all we know about normalized fuzzy sets is which set is a subset of which, will we be able to detect crisp sets? It is known that we can do it if we allow all possible fuzzy sets, including non-normalized ones. In this paper, we show that a similar detection is possible if we only allow normalized fuzzy sets. We also show that we can detect type-1 fuzzy sets based on the subsethood ordering of normalized interval-valued fuzzy sets.