CS 5354/4365 Topics in Intelligent Computing/Topics in Soft Computing: Fall 2022

TOPICS IN INTELLIGENT COMPUTING/TOPICS IN SOFT COMPUTING:
COMPUTATIONAL INTELLIGENCE FOR ENGINEERING SOLUTIONS: INVARIANCE-BASED APPROACH
Syllabus for the course CS 5354/4365, Fall 2022

CLASS TIME: TR 12-1:20 pm

INSTRUCTOR: Vladik Kreinovich, email vladik@utep.edu, office CCSB 3.0404, office phone (915) 747-6951.

The instructor's office hours are Tuesdays and Thursdays 1:30-3 pm, or by appointment.
The preferred way to contact the instructor is by email to vladik@utep.edu.
If you want to meet during the scheduled office hours, there is no need to schedule an appointment.
If you are not available during the instructor's scheduled office hours, please schedule an appointment in the following way:
- use the instructor's appointments page https://www.cs.utep.edu/vladik/appointments.html to find the time when the instructor is not busy (i.e., when he has no other appointments), and
- send him an email, to vladik@utep.edu, indicating the day and time that you would like to meet.
He will then send a reply email, usually confirming that he is available at this time, and he will place the meeting with you on his schedule.
If the meeting is scheduled, but something happened and you cannot come, please let the instructor know about it as soon as possible.

PREREQUISITES:

for graduate students: no special pre-requisites; graduate level standing is sufficient;
for undergraduate students: ideally, Statistics and Linear Algebra, but this is not required; we will recall the needed material anyway.

CONTENTS

Introduction: formulation of the problems. The main purpose of computers is to process real-life data, so that we will be able:

to understand the current state of the system and
to predict its future behavior.

In some situations -- e.g., in basic mechanics -- we have fundamental from-first-principles laws that enable us to make the corresponding predictions. However, in many other situations, especially in engineering, we only have approximate empirical formulas. For example, it is not possible to predict, based on the first principles, how pavement will deteriorate with time, or how people will change their opinions about goods.

In such situations, we face the following problems:

Why these formulas? Users are usually reluctant to use purely empirical formulas, since there is no guarantee that these formulas will work in a new situation. It is therefore desirable to come up with theoretical explanations for these formulas.
Maybe these formulas are not the best. Within these theoretical explanations, are the current formulas most adequate? And if they are not the best, what are the better formulas?
What next? Empirical formulas are usually approximate. If we want a more accurate description, we need more complex, more detailed formulas. Of course, the ultimate test is comparing with the observations and measurement results. In view of the theoretical explanations, what are good candidates for such more complex formulas?

Similarly, in many engineering applications, there are semi-empirical methods for solving the corresponding problems. In such cases, similar problems appear:

Why these methods? Users are usually reluctant to use purely empirical methods, since there is no guarantee that these methods will work in a new situation. It is therefore desirable to come up with theoretical explanations for these methods.
Maybe these methods are not the best. Within these theoretical explanations, are the current methods the most adequate? And if not, what are the better methods?
What next? Empirical methods are usually imperfect. If we want better results, we need more complex, more detailed methods. Of course, the ultimate test is testing these methods on the real data. In view of the theoretical explanations, what are good candidates for such more complex methods?

First topic: how to solve these problems -- the idea of invariance. A natural way to make predictions is to look at what happened in similar situations in the past. And what does "similar" mean? It means that some important features of the current situation are the same (or at least almost the same) as the same important features of the past situation. In other words, there may have been some changes between the two situations, but the important features did not change.

In mathematical terms:

changes are called transformations,
if a feature does not change under a transformation, we say that this feature is invariant under this transformation, and
transformations under which some features are invariant are known as symmetries.

Not surprisingly, symmetries and invariances are, at present, among the most effective tools in theoretical physics and in many other disciplines. In line with this reasoning, in this class, we will study the invariance-based approach to solving the above problems.

Materials: Section 1 of Paper 6.

A simple example. How do we know that if we drop a pen, it will start falling down with the acceleration of 9.81 m/sec²? Because we -- and others -- repeated this experiment in many different locations, at many moments of time, and always observed the same result.

In this case, the current situation can be obtained from the previous one by shifts in space and time, maybe by rotation -- so the observed phenomenon is invariant with respect to all these transformations.

Second topic: what transformations we will study in this class. Some transformations studied in physics are very complicated, such as changing particles to corresponding anti-particles. In this class, we will focus on the simplest and most natural transformations.

These transformations come from the fact that:

while we want to process the actual values of different quantities like acceleration,
in practice, we deal with numerical values.

For most physical quantities, numerical values depend on the choice of the measuring unit. For example, if instead of meters, we use centimeters, then the same acceleration of 9.81 m/sec² becomes described by a different numerical value 981 cm/sec².

For some quantities like temperature or time, there is no fixed starting point, so numerical values also depend on what starting point we choose. For example, the difference between Fahrenheit and Celsius scales for measuring temperature is that these two scales use different measuring units and different starting points.
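
Both kinds of changes can be described by one linear transformation x -> a * x + b, where a reflects the new measuring unit and b reflects the new starting point. The following minimal Python sketch illustrates this on the two examples above; the helper name rescale is introduced here only for illustration.

```python
def rescale(x, a=1.0, b=0.0):
    """Re-express the numerical value x after changing the measuring unit
    (factor a) and/or the starting point (offset b): x -> a * x + b."""
    return a * x + b

# Meters to centimeters: only the measuring unit changes (a = 100, b = 0).
g_cm = rescale(9.81, a=100.0)

# Celsius to Fahrenheit: both the unit (a = 9/5) and the starting point (b = 32) change.
freezing_f = rescale(0.0, a=9.0 / 5.0, b=32.0)
boiling_f = rescale(100.0, a=9.0 / 5.0, b=32.0)

print(g_cm, freezing_f, boiling_f)
```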

Materials: Section 2 of Paper 6.

Third topic: corresponding invariances. In many situations, there is no fixed measuring unit, the choice of a measuring unit is simply a matter of convention. In this case, it is reasonable to assume that the desired empirical formula has the same form no matter what measuring unit we select.

Of course, each formula relates the values of several quantities. So, if we change the measuring unit for one of the quantities, we need to appropriately change the measuring unit for other quantities. As an example, let us consider the formula d = v * t that describes the traveled distance d as a function of velocity v and time t. This formula remains true whether we measure distance in kilometers or in miles. However, for this formula to remain true when we switch from miles to kilometers, we also need to change the units for measuring velocity from miles per hour to kilometers per hour.
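
The d = v * t example can be checked numerically. In the sketch below, the specific velocity and time values are arbitrary illustrative choices; the conversion factor 1.609344 km per mile is the standard international definition.

```python
KM_PER_MILE = 1.609344  # international mile, in kilometers

def distance(v, t):
    """The formula d = v * t; it has the same form in any consistent set of units."""
    return v * t

v_mph, t_hours = 60.0, 2.0          # illustrative values
d_miles = distance(v_mph, t_hours)  # distance in miles

# Switching the distance unit to kilometers forces the velocity unit
# to change from miles per hour to kilometers per hour.
v_kmh = v_mph * KM_PER_MILE
d_km = distance(v_kmh, t_hours)     # same formula, new units

# The result agrees with directly converting the distance in miles.
consistent = abs(d_km - d_miles * KM_PER_MILE) < 1e-9
print(d_miles, d_km, consistent)
```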

Similarly, in some formulas, if we change the starting point for measuring one of the quantities, we may need to appropriately re-scale other quantities.

Materials: Section 2 of Paper 6.

Fourth topic: which dependencies are invariant with respect to these transformations. This is what we will study first: which functions are invariant with respect to changing the measuring unit and changing the starting point. We will show that, in effect, the only invariant dependencies are:

the power law y = A * x^b,
the exponential dependence y = A * exp(b * x),
the logarithmic dependence y = A * log(x) + b, and
the linear dependence y = A * x + b.
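
One common way to read "invariant" here is: after we re-scale or shift x, the formula keeps its form once y is appropriately re-scaled or shifted. The sketch below checks this numerically for the four dependencies. The parameter values A, b, lam, and x0 are arbitrary illustrative choices, and the precise definitions used in class may differ in detail.

```python
import math

A, b = 2.0, 1.5        # hypothetical parameters of the dependence
lam, x0 = 3.0, 0.7     # hypothetical re-scaling factor and shift for x
xs = [0.5, 1.0, 2.0, 10.0]

def power_law(x):
    return A * x ** b

def exponential(x):
    return A * math.exp(b * x)

def logarithmic(x):
    return A * math.log(x) + b

def linear(x):
    return A * x + b

# Power law: re-scaling x by lam re-scales y by the constant factor lam**b.
power_ok = all(math.isclose(power_law(lam * x), lam ** b * power_law(x)) for x in xs)

# Exponential: shifting x by x0 re-scales y by the constant factor exp(b * x0).
exp_ok = all(math.isclose(exponential(x + x0), math.exp(b * x0) * exponential(x)) for x in xs)

# Logarithm: re-scaling x by lam shifts y by the constant A * log(lam).
log_ok = all(math.isclose(logarithmic(lam * x), logarithmic(x) + A * math.log(lam)) for x in xs)

# Linear: a linear change of x results in a linear change of y.
linear_ok = all(
    math.isclose(linear(lam * x + x0), lam * linear(x) + A * x0 + b * (1 - lam))
    for x in xs
)

print(power_ok, exp_ok, log_ok, linear_ok)
```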

We will also describe engineering situations where such dependencies appear, including:

the description of how people change their ratings of different goods,
inverse distance weighting in geosciences, and
the use of so-called soft-max in deep learning.
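
As a small illustration of how invariance shows up in the last example: the standard soft-max formula does not change if we shift all its inputs by the same constant, i.e., if we change the starting point. The sketch below checks this numerically; it uses the standard textbook definition of soft-max and is not taken from the course papers.

```python
import math

def softmax(xs):
    """Standard soft-max: p_i = exp(x_i) / sum_j exp(x_j).
    Subtracting max(xs) first is a usual numerical-stability trick;
    it is legitimate precisely because soft-max is shift-invariant."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

scores = [1.0, 2.0, 3.5]              # illustrative inputs
shifted = [x + 10.0 for x in scores]  # same inputs with a new starting point

shift_invariant = all(
    math.isclose(p, q) for p, q in zip(softmax(scores), softmax(shifted))
)
print(softmax(scores), shift_invariant)
```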

In all the above cases, x and y are two different -- but related -- quantities. We will also consider a special case when x and y are related values of the same quantity. In this case, the corresponding invariance will explain the appearance of Rectified Linear Unit (ReLU) y = max(0,x) -- the main activation function in deep neural networks.
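
For ReLU, the relevant property is easy to check: if we change the measuring unit, i.e., multiply x by a positive constant, then y = max(0,x) is multiplied by the same constant. The sketch below verifies this for a few illustrative values; whether this is exactly the form of invariance used in the papers is an assumption of this example.

```python
import math

def relu(x):
    """Rectified Linear Unit: y = max(0, x)."""
    return max(0.0, x)

lam = 2.5                          # hypothetical positive re-scaling factor
xs = [-3.0, -0.5, 0.0, 0.5, 4.0]   # illustrative inputs

# Changing the unit of x by a factor lam > 0 changes y by the same factor.
scale_invariant = all(math.isclose(relu(lam * x), lam * relu(x)) for x in xs)
print(scale_invariant)
```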

Materials:

Sections 2, 4, 4.1, 4.3, 4.6, and Appendix of Paper 6; see also detailed proof.
Paper 7.
Paper 3.
Sections 1, 2, 5, and 7 of Paper 4.

Fifth topic: optimization and how it is related to invariance. Several of the above-described problems are about looking for the best formula or the best method, i.e., about optimization. In situations when we have a symmetry -- i.e., a transformation with respect to which important features are invariant -- it is reasonable to assume that the relative quality of different alternatives (formulas or algorithms) should not change if we simply change the measuring unit and/or the starting point of the corresponding quantity.

We will show that in this case, the optimal alternative should also be invariant. This explains the effectiveness of invariant empirical formulas in engineering -- and, in particular, the effectiveness of ReLU units and of soft-max in deep learning.

Materials: Sections 2, 5, and 7 of Paper 4.

Sixth topic: what next -- possible ideas. As we have mentioned, often, simple invariant dependencies provide only an approximation to the actual dependence.

In some cases, it is not a very accurate approximation, so more accurate formulas are clearly needed.
In other cases, it is a reasonably accurate approximation, but still a more accurate formula is desirable.

In both cases, it is desirable to come up with more accurate formulas. To come up with such formulas, let us recall what simplifying assumptions we made in the above description of invariant dependencies y = f(x) -- so that we can think of weakening these assumptions and thus, coming up with more accurate formulas.

We made the following assumptions:

that the quantity y directly depends on the quantity x;
that there is only one way how x affects y;
that we are looking for a universal formula y = f(x) describing all possible situations; and
that the only invariances are invariances with respect to the change of the measuring unit and/or starting point.

In practice, all these assumptions may be violated, which leads to four directions in which more complex formulas can appear.

Materials: Section 3 of Paper 6.

Seventh topic: what if the dependence of y on x is indirect. In many practical situations, the dependence of the desired quantity y on the known quantity x is indirect:

y depends on some auxiliary quantity z, and
this auxiliary quantity z, in its turn, depends on x.

We may have two or more such auxiliary quantities.

In this case, the dependence of y on x is described not by one of the above four formulas describing the direct dependence, but by the composition of such formulas. For example, if y = f(z) and z = g(x), then y = f(g(x)).
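
As a minimal sketch of such a composition, suppose (purely for illustration) that z depends on x via a power law and y depends on z exponentially. All parameter values below are hypothetical.

```python
import math

def power_law(x, A, b):
    return A * x ** b

def exponential(z, A, b):
    return A * math.exp(b * z)

def compose(f, g):
    """Return the composition x -> f(g(x))."""
    return lambda x: f(g(x))

# Hypothetical example: z = g(x) = 0.5 * x^2 and y = f(z) = 3 * exp(-z).
g = lambda x: power_law(x, A=0.5, b=2.0)
f = lambda z: exponential(z, A=3.0, b=-1.0)

# The resulting dependence is y = 3 * exp(-0.5 * x^2), which is
# none of the four basic invariant dependencies.
y = compose(f, g)
print(y(0.0), y(2.0))
```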

We will study such composition functions, and we will describe engineering examples where such compositions are indeed experimentally observed.

Materials:

Sections 3, 4.2, 4.4, and 4.5 of Paper 6.
Paper 1.

Eighth topic: what if there are several ways how x affects y. In some situations, there are two or more different ways in which x can influence y. We have empirical formulas describing each of these ways -- they usually correspond to specific cases when one of these ways is dominant and the others can be ignored.

To describe a general situation, in which all these ways are important, we need to combine the known formulas. We will show how invariance requirements can help select an appropriate combination -- and we will provide engineering examples where the resulting combined formulas work well.

Materials: Sections 3 and 4.6 of Paper 6.

Ninth topic: what if in different situations, the dependence of y on x is different? In such situations, we cannot use a single function, we need to come up with a family of functions -- so that different functions from this family will describe different situations. A natural -- and probably the simplest -- way to describe a family of functions is to consider linear combinations of a few selected functions e₁(x), ..., e_n(x):

f(x) = c₁ * e₁(x) + ... + c_n * e_n(x).

In the case when we were looking for a single function, it was reasonable to assume that this function is invariant. Similarly, in this case, it is also reasonable to assume that the family of functions is invariant.
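
To see what invariance of a family means, consider the family spanned by the functions 1, x, and x^2. This basis is chosen here only for illustration; the families studied in class may use other functions. If we shift x by x0, each function from this family turns into another function from the same family, only with different coefficients. The sketch below checks this numerically.

```python
import math

def f(x, c):
    """Member of the family: f(x) = c0 * 1 + c1 * x + c2 * x^2."""
    c0, c1, c2 = c
    return c0 + c1 * x + c2 * x ** 2

def shifted_coeffs(c, x0):
    """Coefficients of the function x -> f(x + x0), obtained by expanding
    c0 + c1 * (x + x0) + c2 * (x + x0)^2 in powers of x."""
    c0, c1, c2 = c
    return (c0 + c1 * x0 + c2 * x0 ** 2, c1 + 2 * c2 * x0, c2)

c = (1.0, -2.0, 0.5)   # hypothetical coefficients
x0 = 3.0               # hypothetical change of the starting point
c_new = shifted_coeffs(c, x0)

# The shifted function coincides with the family member with coefficients c_new.
family_invariant = all(
    math.isclose(f(x + x0, c), f(x, c_new)) for x in [-1.0, 0.0, 1.0, 2.5]
)
print(c_new, family_invariant)
```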

We will study such invariant families. Examples will include a biomedical application: how to describe the growth of a tumor.

Materials:

Section 2 of Paper 5.
Paper 2.

Tenth topic: what if transformations are more complex than changing the measuring unit and/or the starting point? As we have mentioned, in many situations, the transformations are more complex than changing the measuring unit and/or the starting point. In this course, we will consider the simplest case of such complex transformations, when we transform a single quantity (as opposed to, e.g., rotation in a plane, that transforms the numerical values of both coordinates x and y).

It turns out that in this case, the transformations are fractionally linear, i.e., have the form y = (a * x + b) / (c * x + d). We will analyze functions invariant under such transformations, and we will show that this explains the efficiency of the sigmoid activation function 1/(1 + exp(-x)) that is used both in shallow and in deep neural networks.
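
One way to see the connection between the sigmoid function s(x) = 1/(1 + exp(-x)) and fractionally linear transformations: if we shift x by a constant, then the new value s(x + shift) is a fractionally linear function of the old value s(x), namely s / ((1 - k) * s + k), where k = exp(-shift). The sketch below checks this identity numerically; it is only an illustration, and the argument presented in class may be organized differently.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def fractional_linear(s, a, b, c, d):
    """Fractionally linear transformation s -> (a * s + b) / (c * s + d)."""
    return (a * s + b) / (c * s + d)

shift = 1.3                # hypothetical shift of x
k = math.exp(-shift)
xs = [-2.0, 0.0, 0.5, 3.0]

# Shifting x transforms the sigmoid value by a fractionally linear map
# with coefficients a = 1, b = 0, c = 1 - k, d = k.
identity_holds = all(
    math.isclose(
        sigmoid(x + shift),
        fractional_linear(sigmoid(x), 1.0, 0.0, 1.0 - k, k),
    )
    for x in xs
)
print(identity_holds)
```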

Materials: Sections 3 and 7 of Paper 4.

Projects. The main purpose of all this activity is to help engineering applications. Students are therefore required to work on projects.

Projects should be either related to real-life applications, or have a more general theoretical nature. There are two types of projects: a possible project and an ideal project.

A possible project: reviewing a paper. A project may consist of reviewing some related paper(s) -- and presenting this review to the class. It is OK for two or three students to work on the same project.

If you select this project, then you need to first check with the instructor whether the paper is appropriate, to make sure that:

it is related to the class topic,
it is not too complicated, and
it has some technical content, such as a formula or an algorithm.

If some part of the paper is not clear, do not hesitate to ask the instructor or -- if this paper is related to your research -- your research supervisor.

In your report, please make sure to concentrate on the following questions:

what is the general practical problem that this paper aims to solve;
what was known before this paper and what were the remaining challenges;
what are the results of this paper, and in what sense they are better than what was done before;
what are the remaining challenges -- if any.

Also, in your report, make sure that:

this report includes some technical content, i.e., a formula or an algorithm, and
it is clear to other students in the class: all notations are explained, and all terms which are not part of standard CS knowledge are explained.

It is OK to skip some part of the paper which is too complex.

Please send your report to the instructor a week before the scheduled presentation, so that we will be able, if needed, to clarify possibly not very clear parts of the report.

An ideal project. An ideal project should be creative. In such a project, students -- individually or in groups -- will come up with something new.

How is this possible? The invariance-based approach is a new, developing topic. If there are some interesting empirical formulas that have no convincing theoretical explanations:

why not try to explain them by using the techniques we learn in class?
or, better yet, why not try to come up with more accurate formulas?

HOMEWORKS. Each topic comes with home assignments. After the deadline, I will post the correct solution to the corresponding home assignment. Since I will be posting correct solutions to homeworks, it does not make any sense to accept late assignments: once a solution is posted, it makes no sense for you to copy it in your own handwriting, since this does not indicate any understanding. So, please try to submit your assignments on time.

Things happen. If there is an emergency situation and you cannot submit an assignment on time, let me know; you will then not be penalized -- and I will come up with a similar but different assignment that you can submit when you become available again.

TESTS. There will be three tests, tentatively on September 22, November 3, and November 22, and the final exam on December 6, 1-3:45 pm. If you are unable to attend a test, let me know; I will organize a different version of the test at a time convenient for you.

GRADES: Maximum number of points:

first test: 10
second test: 10
third test: 15
home assignments: 10
final exam: 35
project: 20

(smart projects with ideas that can turn into a serious scientific publication get up to 40 points).

A good project can help, but it cannot completely cover possible deficiencies of knowledge as shown on the tests and on the homeworks. In general, up to 80 points come from tests and home assignments. So:

to get an A, you must gain, on all the tests and home assignments, at least 90% of the possible amount of points (i.e., at least 72), and also at least 90 points overall;
to get a B, you must gain, on all the tests and home assignments, at least 80% of the possible amount of points (i.e., at least 64), and also at least 80 points overall;
to get a C, you must gain, on all the tests and home assignments, at least 70% of the possible amount of points (i.e., at least 56), and also at least 70 points overall.
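
The above rules can be summarized by the following minimal Python sketch. The function name and the handling of scores below the C thresholds are assumptions made only for this illustration: the syllabus does not specify grades below C, so the sketch returns None in that case.

```python
# Thresholds taken from the rules above:
# (letter, minimum on tests and home assignments, minimum overall).
THRESHOLDS = (("A", 72, 90), ("B", 64, 80), ("C", 56, 70))

def letter_grade(tests_and_homework, project):
    """Return the letter grade implied by the two conditions above.

    tests_and_homework: points for the three tests, home assignments,
                        and the final exam (at most 80);
    project:            points for the project (normally at most 20).
    Returns None if neither A, B, nor C conditions are met
    (this case is not described in the syllabus).
    """
    overall = tests_and_homework + project
    for letter, min_tests, min_overall in THRESHOLDS:
        if tests_and_homework >= min_tests and overall >= min_overall:
            return letter
    return None

print(letter_grade(72, 20), letter_grade(75, 10), letter_grade(60, 40))
```

For example, 60 points on tests and home assignments together with an exceptional 40-point project give 100 points overall, but still only satisfy the conditions for a C, since 60 is below the 64 points needed for a B.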

WE WILL OVERCOME. Topics that we study in this class are not easy topics, but hopefully, you will all do well: this material is not as difficult as many things you have successfully mastered in your classes so far.

SPECIAL ACCOMMODATIONS: If you have a disability and need special accommodations -- e.g., extra time on the exams -- please contact the Center for Accommodations and Support Services (CASS) at 747-5148 or by email to cass@utep.edu. For additional information, please visit the CASS website at http://www.sa.utep.edu/cass. CASS's staff are the only individuals who can validate and, if need be, authorize accommodations for students.

SCHOLASTIC DISHONESTY: Any student who commits an act of scholastic dishonesty is subject to discipline. Scholastic dishonesty includes, but is not limited to, cheating, plagiarism, collusion, and submission for credit of any work or materials that are attributable to another person.

Cheating is:

copying from the test paper of another student;
communicating with another student during a test to be taken individually;
giving or seeking aid from another student during a test to be taken individually;
possession and/or use of unauthorized materials during tests (e.g., crib notes, class notes, books, etc.);
substituting for another person to take a test;
falsifying research data, reports, academic work offered for credit.

Plagiarism is:

using someone's work in your assignments without the proper citations;
submitting the same paper or assignment from a different course, without direct permission of instructors.

To avoid plagiarism see: https://www.utep.edu/student-affairs/osccr/_Files/docs/Avoiding-Plagiarism.pdf

Collusion is unauthorized collaboration with another person in preparing academic assignments.

Instructors are required to -- and will -- report academic dishonesty and any other violation of the Standards of Conduct to the Dean of Students.

Notes: When in doubt on any of the above, please contact your instructor to check whether you are following an authorized procedure.

MATERIALS USED IN THE CLASS: there is no textbook; instead, we will use the following papers (listed in the usual alphabetic order):

1. Pedro Barragan Olague and Vladik Kreinovich, "A Symmetry-Based Explanation for an Empirical Model of Fatigue Damage of Composite Materials", Journal of Uncertain Systems, 2018, Vol. 12, No. 3, pp. 176-179.
2. Pedro Barragan Olague and Vladik Kreinovich, "Why growth of cancerous tumors is Gompertzian: a symmetry-based explanation", Cybernetics and Physics, 2017, Vol. 6, No. 1, pp. 13-18.
3. Laxman Bokati, Aaron Velasco, and Vladik Kreinovich, "Scale-Invariance and Fuzzy Techniques Explain the Empirical Success of Inverse Distance Weighting and of Dual Inverse Distance Weighting in Geosciences", Proceedings of the Annual Conference of the North American Fuzzy Information Processing Society NAFIPS'2020, Redmond, Washington, August 20-22, 2020, pp. 379-390.
4. Vladik Kreinovich and Olga Kosheleva, "Optimization under uncertainty explains empirical success of deep learning heuristics", In: Panos Pardalos, Varvara Rasskazova, and Michael N. Vrahatis (eds.), Black Box Optimization, Machine Learning and No-Free Lunch Theorems, Springer, Cham, Switzerland, 2021, pp. 195-220.
5. Vladik Kreinovich, Anh H. Ly, Olga Kosheleva, and Songsak Sriboonchitta, "Efficient Parameter-Estimating Algorithms for Symmetry-Motivated Models: Econometrics and Beyond", In: Ly H. Anh, Le Si Dong, Vladik Kreinovich, and Nguyen Ngoc Thach (eds.), Econometrics for Financial Applications, Springer Verlag, Cham, Switzerland, 2018, pp. 134-145.
6. Edgar Daniel Rodriguez Velasquez, Vladik Kreinovich, and Olga Kosheleva, "Invariance-Based Approach: General Methods and Pavement Engineering Case Study", International Journal of General Systems, 2021, DOI: 10.1080/03081079.2021.1953005.
7. Julio Urenda, Manuel Hernandez, Natalia Villanueva-Rosales, and Vladik Kreinovich, "How User Ratings Change with Time: Theoretical Explanation of an Empirical Formula", Proceedings of the Annual Conference of the North American Fuzzy Information Processing Society NAFIPS'2020, Redmond, Washington, August 20-22, 2020, pp. 427-432.