CLASS TIME: TR 12-1:20 pm
INSTRUCTOR: Vladik Kreinovich, email vladik@utep.edu, office CCSB 3.0404, office phone (915) 747-6951.
PREREQUISITES:
Introduction: formulation of the problems. The main purpose of computers is to process real-life data, so that we will be able:
In some situations -- e.g., in basic mechanics -- we have fundamental from-first-principles laws that enable us to make the corresponding predictions. However, in many other situations, especially in engineering, we only have approximate empirical formulas. For example, it is not possible to predict, based on the first principles, how pavement will deteriorate with time, or how people will change their opinions about goods.
In such situations, we face the following problems:
Similarly, in many engineering applications, there are semi-empirical methods for solving the corresponding problems. In such cases, similar problems appear:
First topic: how to solve these problems -- the idea of invariance. A natural way to make predictions is to look at what happened in similar situations in the past. And what does "similar" mean? It means that some important features of the current situation are the same (or at least almost the same) as the same important features of the past situation. In other words, there may have been some changes between the two situations, but the important features did not change.
In mathematical terms:
Not surprisingly, symmetries and invariances are, at present, among the most effective tools in theoretical physics and in many other disciplines. In line with this reasoning, in this class, we will study the invariance-based approach to solving the above problems.
Materials: Section 1 of Paper 6.
A simple example. How do we know that if we drop a pen, it will start falling down with the acceleration of 9.81 m/sec^{2}? Because we -- and others -- repeated this experiment in many different locations, at many moments of time, and always observed the same result.
In this case, the current situation can be obtained from the previous one by shifts in space and time, maybe by rotation -- so the observed phenomenon is invariant with respect to all these transformations.
Second topic: what transformations we will study in this class. Some transformations studied in physics are very complicated, such as changing particles to corresponding anti-particles. In this class, we will focus on the simplest and most natural transformations.
These transformations come from the fact that:
For some quantities like temperature or time, there is no fixed starting point, so numerical values also depend on what starting point we choose. For example, the difference between Fahrenheit and Celsius scales for measuring temperature is that these two scales use different measuring units and different starting points.
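The Fahrenheit and Celsius example can be sketched in a few lines. This is only an illustration: the helper name rescale and the seconds-to-minutes example are assumptions added here, not part of the course materials. It shows that changing the measuring unit multiplies the numerical value by a constant, and changing the starting point adds a constant.

```python
def rescale(x, a, b):
    """Change of measuring unit (factor a) and of starting point (offset b)."""
    return a * x + b


def celsius_to_fahrenheit(c):
    # Unit change: one degree Celsius is 1.8 degrees Fahrenheit.
    # Starting-point change: 0 degrees Celsius corresponds to 32 degrees Fahrenheit.
    return rescale(c, a=1.8, b=32.0)


def seconds_to_minutes(s):
    # Pure change of measuring unit: the starting point stays the same.
    return rescale(s, a=1.0 / 60.0, b=0.0)


boiling_f = celsius_to_fahrenheit(100.0)   # boiling point of water
freezing_f = celsius_to_fahrenheit(0.0)    # freezing point of water
ninety_sec = seconds_to_minutes(90.0)
print(boiling_f, freezing_f, ninety_sec)
```

Both conversions are instances of the same linear transformation x -> a * x + b, with b = 0 when only the unit changes.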
Materials: Section 2 of Paper 6.
Third topic: corresponding invariances. In many situations, there is no fixed measuring unit; the choice of a measuring unit is simply a matter of convention. In this case, it is reasonable to assume that the desired empirical formula has the same form no matter what measuring unit we select.
Of course, each formula relates the values of several quantities. So, if we change the measuring unit for one of the quantities, we need to appropriately change the measuring unit for other quantities. As an example, let us consider the formula d = v * t that describes the traveled distance d as a function of velocity v and time t. This formula remains true whether we measure distance in kilometers or in miles. However, for this formula to remain true when we switch from miles to kilometers, we also need to change the units for measuring velocity from miles per hour to kilometers per hour.
Similarly, in some formulas, if we change the starting point for measuring one of the quantities, we may need to appropriately re-scale other quantities.
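The d = v * t example above can be checked numerically. The sketch below is a minimal illustration with made-up values (60 miles per hour for 2 hours); the variable names are assumptions introduced here. It shows that the formula keeps its form when distance and velocity units are changed together, and fails when only the distance unit is changed.

```python
KM_PER_MILE = 1.609344  # definition of the international mile


def distance(v, t):
    """The formula d = v * t, with no units built in."""
    return v * t


v_mph, t_hours = 60.0, 2.0
d_miles = distance(v_mph, t_hours)

# Consistent change of units: miles -> km and miles/hour -> km/hour.
v_kmh = v_mph * KM_PER_MILE
d_km = distance(v_kmh, t_hours)
consistent = abs(d_km - d_miles * KM_PER_MILE) < 1e-9

# Inconsistent change: distance in km, but velocity still in miles/hour.
mismatched = abs(distance(v_mph, t_hours) - d_miles * KM_PER_MILE) < 1e-9

print(consistent, mismatched)
```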
Materials: Section 2 of Paper 6.
Fourth topic: which dependencies are invariant with respect to these transformations. This is what we will study first: which functions are invariant with respect to changing the measuring unit and changing the starting point. We will show that, in effect, the only invariant dependencies are:
We will also describe engineering situations where such dependencies appear, including:
In all the above cases, x and y are two different -- but related -- quantities. We will also consider a special case when x and y are related values of the same quantity. In this case, the corresponding invariance will explain the appearance of Rectified Linear Unit (ReLU) y = max(0,x) -- the main activation function in deep neural networks.
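As a hedged illustration of what invariance can mean for ReLU: since x and y are values of the same quantity, a change of measuring unit multiplies both by the same positive factor c. The sketch below assumes this reading of the invariance (the exact formulation is in the course materials) and checks numerically that max(0, c * x) = c * max(0, x), while a function such as x^2 does not have this property. The helper is_scale_invariant is a hypothetical name introduced here.

```python
def relu(x):
    """Rectified Linear Unit y = max(0, x)."""
    return max(0.0, x)


def is_scale_invariant(f, xs, cs, tol=1e-12):
    """Check f(c * x) == c * f(x) for all sample points x and factors c > 0."""
    return all(abs(f(c * x) - c * f(x)) <= tol for x in xs for c in cs)


xs = [-3.0, -0.5, 0.0, 0.5, 3.0]   # sample values of the quantity
cs = [0.1, 2.0, 1000.0]            # sample unit-change factors (all positive)

relu_ok = is_scale_invariant(relu, xs, cs)
square_ok = is_scale_invariant(lambda x: x * x, xs, cs)
print(relu_ok, square_ok)
```

A numerical check on sample points is not a proof; it only illustrates the property that the class derives mathematically.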
Materials:
Fifth topic: optimization and how it is related to invariance. Several of the above-described problems are about looking for the best formula or the best method, i.e., about optimization. In situations when we have a symmetry -- i.e., a transformation with respect to which important features are invariant -- it is reasonable to assume that the relative quality of different alternatives (formulas or algorithms) should not change if we simply change the measuring unit and/or the starting point of the corresponding quantity.
We will show that in this case, the optimal alternative should also be invariant. This explains the effectiveness of invariant empirical formulas in engineering -- and, in particular, the effectiveness of ReLU units and of soft-max in deep learning.
Materials: Sections 2, 5, and 7 of Paper 4.
Sixth topic: what next -- possible ideas. As we have mentioned, often, simple invariant transformations provide only an approximation to the actual dependence.
We made the following assumptions:
In practice, all these assumptions may be violated, which leads to four directions in which more complex formulas can appear.
Materials: Section 3 of Paper 6.
Seventh topic: what if the dependence of y on x is indirect. In many practical situations, the dependence of the desired quantity y on the known quantity x is indirect:
In this case, the dependence of y on x is described not by one of the above four formulas describing the direct dependence, but by the composition of such formulas. For example, if y = f(z) and z = g(x), then y = f(g(x)).
We will study such composition functions, and we will describe engineering examples where such compositions are indeed experimentally observed.
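A minimal sketch of such a composition follows. The two functions are hypothetical stand-ins chosen only for illustration (an exponential dependence z = g(x) and a power-law dependence y = f(z)); they are not taken from the course materials.

```python
import math


def compose(f, g):
    """Return the function x -> f(g(x))."""
    return lambda x: f(g(x))


def g(x):
    # Hypothetical intermediate dependence: z = exp(0.5 * x).
    return math.exp(0.5 * x)


def f(z):
    # Hypothetical final dependence: y = 3 * z^2.
    return 3.0 * z ** 2


y_of_x = compose(f, g)

# For these particular choices, f(g(x)) simplifies to 3 * exp(x).
x0 = 1.0
print(y_of_x(x0), 3.0 * math.exp(x0))
```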
Materials:
Eighth topic: what if there are several ways in which x affects y. In some situations, there are two or more different ways in which x can influence y. We have empirical formulas describing each of these ways -- they usually correspond to specific cases when one of these ways is dominant and the others can be ignored.
To describe a general situation, in which all these ways are important, we need to combine the known formulas. We will show how invariance requirements can help select an appropriate combination -- and we will provide engineering examples where the resulting combined formulas work well.
Materials: Sections 3 and 4.6 of Paper 6.
Ninth topic: what if in different situations, the dependence of y on x is different? In such situations, we cannot use a single function, we need to come up with a family of functions -- so that different functions from this family will describe different situations. A natural -- and probably the simplest -- way to describe a family of functions is to consider linear combinations of a few selected functions e_{1}(x), ..., e_{n}(x):
We will study such invariant families. Examples will include a biomedical application: how to describe the growth of a tumor.
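To make the idea of a family of linear combinations concrete, here is a small sketch. It assumes, for illustration only, the basis e_1(x) = 1 and e_2(x) = x, and it reads "invariant" as "closed under changing the starting point x -> x + a"; both choices are assumptions made here, not statements from the course materials. The point is that shifting x maps each member of the family to another member, with new coefficients.

```python
def linear_combination(coeffs, basis):
    """Return the function x -> C_1 * e_1(x) + ... + C_n * e_n(x)."""
    return lambda x: sum(c * e(x) for c, e in zip(coeffs, basis))


# Assumed basis for this sketch: e_1(x) = 1, e_2(x) = x.
basis = [lambda x: 1.0, lambda x: x]


def shift_coeffs(coeffs, a):
    """Coefficients of x -> f(x + a) for f(x) = C_1 + C_2 * x."""
    c1, c2 = coeffs
    return (c1 + c2 * a, c2)


coeffs = (2.0, 3.0)                 # f(x) = 2 + 3 * x
a = 5.0                             # shift of the starting point
f_member = linear_combination(coeffs, basis)
g_member = linear_combination(shift_coeffs(coeffs, a), basis)

# f(x + a) and g(x) agree, so the shifted function is again in the family.
same = all(abs(f_member(x + a) - g_member(x)) < 1e-12 for x in [-1.0, 0.0, 2.5])
print(same)
```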
Materials:
Tenth topic: what if transformations are more complex than changing the measuring unit and/or the starting point? As we have mentioned, in many situations, the transformations are more complex than changing the measuring unit and/or the starting point. In this course, we will consider the simplest case of such complex transformations, when we transform a single quantity (as opposed to, e.g., rotation in a plane, which transforms the numerical values of both coordinates x and y).
It turns out that in this case, the transformations are fractionally linear, i.e., have the form y = (a * x + b) / (c * x + d). We will analyze functions invariant under such transformations, and we will show that this explains the efficiency of the sigmoid activation function 1/(1 + exp(-x)) that is used both in shallow and in deep neural networks.
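One standard property of fractionally linear transformations is that applying one after another again gives a fractionally linear transformation, with coefficients obtained as for a product of 2x2 matrices. The sketch below checks this numerically; the function names and the sample coefficients are illustrative assumptions, not taken from the course materials.

```python
def fractional_linear(a, b, c, d):
    """Return the transformation x -> (a * x + b) / (c * x + d)."""
    def t(x):
        return (a * x + b) / (c * x + d)
    return t


def compose_coeffs(m1, m2):
    """Coefficients (a, b, c, d) of x -> t1(t2(x)), as a 2x2 matrix product."""
    a1, b1, c1, d1 = m1
    a2, b2, c2, d2 = m2
    return (a1 * a2 + b1 * c2, a1 * b2 + b1 * d2,
            c1 * a2 + d1 * c2, c1 * b2 + d1 * d2)


m1 = (1.0, 2.0, 3.0, 4.0)   # sample coefficients, chosen arbitrarily
m2 = (2.0, 0.0, 1.0, 1.0)

t1 = fractional_linear(*m1)
t2 = fractional_linear(*m2)
t12 = fractional_linear(*compose_coeffs(m1, m2))

x0 = 0.5
step_by_step = t1(t2(x0))   # apply t2, then t1
in_one_step = t12(x0)       # apply the composed transformation directly
print(step_by_step, in_one_step)
```

The sample point is chosen so that no denominator is zero; a more careful implementation would handle that case explicitly.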
Materials: Sections 3 and 7 of Paper 4.
Projects. The main purpose of all this activity is to help engineering applications. Students are therefore required to work on projects. Projects should either be related to real-life applications or have a more general theoretical nature. There are two types of projects: a possible project and an ideal project.
A possible project: reviewing a paper. A project may consist of reviewing some related paper(s) -- and presenting this review to the class. It is OK for two or three students to work on the same project.
If you select this project, then you need to first check with the instructor whether the paper is appropriate, to make sure that:
If some part of the paper is not clear, do not hesitate to ask the instructor or -- if this paper is related to your research -- your research supervisor.
In your report, please make sure to concentrate on the following questions:
Please send your report to the instructor a week before the scheduled presentation, so that we will be able, if needed, to clarify any unclear parts of the report.
An ideal project. An ideal project should be creative. In such a project, students -- individually or in groups -- will come up with something new.
How is this possible? The invariance-based approach is a new, developing topic. If there are some interesting empirical formulas that have no convincing theoretical explanations:
HOMEWORKS. Each topic means home assignments. After the deadline, I will post the correct solution to the corresponding home assignment. Since I will be posting correct solutions to homeworks, it does not make any sense to accept late assignments: once a solution is posted, it makes no sense for you to copy it in your own handwriting; this does not indicate any understanding. So, please try to submit your assignments on time.
Things happen. If there is an emergency and you cannot submit an assignment on time, let me know; you will then not be penalized -- and I will come up with a similar but different assignment that you can submit when you become available again.
TESTS. There will be three tests, tentatively on September 22, November 3, and November 22, and the final exam on December 6, 1-3:45 pm. If you are unable to attend a test, let me know; I will organize a different version of the test at a time convenient for you.
GRADES: Maximum number of points:
A good project can help but it cannot completely cover possible deficiencies of knowledge as shown on the test and on the homeworks. In general, up to 80 points come from tests and home assignments. So:
WE WILL OVERCOME. Topics that we study in this class are not easy, but hopefully you will all do well: they are not as difficult as many things you have successfully mastered in your classes so far.
SPECIAL ACCOMMODATIONS: If you have a disability and need special accommodations -- e.g., extra time on the exams -- please contact the Center for Accommodations and Support Services (CASS) at 747-5148 or by email to cass@utep.edu. For additional information, please visit the CASS website at http://www.sa.utep.edu/cass. CASS's staff are the only individuals who can validate and, if need be, authorize accommodations for students.
SCHOLASTIC DISHONESTY: Any student who commits an act of scholastic dishonesty is subject to discipline. Scholastic dishonesty includes, but is not limited to, cheating, plagiarism, collusion, and submission for credit of any work or materials that are attributable to another person.
Cheating is:
Collusion is unauthorized collaboration with another person in preparing academic assignments.
Instructors are required to -- and will -- report academic dishonesty and any other violation of the Standards of Conduct to the Dean of Students.
Notes: When in doubt on any of the above, please contact your instructor to check whether you are following an authorized procedure.
MATERIALS USED IN THE CLASS: there is no textbook; instead, we will use the following papers (listed in the usual alphabetic order):