Abstracts of the
4th Joint UTEP/NMSU Workshop on Mathematics,
Computer Science, and Computational Sciences

University of Texas at El Paso
El Paso, Texas
Saturday, November 8, 2008

Roberto Araiza
(Bioinformatics Program, UTEP)

Ribonucleic acid, or RNA, is a nucleic acid consisting of a long sequence of chained nucleotide units. Nucleotides consist of a sugar, a phosphate group, and a nitrogenous base, which in RNA is generally adenine (A), cytosine (C), guanine (G), or uracil (U). RNA plays many important roles within a cell: for example, messenger RNA (mRNA) carries the protein-building blueprint from the DNA template to the ribosomes (the protein-building sites); transfer RNA (tRNA) transfers amino acids (the protein-building blocks) to the ribosomes; and ribosomal RNA (rRNA), the central component of the ribosome, decodes mRNA and builds the protein by chaining together the amino acids transferred by tRNA.

To achieve functionality, RNA molecules often require a specific tertiary structure, defined by the three-dimensional atomic coordinates. The scaffold for the tertiary structure is provided by the secondary structure, defined by the hydrogen bonds between nitrogenous bases. In RNA, typically cytosine (C) bonds with guanine (G) and adenine (A) bonds with uracil (U), although other base-pairs are possible.

Most RNA secondary structures are defined by these base-pairings between nucleotides. If we represent an RNA molecule as a string over the alphabet [GUCA], the RNA molecule 'GUCACCCCUGAC' can base-pair positions 1-4 with positions 9-12 and create a stem-loop (where the pairs form the stem, or helix, and the unpaired positions form the loop), a common secondary-structure pattern that is a building-block for more complex secondary structures.
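The stem-check described above can be sketched in a few lines of Python. This is a hypothetical helper, not the abstract's actual tool; it verifies only Watson-Crick pairs (C-G and A-U) and uses the abstract's 1-based positions.

```python
# Watson-Crick base pairs allowed in an RNA stem (other pairs, e.g. G-U
# wobble pairs, are possible in practice but omitted in this sketch).
PAIRS = {("G", "C"), ("C", "G"), ("A", "U"), ("U", "A")}

def can_form_stem(seq, left, right, length):
    """Check whether positions left..left+length-1 can base-pair with
    positions right..right+length-1 read in reverse (1-based indexing)."""
    for k in range(length):
        base5 = seq[left - 1 + k]                 # k-th base of the 5' half
        base3 = seq[right - 1 + length - 1 - k]   # matching base of the 3' half
        if (base5, base3) not in PAIRS:
            return False
    return True

# The example above: positions 1-4 pair with positions 9-12 in 'GUCACCCCUGAC'.
print(can_form_stem("GUCACCCCUGAC", 1, 9, 4))  # True
```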

Several approaches to secondary structure prediction consider finding these palindromic inverted repeats, which equate to base-pairs in a molecule. For example, EMBOSS ("European Molecular Biology Open Software Suite") provides a palindrome utility that searches for inverted repeats.

Results from EMBOSS's palindrome utility are correct but sometimes not very concise. For example, for a sequence 'AAAACCCCUUUUUUUUUU', palindrome reports possible base-pairings 1-4:9-12, 1-4:10-13, 1-4:11-14, 1-4:12-15, 1-4:13-16, 1-4:14-17, and 1-4:15-18. If we use intervals to represent this data, we can report instead base-pairs [1, 4]:[9, 18].
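The consolidation step can be illustrated with a small Python sketch. This is an assumed simplification: it merges reports that share the same 5' block, taking the hull of their 3' blocks, and does not re-check that the merged blocks actually abut as a real tool would.

```python
def merge_pairings(pairings):
    """Merge base-pairing reports that share the same 5' block into a
    single interval report.  Each report is ((a, b), (c, d)), meaning the
    block [a, b] pairs into [c, d] (1-based positions)."""
    merged = {}
    for five, three in pairings:
        lo, hi = merged.get(five, three)
        merged[five] = (min(lo, three[0]), max(hi, three[1]))
    return merged

# The seven reports from the example: 1-4:9-12, 1-4:10-13, ..., 1-4:15-18.
reports = [((1, 4), (9 + k, 12 + k)) for k in range(7)]
print(merge_pairings(reports))  # {(1, 4): (9, 18)}
```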

The approach can be extended to more complex sequences where the helices consist of more than one base. For example, for a sequence 'CAAAACCCCUUUUUUUUUUG' we have pairings [1-5]:[10-20] with bulges of sizes [0-6]. Bulges arise because stem-halves need not be contiguous; for example, we can have, in the above sequence, pairings 1:20 and 2-5:14-17, with positions 18 and 19 being in a bulge. Further constraints, such as reasonable loop and bulge sizes, can also be developed using intervals.

Sarala Arunagiri, Patricia J. Teller, Seetharami Seelam, Ron A. Oldfield, Maria Ruiz Varela, and Rolf Riesen
(Presenter from the Department of Computer Science, UTEP)

The massive scale of current and next-generation massively parallel processing (MPP) systems presents significant challenges related to fault tolerance. In particular, the standard approach to fault tolerance, application-directed, periodic checkpointing, puts an incredible strain on the storage system and the interconnection network. This results in overheads that severely impact application performance and scalability. Checkpoint overhead can be reduced by decreasing the checkpoint latency, which is the time to write a checkpoint file, or by increasing the length of the checkpoint interval, which defines both the compute time between checkpoints and the frequency with which checkpoint files are written. However, in the presence of failures, increasing the length of the checkpoint interval may increase application execution time.

In this paper, we analyze a well known mathematical model, developed by researchers at Los Alamos National Laboratory (LANL), of the execution time of applications that perform periodic checkpoints. Via this analysis, we derive properties that demonstrate that the length of the checkpoint interval can be increased beyond the optimal value defined by the model with only a small bounded increase in execution time. While it is clear that increasing the length of the checkpoint interval decreases the frequency of checkpoint file writes, it is not clear if the aggregate number of checkpoint I/O operations decreases. This is because increasing the length of the checkpoint interval increases the execution time, which, in turn, increases the expected number of failures, thereby increasing the expected number of checkpoint file read operations. The paper presents simulation results which show that the number of checkpoint I/O operations indeed decreases as the length of the checkpoint interval increases. In these simulations we have included a few uncertainties that more closely reflect reality and, thereby, relax some of the idealizations of analytical modeling. In addition, the simulation results establish a need for additional analytical modeling work related to the aggregate number of checkpoint operations. Finally, we describe other possible uses of our work in high performance computing.
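The flatness of the cost curve around the optimal checkpoint interval can be illustrated with a first-order model in the Young/Daly style. This is a stand-in sketch, not the exact LANL model analyzed in the paper, and the workload numbers below are made up; restart cost is ignored for simplicity.

```python
import math

def expected_time(work, tau, delta, mtbf):
    """Expected completion time under periodic checkpointing with
    exponentially distributed failures (mean time mtbf) and checkpoint
    latency delta, for a checkpoint interval tau (Daly-style model)."""
    return (work / tau) * mtbf * (math.exp((tau + delta) / mtbf) - 1.0)

W, delta, M = 1000.0, 1.0, 100.0               # hours of work, latency, MTBF
tau_opt = math.sqrt(2.0 * delta * M) - delta   # Young/Daly first-order optimum
t_opt = expected_time(W, tau_opt, delta, M)
t_double = expected_time(W, 2.0 * tau_opt, delta, M)

# Doubling the interval past the optimum costs only a few percent:
print(round(100.0 * (t_double / t_opt - 1.0), 1))
```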

Neelabh Baijal and Luc Longpre
(Department of Computer Science, UTEP)

An important challenge with statistical databases is to protect the privacy of an individual when aggregate data is released. Cell suppression is a commonly used technique to this end: it involves the suppression of additional non-sensitive data to restrict inferences about the sensitive data. We suggest another approach, which we call cell blurring. Instead of completely suppressing some cells, we replace the value in a cell by an interval. This has the advantage of distributing the uncertainty introduced by the suppression more evenly across the entire data. We propose a number of algorithms that, given a statistical database, a table to be published, the cells in the database that are considered sensitive, and a degree of privacy that needs to be guaranteed on the sensitive cells, compute a new table of intervals that may be published while preserving privacy.
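The data representation behind cell blurring can be sketched as follows. This is a hypothetical illustration only: it spreads a fixed +/- radius over every cell in a row containing a sensitive cell, whereas the paper's algorithms choose interval widths to guarantee a required degree of privacy.

```python
def blur_cells(table, sensitive, radius):
    """Publish each cell as an interval (lo, hi) instead of suppressing it.
    table:     list of rows of numeric values
    sensitive: set of (row, col) positions considered sensitive
    radius:    half-width of the blur applied to rows with sensitive cells
    (Toy policy for illustration; not the paper's actual algorithm.)"""
    blurred = []
    for i, row in enumerate(table):
        if any((i, j) in sensitive for j in range(len(row))):
            blurred.append([(v - radius, v + radius) for v in row])
        else:
            # exact values are published as degenerate intervals
            blurred.append([(v, v) for v in row])
    return blurred

print(blur_cells([[10, 20], [30, 40]], {(0, 1)}, 5))
# [[(5, 15), (15, 25)], [(30, 30), (40, 40)]]
```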

Martine Ceberio and Vladik Kreinovich
(Department of Computer Science, UTEP)

The main objective of our talk is to show that tensors can help in computing - and to explain how.

Let us first recall what tensors are, what their origin in physics is, and why they can be helpful in computing. Physics starts with measuring and describing the values of different physical quantities -- and goes on to equations which enable us to predict the values of these quantities. A measuring instrument usually returns a single numerical value. For some physical quantities (like mass m), the single measured value is sufficient to describe the quantity. For other quantities, we need several values: e.g., to describe the electric field E at a given point, we need to know the values of the three components of this field Ex, Ey, and Ez corresponding to three different coordinate axes. To know the characteristics related to tension inside a solid body, we need to know even more parameters, etc. In the 19th century, the equations of physics (e.g., Maxwell's equations) contained a separate equation for each component of the field. These equations were cumbersome to describe and difficult to solve.

Physical equations -- and the resulting computations -- became very much simplified when it turned out that we can describe all the components of a physical field as a single mathematical object -- a vector ai, or, more generally, a tensor aij, aijk, ... Originally, mostly vectors (rank-1 tensors) were used, but in the 20th century quantum physics brought in matrices (rank-2 tensors), and relativity theory brought in even higher-order tensors such as the rank-4 curvature tensor.

In many applications of computing, we are now facing a problem similar to what physics faced in the 19th century: that we need to store and process large amounts of data. We show that in many such situations, tensors can help.

Our optimism is justified by the fact that in many of these situations, there are already heuristic techniques that help -- techniques which are, in effect, tensor-related. For example, the standard way of multiplying two matrices C = AB is very inefficient for large matrices A and B which cannot be placed into the fast-to-access part of the computer memory ("cache") - because this way requires a lot of time-consuming data transfers ("cache misses") between different memory parts. To speed up computations, practitioners represent each large matrix A as a "matrix" of block matrices Aij, and use the fact that matrix multiplication can be represented in terms of these blocks: Cij = Ai1B1j + ... + AinBnj. In effect, this means that each element of the original matrix is now represented as the (a,b)-th element of a block Aij, i.e., as an element of a rank-4 tensor (Aij)ab. (The fact that an increase in rank improves efficiency should not be surprising: a representation of a rank-1 vector as a rank-2 spinor works in relativistic quantum physics.)
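The blocked scheme above can be sketched directly. This is a minimal NumPy illustration (assuming square matrices whose size is a multiple of the block size); a production implementation would tune the block size to the cache.

```python
import numpy as np

def blocked_matmul(A, B, bs):
    """Block matrix multiplication: view each matrix as a 'matrix of blocks',
    i.e., a rank-4 tensor (A[i][j])[a][b], and accumulate
    C[i][j] = sum over k of A[i][k] @ B[k][j], one block at a time."""
    n = A.shape[0]
    C = np.zeros((n, n))
    for i in range(0, n, bs):
        for j in range(0, n, bs):
            for k in range(0, n, bs):
                C[i:i+bs, j:j+bs] += A[i:i+bs, k:k+bs] @ B[k:k+bs, j:j+bs]
    return C

rng = np.random.default_rng(0)
A, B = rng.random((8, 8)), rng.random((8, 8))
print(np.allclose(blocked_matmul(A, B, 4), A @ B))  # True
```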

Tensors can be used to naturally describe constraints. Indeed, a general constraint between n real-valued quantities is a subset of Rn, and such a subset can be (approximately) described block-by-block - i.e., as a boolean-valued tensor. It turns out that processing such constraint-related sets can also be naturally described in tensor terms.

Tensors also naturally appear in an efficient Taylor series approach to uncertainty propagation. Here, the resulting tensors are symmetric, which poses a special computational challenge: decreasing the r!-fold duplication of the standard computer representation. Tensors naturally appear in quantum computing as well, since in quantum physics, the set of states of a composite system is the tensor product of the corresponding sets of states.

So far, we have shown that tensors can help computing, but the relation between tensors and computing can also help physics. For example, to describe Kaluza-Klein-type high-dimensional space-time models of modern physics, Einstein himself pioneered the use of "tensors" with integer or circular values; from the mathematical viewpoint, such "tensors" are unusual, but in computer terms, integer or circular data types are very natural - and even more efficient to process than standard real numbers.

Manali Chakraborty and Olac Fuentes
(Department of Computer Science, UTEP)

In this paper we propose a method for automating the process of detecting regions of motion in a video sequence in real time. The main idea of this work is to detect motion based on both structure and color. The detection using structure is carried out with the aid of information gathered from the Census Transform, a method that characterizes the local intensity patterns in an image region. Color-based detection is done using color histograms, which allow efficient characterization without prior assumptions about color distribution in the scene. The probabilities from the Census Transform and Color histogram are combined in a robust way to detect the zones of active motion.
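The structure descriptor can be illustrated with a minimal Census Transform for 3x3 neighborhoods: each interior pixel is encoded as an 8-bit value recording which neighbors are darker than the center. This is a sketch of the general technique only; the paper's window size, bit ordering, and border handling may differ.

```python
import numpy as np

def census_transform(img):
    """3x3 Census Transform: for each interior pixel, set one bit per
    neighbor that is strictly darker than the center pixel."""
    h, w = img.shape
    out = np.zeros((h - 2, w - 2), dtype=np.uint8)
    center = img[1:h - 1, 1:w - 1]
    bit = 0
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy == 0 and dx == 0:
                continue  # skip the center pixel itself
            neighbor = img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
            out |= (neighbor < center).astype(np.uint8) << bit
            bit += 1
    return out

img = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(census_transform(img))  # [[15]]: neighbors 1, 2, 3, 4 are darker than 5
```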

We demonstrate the effectiveness of our approach by showing initial results of its application as an initial step to detect regions of interest in a foveal vision system. In this foveal system, high resolution is applied to regions where motion is detected, while the rest of the image is sensed, transmitted and stored in low-resolution.

Behzad Djafari-Rouhani
(Department of Mathematical Sciences, UTEP)

abstract in pdf

Art Duval
(Department of Mathematical Sciences, UTEP)

Cayley's famous result that there are n^(n-2) spanning trees in a complete graph on n vertices has been proved, and generalized, in many ways. One particularly nice way is the Matrix-Tree Theorem: the number of spanning trees of an arbitrary graph can be enumerated by the determinant of a reduced Laplacian matrix.
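The graph case of the Matrix-Tree Theorem is easy to compute directly; here is a small sketch that recovers Cayley's count for the complete graph K4.

```python
import numpy as np

def count_spanning_trees(adj):
    """Matrix-Tree Theorem: the number of spanning trees equals the
    determinant of the Laplacian with any one row and column deleted."""
    A = np.array(adj, dtype=float)
    L = np.diag(A.sum(axis=1)) - A   # Laplacian = degree matrix - adjacency
    reduced = L[1:, 1:]              # delete row 0 and column 0
    return round(np.linalg.det(reduced))

# Complete graph K4: Cayley's formula gives 4^(4-2) = 16 spanning trees.
K4 = [[0, 1, 1, 1], [1, 0, 1, 1], [1, 1, 0, 1], [1, 1, 1, 0]]
print(count_spanning_trees(K4))  # 16
```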

Building upon the work of Kalai and Adin, we extend the concept of a spanning tree from graphs to simplicial complexes, which are just higher-dimensional analogs of graphs. For all complexes K satisfying a mild technical condition, we show that the simplicial spanning trees of K can be enumerated using its Laplacian matrices, generalizing the Matrix-Tree theorem. As in the graphic case, replacing the Laplacian with a weighted analogue yields extra information about the simplicial spanning trees being counted. As an example, we find a nice expression for the resulting weighted tree enumerator of shifted complexes, by generalizing a formula for the Laplacian eigenvalues of a shifted complex to the weighted case.

This is joint work with Carly Klivans and Jeremy Martin.

John Harding
(Department of Mathematical Sciences, NMSU)

We give a brief overview of quantum logic and the recently introduced subject of categorical quantum mechanics. We outline a recently discovered link between the two subjects, and discuss several examples. One example also has a link to coding theory.

Naga Suman Kanagala, Edgar Padilla, Essau Ramirez, Angel Silva, Cesar Valenzuela, and Martine Ceberio
(Department of Computer Science, UTEP)

Many neurological diseases, such as cerebral palsy in children, strokes, or Parkinson's disease, can cause abnormal changes in gait, or "gait disorders", and significantly affect the ability of patients to walk without assistance. Intensive rehabilitation therapy is usually the key for patients to improve their walking ability. However, therapy is usually long and relies on human experts to perform it: it involves a lot of repetition of exercises guided by therapists. As a result, it is not as efficient as one would expect, and it is very costly. The key to improving rehabilitation is to automate both the diagnosis of gait pathologies and the therapy itself.

Automatically diagnosing gait pathologies requires the ability to assess gait. Nowadays, gait is assessed through observation and/or computer-based analysis of some characteristics, e.g., the angle between the shinbone and the femur, or the height of the ankle (using markers placed on the patient's joints: typically ankle, knee, hip). Current diagnosis techniques are based on observations; current research on automating gait diagnosis makes use of video and data recordings. This research is mostly oriented towards the use of fuzzy logic or neural networks, in order to identify features that will help differentiate between healthy and unhealthy gaits. Since it is also expected that diagnosis will involve comparing an observed gait to known healthy gaits, current research is also concerned with recognizing scaling and shifting, so that the comparison can be performed properly. However, although important, the problem of inconsistencies in the gait cycles of the same patient (e.g., patients walking faster or slower, patients having a very distorted gait) has not yet been addressed, nor has the need for determining models (of both healthy and unhealthy patients).

In this work, we focus on the diagnosis part of the problem. Our work makes use of optimization and constraint programming techniques, aiming at extracting characteristics (models) and at being independent of inconsistencies in the cycles / patients (e.g., height, weight, age) / speed. In particular, we propose a new technique to automatically detect gait cycles, and to determine whether or not a gait is unhealthy.

The work we describe in this abstract was performed on data recorded from healthy patients, obtained from markers placed on their ankle, knee, and hip joints. We developed a simple filtering algorithm that identifies, online, the significant points among the recorded data and exhibits the pattern of a normal gait. From the result of this algorithm, we were also able to isolate cycles and correlate them between data from different markers, in order to validate our patterns. The patterns were then used to define constraint-solving problems: this step allows us to check whether a gait is unhealthy, i.e., does not match the expected pattern.

The novelty of this work for gait analysis lies in the fact that we are interested only in patterns, which leads us to a qualitative assessment of the gait, rather than a quantitative assessment that, in our opinion, relies too much on the data being perfect (i.e., having no inconsistencies).

Preliminary results show the promise of our work, and the developed tool can be extended to support the detection of more inconsistencies in the recordings than those experimented with and reported in this article.

Amine Khamsi
(Department of Mathematical Sciences, UTEP)

In this work, we give a characterization of the existence of minimal elements in partially ordered sets in terms of fixed points of multivalued maps. This characterization shows that the assumptions in Caristi's fixed point theorem can, a priori, be weakened. Finally, we discuss Kirk's problem on an extension of Caristi's theorem and prove a new positive result which illustrates the weakening mentioned above.

Jaime Nava, Vladik Kreinovich, Guillermo Restrepo, and Douglas J. Klein
(presenter from Department of Computer Science, UTEP)

In many applications, we have numerous molecules that are obtained from a "template" molecule like benzene C6H6 or cubane C8H8 by replacing some of its hydrogen atoms with other atoms or atom groups (called ligands). Depending on how many original atoms are replaced and which ones are replaced, we get a large number of different chemical substances. It is desirable to be able, based on the measurements performed on a small number of such substances, to accurately predict the characteristics (such as energy) of all similar substances.

Such predictions are very important, since, e.g., cubanes, while kinetically stable, are highly reactive. As a result, at present, they are actively used as high-density, high-energy fuels and explosives, and researchers are investigating the potential of using cubanes (and similarly high-energy molecules) in medicine and nanotechnology.

In their 2007 paper in the Journal of Mathematical Chemistry, Teodora Ivanciuc and Douglas J. Klein use complex poset-related ideas of Gian-Carlo Rota to make such predictions. In our talk, we show that similar predictions can be made by using much simpler Taylor series techniques.

Inna Pivkina
(Department of Computer Science, NMSU)

Curricular materials in discrete mathematics, algorithm design, automata, graph theory, and logic have been developed by a team consisting of Guram Bezhanishvili, Hing Leung, Jerry Lodder, David Pengelley, Inna Pivkina, Desh Ranjan from New Mexico State University, and Janet Barnett from Colorado State University - Pueblo. The materials (projects) are based on primary historical sources. Using original historical sources provides context and direction to subject matter, honing the students' verbal and deductive skills through reading the work of some of the greatest minds in history. Classroom testing of the projects shows improved student attitude and improved performance after completing a course with historical projects. More information is available at http://www.cs.nmsu.edu/historical-projects/

Andrzej Pownuk
(Department of Mathematical Sciences, UTEP)

Automatic grading of students' homework improves the quality of teaching at the university. There are many commercial companies which provide on-line web applications for this purpose (WebAssign, MyMathLab, etc.). It is also possible to get some free software (e.g., WeBWorK).

In this presentation, the author would like to show an example of an on-line grading system written in ASP.NET (Web Forms), C#, and MSSQL Express. The structure of the system can be very easily customized in order to satisfy the needs of each particular subject. This feature is particularly important in the teaching of graduate courses.

Homework can be created using C#, VB, and LaTeX.

In order to grade on-line homework efficiently, it is necessary to estimate the quality of students' answers. This process requires special mathematical tools. The grade is a measure that tells the teacher how close the student's knowledge is to the knowledge necessary to pass each particular test. Because of this, the process of on-line grading can be described using fuzzy set theory and the theory of imprecise probability.

The system which automatically grades students' homework has the following internet address

Karen Villaverde and Gilbert Ornelas
(Department of Computer Science, NMSU, and Department of Computer Science, UTEP)

One of the main tasks of science and engineering is to use the current values of the physical quantities for predicting the future values of the desired quantities. Due to the (inevitable) measurement inaccuracy, we usually know the current values of the physical quantities with interval uncertainty. Traditionally, it is assumed that all the processes are continuous; as a result, the range of possible values of the future quantities is also known with interval uncertainty. However, in many practical situations (such as phase transitions), the dependence of the future values on the current ones becomes discontinuous. We show that in such cases, initial interval uncertainties can lead to arbitrary bounded closed ranges of possible values of the future quantities. We also show that the possibility of such a discontinuity may drastically increase the computational complexity of the corresponding range prediction problem.
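The effect of a discontinuity on the predicted range can be illustrated numerically. This is an assumed toy example, not from the paper: a quantity that jumps at a "phase transition" maps the input interval [-1, 1] onto two separated pieces rather than onto a single interval.

```python
import numpy as np

def f_jump(x):
    """A prediction map with a jump of size 10 at the transition point x = 0."""
    return np.where(x < 0.0, x, x + 10.0)

xs = np.linspace(-1.0, 1.0, 10001)   # dense sampling of the input interval [-1, 1]
ys = f_jump(xs)

print(ys.min(), ys.max())            # extremes: -1.0 and 11.0
# The image is NOT the single interval [-1, 11]: no value lands in the gap
# (0, 10), so the set of possible future values is a union of two pieces.
print(((ys > 0.0) & (ys < 10.0)).any())  # False
```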

Piotr Wojciechowski
(Department of Mathematical Sciences, UTEP)

abstract in pdf

John Harding and Qin Yang
(Department of Mathematical Sciences, NMSU)

abstract in pdf

Jun Zheng and Olac Fuentes
(Department of Computer Science, UTEP)

A Stochastic Method for Resolution-enhancement of Face Images

When provided with high-resolution, occlusion-free images of faces in frontal view, automated face recognition systems can provide accuracies that are comparable to those obtained by humans. However, in surveillance applications, cameras are usually set up with wide fields of view in order to capture as much of the scene as possible. This normally results in low-resolution images of the objects of interest. A new family of approaches, based on statistical machine learning, aims at analyzing large data sets of images of a particular class, for example faces, and learning the mapping from low-quality to high-quality images of that class. This enables them to infer, for example, the most likely high-resolution face image depicting the same person as a low-resolution image given as input. These resolution-enhancement algorithms are time-consuming, due to the need for exhaustive search in a database of models. This work improves the efficiency of face hallucination using stochastic search in local modeling. Experimental results show that the proposed algorithm can generate high-quality face images from low-resolution input while reducing the computational time dramatically.