CS 2401 Assignment #3

Due Date: Sunday, September 20, 2009.

Objective: The goal of this assignment is to practice recursion.

Motivation. All the genetic information about every living creature is stored in this creature's DNA. A DNA is a long sequence of "code letters" T, C, A, G; e.g., it may have the form TCAAGTG... Whether a creature will have a tail or blue eyes is determined by parts of the DNA called genes.

In a DNA sequence, there are no blank spaces or other separators; so, how does nature determine when a gene starts and when it ends? One way is by looping. Looping is based on the fact that some letters are attracted to each other: T is attracted to A, and C is attracted to G. So, they easily form pairs: T--A and C--G. Thus, if we have a sequence like AACGT at the beginning of the gene and an "opposite" sequence ACGTT at the end:

AACGT --------------------------- ACGTT
then these code letters will stick to each other: the last letter T of the beginning sequence sticks to the first letter A of the ending sequence, the last-but-one letter G of the beginning sequence sticks to the second letter C of the ending sequence, etc.
     ------
     \    /
      T--A
      |  |
      G--C
      |  |
      C--G
      |  |
      A--T
      |  |
      A--T
As a result, the whole gene between the beginning and the ending sequences is encompassed into a loop: In biology, a sequence flanked by such pairs (like AACGT and ACGTT) is called a bio-palindrome. It is important to mention that there is an important difference between these biological "palindromes" and traditional computer science palindromes like acbca: Because of the biological relevance, the problem of detecting such "bio-palindromes" is important in computational biology.

In this lab, we will analyze a simplified ("toy") version of this problem, when there are no letters between the beginning and the ending sequences. For example, a sequence AACGTACGTT is a bio-palindrome in this sense.

Assignment. Write a recursive program that, given a DNA sequence, checks whether this sequence is a bio-palindrome.

For extra credit: