Automation of Unit Testing

Summary

Program testing is expensive and labor-intensive, often consuming more than half of total development costs, yet it is frequently done poorly and its results are not always satisfactory. Nevertheless, testing is the primary method for ensuring that programs comply with their requirements. The aim of this work is to completely automate unit testing of object-oriented programs. We investigate the use of an evolutionary approach, such as genetic algorithms, for test data generation and the use of program specifications, written in JML, for test result determination. A prototype tool will be developed to show that complete automation is not only feasible but also effective for unit testing Java programs. Automated testing techniques such as ours can complement manual testing by covering a significant portion of object-oriented programs, since methods in object-oriented programs tend to be small; manual testing can then focus on more interesting problems, e.g., inter-class testing.

Approach

The ultimate goal of this work is to completely automate unit testing of object-oriented programs. A programmer should be able to perform unit testing with a single click of a button or a single command execution.

Complete automation of program testing involves automating three components of testing: test data selection, the test oracle, and test execution. The essence of our approach is to combine JML and genetic algorithms (see Figure 1). In addition, JUnit is used as the test execution platform. JUnit is an open-source unit testing framework for Java that provides a way to organize test data and perform test execution. In JUnit, one has to write Java code, called a test class, that describes test data, invokes the methods to be tested, and determines test results.

Figure 1. Automation of unit testing with JML, JUnit, and genetic algorithms

JML is used both as a tool for describing test oracles and as a basis for generating test data. Our approach uses specifications as test oracles, and JML is used to write such specifications. Each class to be tested is assumed to be annotated with JML assertions, such as pre- and postconditions and class invariants, that describe the behavior of the class. JML's runtime assertion checker is used to detect assertion violations at run time and to interpret them as either test success or failure. For this, a JUnit test class called a test oracle class is generated automatically.
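The decision rule behind such a test oracle class can be sketched in plain Java. This is only an illustration under stated assumptions, not the generated code: the exception classes EntryPreconditionViolation and AssertionViolation, the Verdict enum, and the hand-written checks inside Account.withdraw are hypothetical stand-ins for what a runtime assertion checker would raise from compiled JML annotations. The sketch also assumes that a precondition violation on method entry marks a test as meaningless rather than failed.

```java
public class OracleSketch {
    /** Hypothetical stand-in for a precondition violation detected on method entry. */
    static class EntryPreconditionViolation extends RuntimeException {
        EntryPreconditionViolation(String msg) { super(msg); }
    }

    /** Hypothetical stand-in for any other violation (postcondition, invariant). */
    static class AssertionViolation extends RuntimeException {
        AssertionViolation(String msg) { super(msg); }
    }

    enum Verdict { PASS, FAIL, MEANINGLESS }

    /**
     * Hypothetical class under test. The explicit checks mimic a specification
     * such as: requires 0 < amount && amount <= balance;
     *          ensures balance == \old(balance) - amount;
     */
    static class Account {
        private int balance;
        private final boolean buggy;

        Account(int balance, boolean buggy) {
            this.balance = balance;
            this.buggy = buggy;
        }

        void withdraw(int amount) {
            if (!(0 < amount && amount <= balance)) {
                throw new EntryPreconditionViolation("requires clause violated");
            }
            int old = balance;
            balance -= buggy ? amount + 1 : amount; // seeded defect when buggy
            if (balance != old - amount) {
                throw new AssertionViolation("ensures clause violated");
            }
        }
    }

    /** Runs one test and turns the observed violation, if any, into a verdict. */
    static Verdict runTest(Account receiver, int amount) {
        try {
            receiver.withdraw(amount);
            return Verdict.PASS;          // no violation: the specification held
        } catch (EntryPreconditionViolation e) {
            return Verdict.MEANINGLESS;   // the test data itself was invalid
        } catch (AssertionViolation e) {
            return Verdict.FAIL;          // the code broke its specification
        }
    }

    public static void main(String[] args) {
        System.out.println(runTest(new Account(100, false), 30));  // PASS
        System.out.println(runTest(new Account(100, false), 500)); // MEANINGLESS
        System.out.println(runTest(new Account(100, true), 30));   // FAIL
    }
}
```

The point of the sketch is that the test code contains no expected values: the verdict comes entirely from which kind of specification violation, if any, is observed.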

One important aspect of our work is the automation of test data generation. This is done by using genetic algorithms (see Figure 1). An initial set of test data is randomly generated based on the signature information of classes. The fitness value of individual test data is calculated based on either black-box coverage (e.g., condition coverage on postconditions) or white-box coverage (e.g., branch coverage). The set of test data is filtered by using the precondition to eliminate so-called meaningless tests, that is, tests whose inputs do not satisfy the precondition of the method under test. Thus, JML assertions are used both as test oracles and to generate test data. If the set of test data satisfies a preset coverage criterion, a suitable set of test data has been found. Otherwise, the test data population is enriched by applying genetic operations such as crossover and mutation: a pair of existing test data is selected and combined to create new test data, and some test data are mutated. This evolutionary process is repeated until a suitable set of test data is found. The final set of test data is generated in a form that can be exercised by the JUnit testing framework.
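The evolutionary loop described above can be sketched as follows. This is a minimal illustration, not the prototype tool: the method under test (compare), its precondition, the value range, the tournament selection scheme, and all parameter values are assumptions chosen to keep the example short. Fitness here is white-box (newly covered branches), and a generation cap is added so the loop always terminates.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;
import java.util.Random;

public class GaTestDataSketch {
    static final int BRANCHES = 3;

    /** Assumed precondition of the method under test: both arguments non-negative. */
    static boolean precondition(int[] t) { return t[0] >= 0 && t[1] >= 0; }

    /** Hypothetical method under test, instrumented to record the branch taken. */
    static int compare(int x, int y, BitSet covered) {
        if (x < y) { covered.set(0); return -1; }
        if (x > y) { covered.set(1); return 1; }
        covered.set(2);
        return 0;
    }

    /** Fitness: -1 for a meaningless test, else the number of newly covered branches. */
    static int fitness(int[] t, BitSet suiteCoverage) {
        if (!precondition(t)) return -1;
        BitSet own = new BitSet(BRANCHES);
        compare(t[0], t[1], own);
        own.andNot(suiteCoverage);
        return own.cardinality();
    }

    /** Binary tournament: of two random individuals, return the fitter one. */
    static int[] select(List<int[]> pop, BitSet coverage, Random rnd) {
        int[] a = pop.get(rnd.nextInt(pop.size()));
        int[] b = pop.get(rnd.nextInt(pop.size()));
        return fitness(a, coverage) >= fitness(b, coverage) ? a : b;
    }

    static List<int[]> generate(long seed, int populationSize, int maxGenerations) {
        Random rnd = new Random(seed);
        // Initial population: random values that match the signature (int, int).
        List<int[]> population = new ArrayList<>();
        for (int i = 0; i < populationSize; i++) {
            population.add(new int[] { rnd.nextInt(16) - 4, rnd.nextInt(16) - 4 });
        }
        List<int[]> suite = new ArrayList<>();
        BitSet coverage = new BitSet(BRANCHES);
        for (int gen = 0; gen < maxGenerations; gen++) {
            for (int[] t : population) {
                if (fitness(t, coverage) > 0) {        // valid and adds coverage
                    suite.add(t);
                    compare(t[0], t[1], coverage);
                }
            }
            if (coverage.cardinality() == BRANCHES) break; // criterion satisfied
            List<int[]> next = new ArrayList<>();
            while (next.size() < populationSize) {
                int[] p1 = select(population, coverage, rnd);
                int[] p2 = select(population, coverage, rnd);
                int[] child = { p1[0], p2[1] };            // crossover
                if (rnd.nextInt(4) == 0) {                 // mutation
                    child[rnd.nextInt(2)] += rnd.nextInt(7) - 3;
                }
                next.add(child);
            }
            population = next;
        }
        return suite;
    }

    public static void main(String[] args) {
        for (int[] t : generate(42L, 20, 50)) {
            System.out.println("compare(" + t[0] + ", " + t[1] + ")");
        }
    }
}
```

Each retained test covers a branch that no earlier test covered, so the resulting suite is small; a real tool would emit these inputs as JUnit test data instead of printing them.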

Research Issues

In addition to the engineering challenge of extending, adapting, and integrating different components, such as JML, JUnit, and genetic algorithms, the key research component of our work is to advance genetic algorithm techniques for generating test data for object-oriented programs. In particular, our study focuses on answering the following questions.

  1. How to encode objects and values as chromosomes and genes? An efficient way needs to be defined to represent test data as chromosomes. A chromosome is a sequence of genes, e.g., a receiver and arguments for a method call. In object-oriented programs, a gene may be an object or a primitive value. For an object gene, its state may be described as a sequence of statements that need to be executed to bring the object to the state of interest. However, the fact that an object's state is encapsulated creates a fundamental difficulty: it is not trivial, and often impossible, to automatically create an object in the desired state.
  2. What genetic operations should be used? We need to define genetic operations and an evolutionary process to create a better set of chromosomes. Well-known genetic operations such as crossover and mutation may be employed, but care must be taken to ensure that the test data matches the signature of the method being tested. It is not clear, however, whether manipulating objects as bit patterns would be effective.
  3. How to determine the fitness of test data? We need to define the fitness of individual test data and of the test set as a whole. Our plan is to study both specification-based coverage and program-based coverage. In particular, condition coverage on the postcondition seems promising, though we are still evaluating its effectiveness. In terms of tool support, we need to generate coverage information from specifications or source code, by extending JML's runtime assertion checker or by instrumenting source code.
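One possible encoding for the first two questions can be sketched as follows. This is an illustration under assumptions, not a settled design: the Gene interface, the IntGene and StackGene classes, and the choice of java.util.ArrayDeque's contains method as the method under test are all hypothetical. A primitive gene holds its value directly, an object gene holds the statements (here, a list of push calls) that rebuild the object's state, and crossover exchanges whole genes at matching positions so that the child still fits the method signature.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ChromosomeSketch {
    /** One slot of a method call: the receiver or an argument. */
    interface Gene {
        Class<?> type();
        Object build();
    }

    /** Primitive gene: the value is stored directly. */
    static final class IntGene implements Gene {
        final int value;
        IntGene(int value) { this.value = value; }
        public Class<?> type() { return int.class; }
        public Object build() { return Integer.valueOf(value); }
    }

    /** Object gene: state is described by the statements that produce it. */
    static final class StackGene implements Gene {
        final int[] pushes; // each entry stands for one statement "s.push(v)"
        StackGene(int... pushes) { this.pushes = pushes.clone(); }
        public Class<?> type() { return ArrayDeque.class; }
        public Object build() {
            ArrayDeque<Integer> s = new ArrayDeque<>();
            for (int v : pushes) s.push(v);
            return s;
        }
    }

    /** A chromosome is a sequence of genes: receiver first, then arguments. */
    static List<Gene> chromosome(Gene... genes) {
        return new ArrayList<>(Arrays.asList(genes));
    }

    /** One-point crossover at a gene boundary; positions, and so types, are preserved. */
    static List<Gene> crossover(List<Gene> a, List<Gene> b, int point) {
        List<Gene> child = new ArrayList<>(a.subList(0, point));
        child.addAll(b.subList(point, b.size()));
        return child;
    }

    /** Checks that every gene has the type the signature expects at its position. */
    static boolean matchesSignature(List<Gene> c, Class<?>... signature) {
        if (c.size() != signature.length) return false;
        for (int i = 0; i < signature.length; i++) {
            if (c.get(i).type() != signature[i]) return false;
        }
        return true;
    }

    /** Decodes the chromosome and invokes the method under test: receiver.contains(arg). */
    static boolean execute(List<Gene> c) {
        @SuppressWarnings("unchecked")
        ArrayDeque<Integer> receiver = (ArrayDeque<Integer>) c.get(0).build();
        Integer arg = (Integer) c.get(1).build();
        return receiver.contains(arg);
    }

    public static void main(String[] args) {
        List<Gene> a = chromosome(new StackGene(1, 2, 3), new IntGene(9));
        List<Gene> b = chromosome(new StackGene(7), new IntGene(2));
        List<Gene> child = crossover(a, b, 1); // receiver from a, argument from b
        System.out.println(matchesSignature(child, ArrayDeque.class, int.class));
        System.out.println(execute(child));
    }
}
```

Because the object gene is built only through public calls, this encoding respects encapsulation, at the cost that states unreachable through the public interface cannot be represented. Mutation of a StackGene would edit its statement list rather than flip bits in the object.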

Implementation

An early proof-of-concept tool is available from Yoonsik Cheon's Software Download Page.

Documentation

For Local Developers


Last modified: $Id: utest.html,v 1.5 2008/10/07 05:57:16 cheon Exp $