Quality assurance teams in software development organizations use various techniques to improve software reliability. Commonly applied techniques include reviews or inspections of design documents and code, additional testing, and the strategic assignment of experienced developers and maintainers to different programming tasks. However, given limited time and resources and the high cost of the verification phase, it is often infeasible to ensure perfect reliability across the entire set of software modules. This thesis aims to answer whether we can help the quality assurance team focus its verification efforts on the parts of the system that are most likely to contain faults. We build on the software engineering hypothesis related to the Pareto principle of fault distribution: in line with this principle, and as validated in existing research, we assume that a large portion of the faults in a software application is located in a small percentage of its components.
The empirical study described in this thesis uses historical product and process data collected from an evolving, object-oriented legacy system written in Java to construct and validate fault-proneness prediction models. Applying data mining techniques such as neural networks and decision trees to the available data, we built fault-proneness prediction models that estimate the probability that a class will contain a fault in the next release. To avoid overfitting and to obtain realistic estimates of model prediction accuracy on future system releases, we built the models using both a 20 percent holdout procedure and cross-validation. Furthermore, on the basis of a cost-effectiveness analysis, we showed that by combining measures of structural properties of the code with change and fault data from previous releases, we were able to build practically useful prediction models. For example, on average across the studied releases, we estimated that the best prediction model finds 63 percent of the total number of faults in the 18 percent of classes predicted to be most fault-prone; these classes correspond to 29 percent of the total lines of code in the system. By focusing or prioritizing testing activities on this relatively small subset of the code base, one may thus find faults faster or with fewer resources.
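To make the cost-effectiveness idea concrete, the following is a minimal sketch of the ranking step described above: classes are ordered by their predicted fault probability, and we measure what fraction of the faults (and of the lines of code) falls in the top slice selected for verification. The function, class data, and all numbers are illustrative assumptions, not the thesis case-study data or its actual models.

```python
# Illustrative sketch of cost-effectiveness ranking: given per-class
# predicted fault probabilities, rank classes and compute the share of
# faults and LOC captured in the most fault-prone fraction.
# All data below is hypothetical, not from the studied system.

def cost_effectiveness(classes, top_fraction):
    """classes: list of (predicted_prob, faults, loc) tuples.
    Returns (fault_fraction, loc_fraction) captured by inspecting the
    top `top_fraction` of classes ranked by predicted probability."""
    ranked = sorted(classes, key=lambda c: c[0], reverse=True)
    k = max(1, round(top_fraction * len(ranked)))
    top = ranked[:k]
    total_faults = sum(c[1] for c in classes)
    total_loc = sum(c[2] for c in classes)
    fault_frac = sum(c[1] for c in top) / total_faults
    loc_frac = sum(c[2] for c in top) / total_loc
    return fault_frac, loc_frac

# Hypothetical release data: (predicted fault probability, faults, LOC)
release = [
    (0.92, 5, 1200), (0.85, 3, 800), (0.71, 2, 1500), (0.55, 1, 400),
    (0.40, 0, 300), (0.33, 1, 900), (0.21, 0, 250), (0.15, 0, 600),
    (0.10, 0, 450), (0.05, 0, 700),
]

faults, loc = cost_effectiveness(release, 0.2)  # inspect top 20% of classes
print(f"faults found: {faults:.0%} of total, in {loc:.0%} of the code")
# → faults found: 67% of total, in 28% of the code
```

A good model concentrates faults in the top of this ranking, so the fault fraction found should be much larger than the code fraction inspected, as in the 63 percent versus 29 percent result reported above.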
On the basis of these promising results, the company we collaborated with has decided to use the technology developed as part of this thesis during its next system release. This will allow an even more thorough evaluation of the costs and benefits of using the prediction models as a decision-making tool to improve the efficiency of quality assurance activities in software development.