This thesis describes the foundation for developing a tool that compares Java programs, or different versions of a program. The tool captures syntactic differences and contextual semantic differences as well. Syntactic differences are “ordinary” changes in the code. This tool works much in the same way as the Unix tool diff, but it is much smarter than diff. This is because it exploits the fact that programs are built differently than ordinary text. The tool diff’s purpose is to compare text, and it will therefore give imprecise or too verbose results. The tool described in this thesis can identify contextual semantic differences because it knows the contexts of methods, meaning that it knows whether methods are directly declared in the class, inherited from implemented interfaces or if methods override the class’ parent’s method.
The approach in this thesis for comparing Java programs is to transform the programs into abstract syntax trees. The transformation from source code to abstract syntax trees are done with the help Strafunski. Strafunski is a software bundle that supports generic programming. The implementation of the tool is done in Haskell. Haskell is a functional programming language.
The work of comparing abstract syntax trees can be broken down into the problem of finding the largest common subtree of two abstract syntax trees and further more, the problem of finding the longest common subsequence of two sequences. This thesis describes and presents new algorithms for doing this and it also describe working Haskell code of the implementation of the tool.