Genomes are constantly exposed to internal and external agents that damage DNA. Insufficient DNA repair can cause developmental defects in the cell or even cancer. There exist several DNA repair pathways, and many genes are involved in these processes. These genes can be grouped into families according to the repair pathway in which they act. The human genome contains approximately 150 known DNA repair genes.
A database of the genes involved in DNA repair in different organisms, recording which of them are homologs of each other, would be a useful tool for scientists. When making mutants in the laboratory to characterize a gene, it is important to know what function that gene has in other organisms.
The database was built on a relational model, where the Natural language Information Analysis Method (NIAM) and the Unified Modeling Language (UML) were used for the design. The MySQL database system and PHP were used extensively to manipulate and retrieve information from the database.
An object role model was made based on what information was needed about the genes, and a class diagram was derived from this model. Wood et al. (2005) provide an overview (http://www.cgal.icnet.uk/DNA_Repair_Genes.html) of human DNA repair genes and the repair mechanisms in which they take part; these genes were the first to be included in the database. Information from the National Center for Biotechnology Information (NCBI), including the HomoloGene system, was used to find homologs of the genes in this overview, along with other essential information about all these genes. Some of the information from NCBI was collected manually, but most of it was derived from files on NCBI's FTP sites. The database tables were filled with this information, and temporary tables were created to extract only the information needed from the large files downloaded from NCBI. A dynamic web interface was built on top of the database, using PHP embedded in HTML together with MySQL.
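The extraction step described above, keeping only the rows of a large downloaded file that belong to the genes of interest, can be sketched as follows. This is a minimal illustration in Python rather than the PHP and MySQL actually used. It assumes a tab-separated, HomoloGene-style layout (group ID, taxonomy ID, gene ID, gene symbol, protein GI, protein accession). The sample rows, gene IDs, symbols and the function name `homologs_of` are hypothetical.

```python
import csv
import io

# Hypothetical sample rows in the assumed layout:
# group_id, taxonomy_id, gene_id, symbol, protein_gi, protein_accession
SAMPLE = (
    "1\t9606\t101\tREPA\t0\tNP_X1\n"
    "1\t10090\t201\tRepa\t0\tNP_X2\n"
    "2\t9606\t102\tOTHER\t0\tNP_X3\n"
    "2\t10090\t202\tOther\t0\tNP_X4\n"
)

def homologs_of(seed_gene_ids, handle):
    """Return {group_id: [(taxonomy_id, gene_id, symbol), ...]} for every
    homology group that contains at least one of the seed gene IDs."""
    rows = [r for r in csv.reader(handle, delimiter="\t") if len(r) >= 4]
    # First pass: find the groups that contain a seed gene.
    wanted_groups = {r[0] for r in rows if r[2] in seed_gene_ids}
    # Second pass: keep every member of those groups, discard the rest.
    groups = {}
    for group_id, tax_id, gene_id, symbol, *_ in rows:
        if group_id in wanted_groups:
            groups.setdefault(group_id, []).append((tax_id, gene_id, symbol))
    return groups

# Seed with one (hypothetical) human repair gene ID.
result = homologs_of({"101"}, io.StringIO(SAMPLE))
```

Here the seed set plays the role of the human genes taken from the overview, and the filtered result corresponds to what a temporary table would hold before the permanent tables are filled.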
The result of this work is a relational database with a dynamic interface. The user can choose which of the 18 organisms to include, and then either select a DNA repair system or run a free-text search from a text field. If a DNA repair system is selected, all genes in that system are grouped into homologs. If the text field is used, the system searches for the string in the gene name, synonym and description. The resulting output is grouped into homologs here as well. For both choices the output provides links to NCBI’s HomoloGene, Taxonomy Browser and Entrez Gene. The Database of Genes Involved in DNA repair is available at http://dna.uio.no/lise/database/rep.php.
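The free-text search described above can be sketched as a single query that matches the string against name, synonym and description, restricts the result to the chosen organisms, and groups the hits by homolog group. This is a minimal illustration using Python with an in-memory SQLite database as a stand-in for MySQL and PHP. The table name `gene`, its columns, and all sample rows are assumptions made for the example, not the schema of the actual database.

```python
import sqlite3

def build_demo_db():
    """Create an in-memory database with a hypothetical single-table schema."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE gene (gene_id INTEGER PRIMARY KEY, organism TEXT, "
        "name TEXT, synonym TEXT, description TEXT, homolog_group INTEGER)"
    )
    con.executemany(
        "INSERT INTO gene VALUES (?, ?, ?, ?, ?, ?)",
        [  # made-up rows for illustration only
            (1, "human", "REPA", "RA1", "hypothetical glycosylase", 10),
            (2, "mouse", "Repa", "", "hypothetical glycosylase", 10),
            (3, "human", "OTHB", "glyc2", "unrelated entry", 20),
            (4, "yeast", "repa1", "", "hypothetical glycosylase", 10),
        ],
    )
    return con

def search(con, text, organisms):
    """Return {homolog_group: [(organism, name), ...]} for genes whose name,
    synonym or description contains `text`, limited to `organisms`."""
    pattern = f"%{text}%"
    marks = ",".join("?" * len(organisms))  # one placeholder per organism
    sql = (
        "SELECT homolog_group, organism, name FROM gene "
        f"WHERE organism IN ({marks}) "
        "AND (name LIKE ? OR synonym LIKE ? OR description LIKE ?) "
        "ORDER BY homolog_group, organism"
    )
    grouped = {}
    params = [*organisms, pattern, pattern, pattern]
    for group, organism, name in con.execute(sql, params):
        grouped.setdefault(group, []).append((organism, name))
    return grouped

con = build_demo_db()
hits = search(con, "glyc", ["human", "mouse"])
```

Passing the search string and organism names as bound parameters, rather than concatenating them into the SQL text, keeps user input from a web form out of the query itself.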