Predicting the function of a hypothetical protein from Pyrococcus horikoshii OT3 as HD domain containing metal-dependent phosphohydrolase
P Babu, K Harshita, V Prasanth, S Chitti
BLASTp, HD domain, hypothetical protein, metal-dependent phosphohydrolase
P Babu, K Harshita, V Prasanth, S Chitti. Predicting the function of a hypothetical protein from Pyrococcus horikoshii OT3 as HD domain containing metal-dependent phosphohydrolase. The Internet Journal of Genomics and Proteomics. 2009 Volume 5 Number 2.
2CQZ, a hypothetical protein from
Many small bacterial, archaebacterial and larger eukaryotic genomes are currently being sequenced. In all genomes sequenced to date, a large portion of these organisms’ protein coding regions encodes polypeptides of unknown biochemical, biophysical, and/or cellular functions. In biochemistry, a hypothetical protein is a protein whose existence has been predicted, but for which there is no experimental evidence that it is expressed
The function of a hypothetical protein can be predicted, with varying levels of confidence, by domain homology searches. Assigning a molecular function to a protein of unknown function starts with determining its three-dimensional structure by either X-ray crystallography or NMR. The structure is then compared against those in the protein structure database (Protein Data Bank). If there are one or more significant structural homologs, the hypothetical protein is expected to have molecular properties similar to those of the homologs (Fields
In this paper we address the prediction of domain regions in a hypothetical protein, 2CQZ, from
Materials and Methods
The PDB (Protein Data Bank) (http://www.rcsb.org/pdb) was searched for hypothetical proteins of unknown function, and the 2CQZ protein (190 residues in length) of
Pairwise Alignment
The 2CQZ sequence was extracted from the PDB in FASTA format. A similarity search was carried out by scanning the sequence against the non-redundant database of NCBI (National Center for Biotechnology Information) (http://www.ncbi.nlm.nih.gov) using the protein-protein BLAST pairwise alignment tool, BLASTp (Altschul
Multiple sequence alignment of the similar proteins identified by the BLAST analysis was performed with MAFFT (Multiple Alignment using Fast Fourier Transform) using default parameters. Multiple alignments are carried out to identify regions of residue conservation among homologous proteins (Yamada
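The idea of reading residue conservation off a multiple alignment can be illustrated with a minimal Python sketch. It is not the MAFFT procedure itself: it assumes the aligned sequences are already available as equal-length strings with "-" for gaps, and the function names (column_conservation, conserved_columns) and the toy alignment are hypothetical, introduced only for illustration.

```python
from collections import Counter


def column_conservation(aligned_seqs):
    """For each alignment column, return the fraction of sequences that
    carry the most common residue. Gaps ("-") never count as the most
    common residue but do count in the denominator."""
    if not aligned_seqs:
        return []
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("aligned sequences must all have the same length")

    scores = []
    for column in zip(*aligned_seqs):
        residues = [c for c in column if c != "-"]
        if not residues:
            scores.append(0.0)  # column made only of gaps
            continue
        top_count = Counter(residues).most_common(1)[0][1]
        scores.append(top_count / len(column))
    return scores


def conserved_columns(aligned_seqs, threshold=1.0):
    """Return 0-based indices of columns whose conservation score is at
    least `threshold` (1.0 means identical in every sequence)."""
    scores = column_conservation(aligned_seqs)
    return [i for i, score in enumerate(scores) if score >= threshold]


# Toy alignment (hypothetical, not taken from the 2CQZ analysis).
toy_alignment = [
    "MKHD-LG",
    "MRHDALG",
    "MKHDVLA",
]
fully_conserved = conserved_columns(toy_alignment)
```

In the toy alignment the fully conserved columns are those holding M, H, D and L. A real analysis would read the aligned sequences from the MAFFT output file instead of a hard-coded list.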
Results and Discussion
A BLASTp search of the 2CQZ protein sequence against the non-redundant database returned many hits with varying percentage identities and similarities. The results are reported in Table 1, where it can be seen that the oxetanocin-like protein (NP_578124.1), metal-dependent phosphohydrolase (YP_182427.1), HD domain (NP_587821.1) and HD domain containing protein 2 (XP_001315286.1) showed similarities ranging from 63% down to 32%, while the remaining hits were other hypothetical proteins. Table 1 therefore suggests that 2CQZ represents a metal-dependent phosphohydrolase and may contain an HD domain. Multiple alignments were therefore constructed between each protein family and 2CQZ to clarify how 2CQZ is related to them.
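For readers unfamiliar with how a percentage identity is obtained from a pairwise alignment, the following Python sketch shows the usual convention of dividing the number of identical aligned positions by the alignment length, gaps included. The function name percent_identity and the two aligned fragments are hypothetical and are not data from Table 1.

```python
def percent_identity(aligned_query, aligned_subject):
    """Percentage of alignment columns in which both sequences carry the
    same residue. Gap columns count toward the alignment length but can
    never count as identities."""
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_query:
        raise ValueError("alignment is empty")

    identical = sum(
        1
        for q, s in zip(aligned_query, aligned_subject)
        if q == s and q != "-"
    )
    return 100.0 * identical / len(aligned_query)


# Toy aligned pair (hypothetical): 5 identical columns out of 7.
toy_identity = percent_identity("HD-LGKA", "HDALGRA")
```

Similarity percentages are computed the same way, except that conservative substitutions scored as positive by the substitution matrix are also counted, which is why similarity is never lower than identity for a given alignment.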
From Table 1 it is evident that the number of residues identical with 2CQZ varied considerably and, moreover, that both the metal-dependent phosphohydrolase and the HD domain containing proteins matched our protein (based on the number of entries). A search of the NCBI nucleotide database for the oxetanocin-like protein revealed one entry on the complete genome of
The HD domain is found in a superfamily of enzymes with either predicted or known phosphohydrolase activity. These enzymes appear to have phosphatase or phosphodiesterase activity and accordingly play a role in signal transduction and possibly other functions in bacteria, archaea and eukaryotes. The HD superfamily is reported to possess highly conserved key metal-binding residues, histidines or aspartates, that are essential for the activity of these proteins (Aravind
Therefore, from the above data and from Pfam entry PF01966 (http://pfam.sanger.ac.uk/family?acc=PF01966), it is observed that oxetanocin-like proteins have an HD domain and are metal-dependent phosphohydrolases. As 2CQZ showed the highest similarity with the oxetanocin-like protein and moderate similarities with metal-dependent phosphohydrolases and HD domain proteins, 2CQZ is expected to also contain an HD domain motif, and further investigation was therefore directed at finding HD domain similarities.
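As a first, very rough screen for the histidine and aspartate residues mentioned above, one can list the positions of adjacent His-Asp (HD) pairs in a sequence. The Python sketch below does only that. It is an illustrative assumption, not the method used in this study: a bare HD dipeptide occurs by chance in many proteins, so a hit is not evidence of an HD domain without support from alignment against known family members. The function name and the toy sequence are hypothetical, and the sequence is not that of 2CQZ.

```python
import re


def find_hd_doublets(sequence):
    """Return the 1-based positions of every histidine that is
    immediately followed by an aspartate in the given protein sequence."""
    seq = sequence.upper()
    # A lookahead is used so the scan advances one residue at a time.
    return [match.start() + 1 for match in re.finditer(r"(?=HD)", seq)]


# Toy sequence (hypothetical): HD pairs start at residues 4 and 9.
toy_sequence = "MKLHDAAGHDLLEKTD"
hd_positions = find_hd_doublets(toy_sequence)
```

The reported positions can then be compared with the conserved columns of a multiple alignment to see whether the doublet falls in a conserved region.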
It has been reported by Zimmerman et al (2008), that