S Kushwaha, P Chauhan, M Shakya
bioprospecting, comparative phylogenetics analysis, metabolic pathways, valuable compounds
S Kushwaha, P Chauhan, M Shakya. Comparative Phylogenetics Approach for Discovering Alternative Source of Taxol. The Internet Journal of Bioengineering. 2008 Volume 3 Number 2.
Bioprospecting is a prominent area of research on commercially valuable compounds because it provides alternative sources of these compounds. One of the most persuasive methods for bioprospecting is molecular phylogenetic analysis. Enzymes involved in the biosynthetic pathway of a compound serve as the basis for bioprospecting. In this paper, an attempt has been made to find alternative sources of the valuable compound taxol by comparative phylogenetic analysis using physicochemical data and sequence data of the enzymes. A dendrogram was generated from the physicochemical data, whereas a phylogenetic tree was generated from the sequence data. Consequently after comparison four different organisms,
Exploration of nature for useful applications, methods, or products is called bioprospecting (biodiversity prospecting). Bioprospecting is a prominent area of research on commercially important compounds: it not only provides alternative sources of these compounds but also improves product quality and makes extraction more cost-effective. One of the most persuasive methods for bioprospecting is molecular phylogenetic analysis. Phylogenetic analysis opens new horizons for this type of study and proves its worth by answering questions about origin, development, and other characteristics that form the basis for bioprospecting. Here, an attempt has been made to find alternative sources of the valuable compound taxol. Taxol (
In the past, morphological data were used for inferring phylogenies. However, the abundance of DNA and protein sequence data now available from a variety of organisms has led to molecular phylogenetic analysis. For bioprospecting, the enzymes involved in the biosynthetic pathway of a compound act as mapping units. Questions concerning enzyme function and performance remain unanswered by molecular data alone; they can, however, be addressed by analyzing the physicochemical data of these enzymes. Physicochemical properties and their level of similarity provide a qualitative measure of these enzymes. Studies in this direction, focusing on individual pathways or on the entire metabolic repertoire, have been attempted. In the present work, an attempt has been made to find an alternative source of taxol by analyzing the taxol biosynthesis pathway.
Taxol biosynthesis in Taxus
Except for a few undefined steps, the complete taxol biosynthetic pathway has been elucidated, and many genes encoding enzymes that regulate the pathway have been cloned and characterized (Table-1). The enzymes involved in taxol biosynthesis belong to the diterpenoid biosynthesis pathways.
In the present work, a comparative phylogenetic analysis using protein sequence and physicochemical data of all 13 enzymes involved in the biosynthetic pathway of taxol was performed to determine alternative sources of taxol. A source whose enzymes gave the same results with physicochemical data and with sequence data was proposed as the most prominent alternative source for taxol biosynthesis.
Materials and methods
BLAST search for taxol biosynthetic enzymes
Nucleotide and protein sequences were retrieved from GenBank by accession number and were then subjected to a BLASTp search against the non-redundant database with the
Data preparation and characterization for phylogenetic analysis
For phylogenetic analysis, the sequence data and physicochemical data of the enzymes were collected, processed, and analyzed.
Physicochemical properties of each enzyme, namely number of amino acids, number of atoms, molecular weight, isoelectric point, negatively charged residues (Asp + Glu), positively charged residues (Arg + Lys), aliphatic index, instability index, hydropathicity, and extinction coefficient, were obtained by submitting its protein sequence to the ExPASy tool ProtParam.
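To make a few of these properties concrete, the following is a minimal sketch of how some of them can be derived from a protein sequence. It is not ProtParam itself: the function name simple_properties is hypothetical, only five of the listed properties are covered, and the formulas assumed here are the Kyte-Doolittle hydropathy scale (for the grand average of hydropathicity) and the Ikai aliphatic index, which ProtParam's documentation describes using.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def simple_properties(sequence):
    """Return a few ProtParam-style properties (illustrative sketch only)."""
    seq = sequence.strip().upper()
    if not seq or any(aa not in KD for aa in seq):
        raise ValueError("expected a non-empty sequence of the 20 standard residues")
    n = len(seq)

    def mole_percent(aa):
        # Percentage of residues in the sequence that are `aa`.
        return 100.0 * seq.count(aa) / n

    return {
        "length": n,
        "negative_residues": seq.count("D") + seq.count("E"),  # Asp + Glu
        "positive_residues": seq.count("R") + seq.count("K"),  # Arg + Lys
        # Ikai aliphatic index: Ala + 2.9 * Val + 3.9 * (Ile + Leu), in mole percent.
        "aliphatic_index": mole_percent("A") + 2.9 * mole_percent("V")
        + 3.9 * (mole_percent("I") + mole_percent("L")),
        # Grand average of hydropathicity: mean Kyte-Doolittle value per residue.
        "gravy": sum(KD[aa] for aa in seq) / n,
    }
```

Each enzyme and each of its homologs would yield one such row of values; those rows are the input to the dendrogram step.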
Sequence based data
To prepare data for phylogenetic tree analysis, the entire protein sequence of each enzyme was subjected to site-based alignment. Constant sites (C), variable sites (V), and singleton sites (S) were analyzed with MEGA 4.0. The overall similarity of each alignment was then calculated as the feature similarity score (FSS).
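The site categories can be illustrated with a short sketch that counts them over an existing alignment. It assumes the usual definitions (a constant site has one residue type in every sequence; a singleton site is a variable site in which at most one residue type occurs more than once) and, as a simplifying assumption, skips any column containing a gap. The function name classify_sites is hypothetical, this is not MEGA's implementation, and the FSS is not computed because its formula is not stated in this section.

```python
from collections import Counter


def classify_sites(aligned, gap="-"):
    """Count constant, variable and singleton columns in aligned sequences."""
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    counts = {"constant": 0, "variable": 0, "singleton": 0, "skipped": 0}
    for column in zip(*aligned):
        if gap in column:
            # Assumption: columns with gaps are left out of the counts.
            counts["skipped"] += 1
            continue
        freq = Counter(column)
        if len(freq) == 1:
            counts["constant"] += 1
        else:
            counts["variable"] += 1
            # Singleton sites are counted as a subset of variable sites:
            # at most one residue type appears in more than one sequence.
            if sum(1 for c in freq.values() if c > 1) <= 1:
                counts["singleton"] += 1
    return counts
```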
Phylogenetic tree and dendrogram generation
Dendrogram generation from physicochemical data.
The steps used for dendrogram generation from physicochemical parameters are:
Calculate and collect the physicochemical properties of each enzyme and its homologs with the ExPASy tool ProtParam.
Choose a distance measure and a linkage method for clustering the physicochemical data.
Generate the dendrogram with the statistical software Minitab.
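The clustering behind such a dendrogram can be sketched in a few lines. This is not the Minitab procedure: the text does not say which distance measure or linkage method was chosen, so Euclidean distance on standardized properties and average linkage are assumptions made purely for illustration, and the function names are hypothetical.

```python
import math


def standardize(rows):
    """Scale each property to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # A constant column has zero spread; use 1.0 to avoid division by zero.
    sds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
           for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]


def average_linkage(labels, rows):
    """Agglomerative clustering; returns the merges in the order they occur.

    Each merge is (left_labels, right_labels, distance), which is the
    information a dendrogram draws.
    """
    points = standardize(rows)
    clusters = {i: [i] for i in range(len(labels))}  # cluster id -> row indices
    merges = []
    while len(clusters) > 1:
        best = None
        ids = sorted(clusters)
        for pos, a in enumerate(ids):
            for b in ids[pos + 1:]:
                # Average linkage: mean distance over all cross-cluster pairs.
                pairs = [(i, j) for i in clusters[a] for j in clusters[b]]
                d = sum(math.dist(points[i], points[j]) for i, j in pairs) / len(pairs)
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append(([labels[i] for i in clusters[a]],
                       [labels[i] for i in clusters[b]], d))
        clusters[a] = clusters[a] + clusters.pop(b)
    return merges
```

Organisms whose enzyme joins the Taxus enzyme at a small distance are the ones a dendrogram of this kind would place closest to it.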
Tree generation from sequence based data.
For phylogenetic tree construction, one of the most important tasks is method selection, which depends on the nature of the data. The steps adopted for method selection are:
Build the alignment and check the family of sequences.
If the data have high similarity (high FSS), i.e. greater than 75%, use a character-based method (maximum parsimony, MP).
If the data have moderate similarity (medium FSS), i.e. less than 75% and more than 50%, use a distance-based method (neighbor joining, NJ, or UPGMA).
If the data have low similarity (low FSS), i.e. less than 50%, use the maximum likelihood (ML) method.
* The cut-off (%) is set after observing the data.
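The selection rule above amounts to a simple threshold function, sketched here. The function name is hypothetical, and the treatment of the exact boundary values (75% and 50%), which the rule leaves open, is an assumption.

```python
def choose_tree_method(fss_percent):
    """Map a feature similarity score (in percent) to a tree-building method."""
    if not 0 <= fss_percent <= 100:
        raise ValueError("FSS must be a percentage between 0 and 100")
    if fss_percent > 75:
        return "MP"          # high similarity: character-based method
    if fss_percent > 50:
        return "NJ/UPGMA"    # moderate similarity: distance-based method
    # Assumption: exactly 75% falls in the distance-based band above,
    # and exactly 50% falls through to ML here.
    return "ML"              # low similarity: maximum likelihood
```

Because the cut-offs are set after observing the data, they would in practice be parameters rather than the fixed 75 and 50 used in this sketch.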
Results and Discussion
A BLAST similarity search of the enzymes involved in the pathway identifies other sources that contain these enzymes and that can therefore serve as alternative sources for taxol biosynthesis.
Similarly, physicochemical data were collected for the other twelve enzymes, and a dendrogram was generated from these data for each enzyme using the statistical software Minitab.
Of the 13 enzymes, none has a high FSS; those with moderate FSS are
GGPS is found in
Transferases are found in
Hydroxylases are found in
Three groups were identified by comparing the phylogenetic trees with the physicochemical dendrograms: less supporting (Enzymes-2 and 3), moderately supporting (Enzymes-10, 11 and 12), and well supporting (Enzymes-1, 4, 5, 6, 7, 8, 9 and 13), as shown in table-4. One phylogenetic tree and one dendrogram from each group are shown in figure-1, 2 and 3.
In the present work, a comparative phylogenetic analysis of the 13 enzymes involved in the biosynthetic pathway of taxol was conducted to determine alternative sources of taxol. The feature similarity score (FSS) was calculated for the molecular data, whereas distance measures and linkage methods were examined for dendrogram generation. Three groups were identified by comparing the phylogenetic trees with the physicochemical dendrograms: two enzymes are less supporting (Enzymes-2 and 3), three are moderately supporting (Enzymes-10, 11 and 12), and eight are well supporting (Enzymes-1, 4, 5, 6, 7, 8, 9 and 13). Four different organisms,
The present model for finding alternative sources can be extended, with modifications where necessary, to the analysis of other valuable compounds. Such work is in great demand because it not only provides alternative sources of these valuable compounds but also yields useful information about product quality. It will narrow the search area for scientists working on bioprospecting and contribute positively to bioprospecting for new sources of valuable compounds.
We are grateful to the Department of Bioinformatics, MANIT, Bhopal, India, for support and cooperation.