BLAST comparison of closely related Chlamydia genomes

In the BLAST comparisons presented here, each protein in the Chlamydia trachomatis proteome is used as a query against the entire Chlamydia pneumoniae proteome, and vice versa, to find similar proteins in the compared genome. For each protein, the best hit (the most similar match to a protein in the other proteome) is recorded. The best hit is chosen by E-value: the number of matches with an equal or better score that would be expected by chance alone in a search of a database of that size. Thus, an E-value approaching 0 means it is very unlikely that the protein's goodness-of-match to another protein can be attributed to chance. Examples of matches with excellent and marginal E-values are here.
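The best-hit selection described above can be sketched as follows. This is a minimal illustration, not the pipeline used for these pages: it assumes the BLAST results are available as tab-separated lines in the standard 12-column tabular layout (query ID in column 1, subject ID in column 2, E-value in column 11), and the function name and protein identifiers are hypothetical.

```python
from typing import Dict, Iterable, Tuple


def best_hits(rows: Iterable[str]) -> Dict[str, Tuple[str, float]]:
    """Return {query_id: (subject_id, e_value)}, keeping the lowest E-value per query.

    Assumes tab-separated BLAST tabular output with the E-value in column 11.
    On ties, the first hit encountered is kept.
    """
    best: Dict[str, Tuple[str, float]] = {}
    for row in rows:
        row = row.strip()
        if not row or row.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = row.split("\t")
        query, subject, e_value = fields[0], fields[1], float(fields[10])
        if query not in best or e_value < best[query][1]:
            best[query] = (subject, e_value)
    return best


# Hypothetical identifiers and scores, for illustration only.
example_rows = [
    "CT_001\tCPn_0101\t62.5\t200\t75\t0\t1\t200\t1\t200\t1e-80\t250",
    "CT_001\tCPn_0450\t28.0\t150\t108\t2\t5\t150\t10\t160\t0.003\t38.1",
    "CT_002\tCPn_0999\t25.0\t80\t60\t1\t1\t80\t3\t82\t2.1\t27.3",
]
hits = best_hits(example_rows)
for query, (subject, e_value) in hits.items():
    print(query, subject, e_value)
```

Running the same function on the results of the reverse search (C. pneumoniae queries against the C. trachomatis proteome) would give the best hits in the other direction.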

Unique genes

Unique genes have no significant similarity to genes in the compared genome, as determined by the E-value. If the E-value of the best hit is greater than 0.0001, the query protein is considered unique. Comparisons depicting unique genes are significant because they reveal the genes that are likely to be responsible for the biology, virulence, and pathogenicity unique to the bacterium (Kalman et al., Nature Genetics 21:385-389, 1999). In addition, this analysis suggests how closely the two genomes are phylogenetically related; a low proportion of unique genes suggests a close phylogenetic relationship. In rare instances, however, true homologs can produce BLAST E-values greater than 0.0001 and are then classified as unique.
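As a rough sketch of this rule, the snippet below applies the 0.0001 cutoff to a table of best-hit E-values and reports the proportion of unique proteins. The function names and identifiers are hypothetical, and treating a protein with no reported hit at all as unique is our assumption; the text only states the E-value threshold.

```python
from typing import Dict, List, Optional, Tuple

UNIQUE_E_VALUE_CUTOFF = 1e-4  # threshold stated in the text (0.0001)


def is_unique(best_e_value: Optional[float]) -> bool:
    """A protein is unique if its best hit has an E-value above the cutoff.

    Assumption: None means BLAST reported no hit, which is also counted as unique.
    """
    return best_e_value is None or best_e_value > UNIQUE_E_VALUE_CUTOFF


def unique_proteins(best_e_values: Dict[str, Optional[float]]) -> Tuple[List[str], float]:
    """Return the unique protein IDs and the fraction of the proteome they make up."""
    unique = [query for query, e_value in best_e_values.items() if is_unique(e_value)]
    fraction = len(unique) / len(best_e_values) if best_e_values else 0.0
    return unique, fraction


# Hypothetical best-hit E-values, for illustration only.
example = {"CT_001": 1e-80, "CT_002": 2.1, "CT_003": None, "CT_004": 5e-5}
unique_ids, fraction = unique_proteins(example)
print(unique_ids, fraction)
```

A best hit with an E-value of exactly 0.0001 is not counted as unique here, following the "greater than" wording of the rule.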

We have tabulated the unique proteins in each of the two Chlamydia proteomes:

C.trachomatis vs C.trachomatis

C.pneumoniae vs C.pneumoniae

C.trachomatis vs C.pneumoniae

C.pneumoniae vs C.trachomatis

The unique genes of C. trachomatis serovar D and C. pneumoniae CWL029 are also depicted in a Map of Chlamydia unique proteins.

Orthologous genes

W-H Li in his book Molecular Evolution (Sinauer Associates, Inc. Sunderland, Massachusetts) gives a succinct definition of orthologous and paralogous genes: "Two genes are said to be paralogous if they are derived from a duplication event, but orthologous if they are derived from a speciation event." Further details here.

Determining orthology is also significant in assessing the relationship between two genomes. Revealing which orthologous regions are conserved throughout evolution suggests the significance of those regions to the survival of the bacterium (Siefert et al., J Mol Evol 45:467-472, 1997). This type of analysis helps to resolve what sort of changes have occurred in one genome relative to the other throughout evolution, and also suggests the phylogenetic relationship between the bacteria. For example, a high proportion of orthologs suggests a close phylogenetic relationship (Watanabe et al., J Mol Evol 44(Suppl 1):S57-S64, 1996).
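One common heuristic for proposing candidate orthologs from BLAST results is the reciprocal best hit: two proteins are paired when each is the other's best hit. The sketch below illustrates that heuristic only; the text does not state that the tables on this page were built this way, and the function name and identifiers are hypothetical.

```python
from typing import Dict, List, Tuple


def reciprocal_best_hits(
    a_to_b: Dict[str, str], b_to_a: Dict[str, str]
) -> List[Tuple[str, str]]:
    """Pair proteins that are each other's best hit.

    a_to_b maps each protein in proteome A to its best hit in proteome B;
    b_to_a maps each protein in proteome B to its best hit in proteome A.
    """
    pairs = []
    for a_protein, b_protein in a_to_b.items():
        if b_to_a.get(b_protein) == a_protein:
            pairs.append((a_protein, b_protein))
    return sorted(pairs)


# Hypothetical best-hit tables, for illustration only.
ct_to_cpn = {"CT_001": "CPn_0101", "CT_002": "CPn_0202", "CT_003": "CPn_0202"}
cpn_to_ct = {"CPn_0101": "CT_001", "CPn_0202": "CT_003"}

candidate_orthologs = reciprocal_best_hits(ct_to_cpn, cpn_to_ct)
print(candidate_orthologs)
```

In this example CT_002 and CT_003 both hit CPn_0202, but only CT_003 is paired because CPn_0202's own best hit is CT_003. Reciprocal best hits give candidates only; sequence similarity alone cannot always distinguish an ortholog from a paralog.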

We have tabulated the proteins common to the two Chlamydia proteomes:

C.trachomatis vs C.trachomatis

C.pneumoniae vs C.pneumoniae

C.trachomatis vs C.pneumoniae

View a map of orthologous proteins in the C. pneumoniae CWL029 and C. trachomatis serovar D proteomes.

Distribution of unique and orthologous (common) genes based on the COG functional categories

Map of highly conserved regions

Several regions of bacterial genomes are highly conserved. Orthologous tRNA and rRNA genes and the S10 region serve as anchors for comparing the relative positions of genes in the genomes (Siefert et al., J Mol Evol 45:467-472, 1997). Plotting these regions helps to resolve what events may have occurred to give rise to the current positioning of orthologs in one genome relative to the other.

Link to tRNA, rRNA, and S10 map