BLAST comparison of closely related Haemophilus genomes

In the BLAST comparisons presented here, each protein in the Haemophilus ducreyi proteome is used as a query against the entire Haemophilus influenzae proteome, and vice versa, to find similar proteins in the compared genome. For each protein, the best hit, that is, the most similar match to a protein in the other proteome, is recorded. The "best hit" is determined by the E-value, which is the number of matches with an equal or better score that would be expected by chance alone in a search of that size. Thus, an E-value approaching 0 means it is very unlikely that the protein's goodness-of-match to another protein can be attributed to chance. Examples of matches with excellent and marginal E-values are here.
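The best-hit selection described above can be sketched as follows. This is a minimal illustration, not the pipeline used for this page; it assumes the BLAST results have already been parsed into (query, subject, E-value) tuples, and the protein identifiers in the example are hypothetical.

```python
from typing import Dict, Iterable, Tuple

# One parsed BLAST hit: (query_id, subject_id, e_value).
# This tuple layout is an assumption made for the sketch.
Hit = Tuple[str, str, float]


def best_hits(hits: Iterable[Hit]) -> Dict[str, Tuple[str, float]]:
    """Return, for each query, the subject with the lowest E-value."""
    best: Dict[str, Tuple[str, float]] = {}
    for query, subject, evalue in hits:
        # Keep the hit only if it is the first seen for this query
        # or has a strictly lower (better) E-value than the current best.
        if query not in best or evalue < best[query][1]:
            best[query] = (subject, evalue)
    return best


# Hypothetical identifiers, for illustration only.
example_hits = [
    ("HD0001", "HI0010", 1e-50),
    ("HD0001", "HI0200", 3e-5),
    ("HD0002", "HI0300", 0.5),
]
best = best_hits(example_hits)
```

Running the same function on the hits from the reverse search (H. influenzae queries against H. ducreyi) gives the best hits in the other direction.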

Unique genes

Unique genes have no significant similarity to genes in the compared genome, as determined by the E-value. If the E-value of the best hit is greater than 0.0001, the query protein is considered unique. Comparisons depicting unique genes are significant because they reveal the genes that are likely to be responsible for the biology, virulence, and pathogenicity unique to the bacterium (Kalman et al., Nature Genetics 21:385-389, 1999). In addition, this analysis suggests how closely the two genomes are phylogenetically related; a low proportion of unique genes suggests a close phylogenetic relationship. In rare instances, however, true homologs can produce BLAST E-values greater than 0.0001, in which case the query protein is classified as unique even though a homolog exists.
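As a rough sketch of the rule above, a protein can be flagged as unique when its best-hit E-value exceeds 0.0001. The sketch assumes that a protein with no reported hit at all is also treated as unique (the text does not say this explicitly), and the identifiers are hypothetical.

```python
from typing import Dict, Iterable, List, Optional

E_VALUE_CUTOFF = 1e-4  # threshold stated in the text (0.0001)


def is_unique(best_evalue: Optional[float]) -> bool:
    """True if the best hit is not significant.

    Assumption: a protein with no hit (None) counts as unique.
    """
    return best_evalue is None or best_evalue > E_VALUE_CUTOFF


def unique_proteins(proteome: Iterable[str],
                    best_evalues: Dict[str, float]) -> List[str]:
    """List the proteins in `proteome` whose best hit fails the cutoff."""
    return [p for p in proteome if is_unique(best_evalues.get(p))]


# Hypothetical identifiers and E-values, for illustration only.
proteome = ["HD0001", "HD0002", "HD0003"]
best_evalues = {"HD0001": 1e-50, "HD0002": 0.5}  # HD0003 had no hit

unique = unique_proteins(proteome, best_evalues)
fraction_unique = len(unique) / len(proteome)
```

The fraction of unique proteins computed this way is the quantity the text uses as a rough indicator of phylogenetic distance.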

We have tabulated the unique proteins in each of the two Haemophilus proteomes:

Orthologous genes

W-H Li in his book Molecular Evolution (Sinauer Associates, Inc. Sunderland, Massachusetts) gives a succinct definition of orthologous and paralogous genes: "Two genes are said to be paralogous if they are derived from a duplication event, but orthologous if they are derived from a speciation event." Further details here.

Determining orthology is also significant in assessing the relationship between two genomes. Revealing which orthologous regions are conserved throughout evolution suggests the significance of those regions to the survival of the bacterium (Siefert et al., J Mol Evol 45:467-472, 1997). This type of analysis helps to resolve what sort of changes have occurred in one genome relative to the other throughout evolution, and also suggests the phylogenetic relationship between the bacteria. For example, a high proportion of orthologs suggests a close phylogenetic relationship (Watanabe et al., J Mol Evol 44(Suppl 1):S57-S64, 1996).
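This page does not state how orthologs were identified. One commonly used heuristic, assumed here purely for illustration, is the reciprocal best hit: two proteins are paired when each is the other's best hit in the opposite search. The sketch below takes two best-hit maps that are assumed to have been filtered at the E-value cutoff already; the identifiers are hypothetical.

```python
from typing import Dict, List, Tuple


def reciprocal_best_hits(a_to_b: Dict[str, str],
                         b_to_a: Dict[str, str]) -> List[Tuple[str, str]]:
    """Pair proteins that are each other's best hit.

    a_to_b maps each protein in proteome A to its best hit in B;
    b_to_a maps each protein in proteome B to its best hit in A.
    """
    pairs = []
    for a, b in a_to_b.items():
        # Keep the pair only if the reverse search points back to `a`.
        if b_to_a.get(b) == a:
            pairs.append((a, b))
    return sorted(pairs)


# Hypothetical identifiers, for illustration only.
ducreyi_to_influenzae = {"HD0001": "HI0010", "HD0002": "HI0300"}
influenzae_to_ducreyi = {"HI0010": "HD0001", "HI0300": "HD0005"}

putative_orthologs = reciprocal_best_hits(ducreyi_to_influenzae,
                                          influenzae_to_ducreyi)
```

Reciprocal best hits are only a proxy for orthology; paralogs arising from duplications can still confound the pairing.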

We have tabulated the proteins common to the two Haemophilus proteomes:

View a map of orthologous proteins in the Haemophilus ducreyi and Haemophilus influenzae proteomes.

Distribution of unique and orthologous (common) genes based on the COG functional categories