Human Herpesvirus 2 (HSV-2)
INTRODUCTION
Human Herpesvirus 2 is the causative agent of genital herpes and disseminated neonatal herpes.
Less frequently, HSV-2 can be isolated from labial, facial and ocular lesions. The complete genome sequence was published in 1998: "The Genome Sequence of Herpes Simplex Virus Type 2" by Dolan, Jamieson, Cunningham, Barnett, and McGeoch, J Virol, March 1998, Vol. 72, No. 3, p. 2010-2021.
GENOME STRUCTURE AND COORDINATE SYSTEM
The linear double-stranded DNA genome is 154,746 bp long. HSV-1 and HSV-2 share the same overall structure and show very high similarity in most genes. Both have two unique regions, named UL and US, flanked and defined by three sets of inverted repeats, named LTRa, LTRb and LTRc.
The arrangement is shown below.
The UL and US regions are found in both forward and inverted orientations, and wild-type HSV occurs equally in all four possible arrangements (isomers) of these two regions.
The coordinate system is defined to begin at the first nucleotide of the shortest terminal repeat, LTRa, followed by LTRb, and the long unique region, UL, as shown in the diagram. The number of copies of the LTRa repeat may vary.
There are two origins of replication: OriL, located within the UL region and centered at 62,929, and OriS, located within the repeats LTRc and LTRc' and centered at 132,760 and 148,981 respectively.
HSV-2 TO HSV-1 GENOME COMPARISON
The BugSpray comparison below shows genome alignment based on similar genes and dramatically illustrates the similarity of gene arrangement between the HSV-1 and HSV-2 genomes.
BugSpray draws lines from genes on the top genome, in this case HSV-2, to the gene with the best BLAST hit in the bottom genome, here HSV-1.
Green lines indicate pairs of genes that code on the same strand and red lines indicate gene pairs that code on the opposite strand.
Clearly there are no rearrangements or reordering of genes between these two viruses.
The three red lines show best hits of gamma 34.5, alpha4 and alpha0 to the copy in the inverted repeat, because BugSpray arbitrarily selects only one best hit if there are multiple hits with the same weight.
Only Orf-P does not have a BLASTp hit to a gene in the HSV-1 genome.
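The best-hit and strand-colouring rule described above can be sketched as follows. This is only an illustration and not BugSpray's actual code: the `Hit` record, the gene names, and the scores are made up, and the tie-breaking rule (first hit seen wins) is an assumption standing in for the "arbitrary" choice mentioned in the text.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    query: str           # gene on the top genome
    subject: str         # gene on the bottom genome
    bit_score: float
    query_strand: str    # "+" or "-"
    subject_strand: str  # "+" or "-"


def best_hits(hits):
    """Keep one best-scoring hit per query gene; on a tie, keep the first seen."""
    best = {}
    for h in hits:
        current = best.get(h.query)
        if current is None or h.bit_score > current.bit_score:
            best[h.query] = h
    return best


def line_colour(hit):
    """Green if both genes code on the same strand, red otherwise."""
    return "green" if hit.query_strand == hit.subject_strand else "red"


# Hypothetical hits: geneB matches two equally good copies of a repeated gene.
hits = [
    Hit("geneA", "geneA_h", 900.0, "+", "+"),
    Hit("geneB", "geneB_h_copy1", 500.0, "+", "-"),
    Hit("geneB", "geneB_h_copy2", 500.0, "+", "+"),
    Hit("geneA", "geneX_h", 80.0, "+", "-"),
]
colours = {query: line_colour(h) for query, h in best_hits(hits).items()}
print(colours)
```

In this toy input, geneB is drawn in red only because the tie happens to be resolved in favour of the copy on the opposite strand, which mirrors the explanation given for the three red lines.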
BUGSPRAY GENOME ALIGNMENT
CODON USAGE
Codon usage as determined by CodonW, written by John Peden (pdxjfp@molbiol.ox.ac.uk) while in the laboratory of Paul Sharp at the University of Nottingham:
AA  Cod    N RSCU   AA  Cod    N RSCU   AA  Cod    N RSCU   AA  Cod    N RSCU
Phe UUU  636 0.90   Ser UCU  159 0.36   Tyr UAU  187 0.36   Cys UGU  186 0.50
Leu UUA   41 0.06   Ser UCA   58 0.13   TER UAA   25 0.90   TER UGA   32 1.16
Leu UUG  261 0.38   Ser UCG  737 1.66   TER UAG   26 0.94   Trp UGG  500 1.00
Leu CUU  241 0.35   Pro CCU  187 0.19   His CAU  174 0.30   Arg CGU  210 0.32
Leu CUC 1098 1.59   Pro CCC 2064 2.08   His CAC  975 1.70   Arg CGC 1974 3.01
Leu CUA  162 0.24   Pro CCA  204 0.21   Gln CAA  172 0.27   Arg CGA  269 0.41
Leu CUG 2332 3.38   Pro CCG 1513 1.53   Gln CAG 1086 1.73   Arg CGG 1199 1.83
Ile AUU  158 0.42   Thr ACU   66 0.11   Asn AAU   93 0.21   Ser AGU  104 0.23
Ile AUC  872 2.31   Thr ACC 1159 1.98   Asn AAC  775 1.79   Ser AGC  730 1.65
Ile AUA  104 0.28   Thr ACA  113 0.19   Lys AAA  169 0.49   Arg AGA   74 0.11
Met AUG  693 1.00   Thr ACG 1000 1.71   Lys AAG  524 1.51   Arg AGG  210 0.32
Val GUU  281 0.37   Ala GCU  215 0.14   Asp GAU  339 0.29   Gly GGU  237 0.27
Val GUC 1182 1.55   Ala GCC 3196 2.10   Asp GAC 1976 1.71   Gly GGC 1567 1.76
Val GUA  150 0.20   Ala GCA  222 0.15   Glu GAA  391 0.37   Gly GGA  324 0.36
Val GUG 1441 1.89   Ala GCG 2441 1.61   Glu GAG 1724 1.63   Gly GGG 1443 1.62
43738 codons in 83 genes (used Universal Genetic code)
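The second number next to each codon count appears to be the relative synonymous codon usage (RSCU) that CodonW reports: the observed count divided by the count expected if all synonymous codons for that amino acid were used equally. A minimal sketch of that calculation, using the Ala and Ile counts from the table (the function name `rscu` is only illustrative):

```python
def rscu(counts):
    """RSCU = observed count / mean count over the synonymous codon family."""
    total = sum(counts.values())
    family_size = len(counts)
    return {codon: n * family_size / total for codon, n in counts.items()}


# Counts copied from the codon usage table for two amino acids.
ala = {"GCU": 215, "GCC": 3196, "GCA": 222, "GCG": 2441}
ile = {"AUU": 158, "AUC": 872, "AUA": 104}

for family in (ala, ile):
    for codon, value in rscu(family).items():
        print(codon, f"{value:.2f}")
```

Rounded to two decimals, these values agree with the table entries for the same codons (for example 2.10 for GCC and 2.31 for AUC), which supports reading the column as RSCU.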
G+C CONTENT
The G+C content of the HSV-2 genome is 70.4%, compared with 68.3% for HSV-1.
G+C content for each ORF is listed here and graphed below.
The three genes with the highest G+C content (>2 sigma) all lie within the long terminal repeats: gamma34.5 (81.7%), alpha4 (81.5%) and Orf-P (within gamma34.5, at 80.3%).
Genes with the lowest G+C content include UL1/gL (60.7%), UL55 (61.1%), UL12 (62.4%), UL20 (63.4%), UL5 (63.6%) and UL3 (63.7%).
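As a rough sketch of how per-ORF G+C content and a ">2 sigma" cut-off could be computed: the sequence and the per-ORF values below are made up for illustration, and the use of the population standard deviation is an assumption, since the page does not say how sigma was calculated.

```python
from statistics import mean, pstdev


def gc_percent(seq):
    """Percentage of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / len(seq)


def high_outliers(values, n_sigma=2.0):
    """Names whose value exceeds the mean by more than n_sigma standard deviations."""
    mu = mean(values.values())
    sigma = pstdev(values.values())  # assumption: population standard deviation
    return [name for name, v in values.items() if v > mu + n_sigma * sigma]


# Toy sequence, not taken from the HSV-2 genome.
print(round(gc_percent("ATGGCGCCCTGA"), 1))

# Hypothetical per-ORF G+C percentages.
example = {
    "orf1": 66, "orf2": 67, "orf3": 68, "orf4": 69, "orf5": 70,
    "orf6": 68, "orf7": 67, "orf8": 69, "orf9": 68, "orfX": 82,
}
print(high_outliers(example))
```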
GENERAL FEATURES OF THE PROTEOME
Current analysis of this genome shows 79 distinct genes, 4 of which have two copies, for a total of 83 genes. Three of these 79 genes have intron(s): gamma34.5, alpha0 and UL15. Interestingly, UL16 and UL17 code within the intron of UL15. Nine of the genes (11%) have been assigned an EC number, while ten (13%) have unknown function.
Three pairs of genes are known to be coterminal and code in frame: UL8.5 and UL9; UL15 and UL15.5; and UL26 and UL26.5.
Three pairs of genes are also known to overlap completely while coding on opposite strands: gamma34.5 and Orf-P; UL27.5 and UL27; and UL43 and UL43.5.
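The two relationships above can be expressed as simple coordinate checks. The sketch below is an illustration under stated assumptions: the `Gene` record and all coordinates are hypothetical, splicing is ignored, and "overlap completely" is interpreted as one gene lying entirely within the other.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    name: str
    start: int   # leftmost coordinate, 1-based inclusive
    end: int     # rightmost coordinate, 1-based inclusive
    strand: str  # "+" or "-"


def coterminal_in_frame(a, b):
    """Same strand, same stop position, and starts a multiple of 3 apart."""
    if a.strand != b.strand:
        return False
    if a.strand == "+":
        same_stop = a.end == b.end
        start_gap = a.start - b.start
    else:
        # On the minus strand the stop codon sits at the leftmost coordinate.
        same_stop = a.start == b.start
        start_gap = a.end - b.end
    return same_stop and start_gap != 0 and start_gap % 3 == 0


def nested_opposite_strand(a, b):
    """One gene lies entirely inside the other, on the opposite strand."""
    a_in_b = a.start >= b.start and a.end <= b.end
    b_in_a = b.start >= a.start and b.end <= a.end
    return a.strand != b.strand and (a_in_b or b_in_a)


# Hypothetical coordinates, not taken from the HSV-2 annotation.
long_gene = Gene("long", 1000, 2499, "+")
short_gene = Gene("short", 1600, 2499, "+")
anti_gene = Gene("anti", 1200, 1799, "-")

print(coterminal_in_frame(long_gene, short_gene))
print(nested_opposite_strand(long_gene, anti_gene))
```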