Department of Anthropology, Harvard University, Cambridge, Massachusetts
Abstract
Introduction
CG is a glycoprotein hormone expressed in the human placenta which acts as a signal to the maternal physiology to establish pregnancy. Specifically, the binding of CG molecules to LH/CG receptors on the corpus luteum prevents regression of the corpus luteum at menstruation and stimulates continued progesterone production which maintains the uterine lining in a specialized state receptive to implantation and placental development. CG is a member of a larger family of glycoprotein hormones which includes luteinizing hormone (LH), follicle stimulating hormone (FSH), and thyroid stimulating hormone (TSH). Each of these hormones is composed of two protein subunits. The alpha subunit (here labeled GPH) is shared by all four glycoprotein hormones, whereas each of the four hormones has a unique beta subunit which confers biological specificity. Of the four glycoprotein hormones, only CG is expressed in the placenta; the other three are expressed in the anterior pituitary gland.
The first nucleotide sequence of a human gene encoding the beta subunit of CG (CGß) suggested that CGß evolved from a duplicate copy of the beta subunit of the related glycoprotein hormone LH (Fiddes and Goodman 1980). Subsequent nuclear mapping has shown that humans possess six copies of the CGß gene, found together with the single copy of the human LH beta subunit (LHß) gene on chromosome 19q13.33 (Policastro et al. 1986; Graham et al. 1987). Human CGß and LHß genes share a high degree of sequence similarity (94%), and previous analyses suggest that CGß genes may be evolving under a regime of positive selection (Talmadge, Vamvakopoulos, and Fiddes 1984).
From a recent phylogenetic analysis of genes from the entire glycoprotein hormone family, Li and Ford (1998) have proposed that the CGß gene first arose around 94 MYA. If CGß were indeed that old, it would predate the origin of eutherian mammals and should therefore be widespread in living mammalian taxa. Yet, genomic analyses have clearly shown that CGß genes do not exist in rats (Jameson et al. 1984; Tepper and Roberts 1984; Carr and Chin 1985), mice (Kumar and Matzuk 1995), cows (Virgin et al. 1985), pigs (Ezashi et al. 1990), sheep (Brown et al. 1993), or rhinoceri (Lund and Sherman 1998). Within Primates, biological and immunological assays have found CG in every species tested, although tests have been almost exclusively limited to anthropoid primates (Tullner 1974; Hobson and Wide 1981). The only exception is a report of CG from the term placenta of a lemur (Hobson and Wide 1981), but the amount of CG reported is very low, and the study reported no negative controls, so this result may represent a spurious nonspecific immunological cross-reaction. Thus, the origin of the gene encoding the CGß subunit in primates is unknown, but it would appear to fall between the common ancestor of eutherian mammals and the common ancestor of anthropoid primates.
There are a minimum of four molecular evolutionary events that must have occurred in order to evolve a functional CGß gene (as found in humans) from an LHß gene ancestor. These include (in no specific order) (1) the original duplication of the ancestral LHß gene, (2) a frameshift mutation in the third exon of the duplicated gene, (3) expression gain of the duplicated gene (or its prototype) in the placenta and expression reduction in the pituitary, and (4) expression gain of the GPH gene in the placenta with expression retention in the pituitary (for use in LH, FSH, and TSH). Here we report on experiments designed to determine when these molecular changes occurred.
Materials and Methods
Southern Blotting
The genomes of 11 extant primates and 3 related mammalian species were surveyed by Southern blotting to determine the number of CGß plus LHß genes present in each species. All restriction enzymes and reaction buffers were from Life Technologies (Gaithersburg, Md). Eleven micrograms of digested DNA was electrophoresed in a 0.6% agarose gel, transferred onto nylon membranes using the Turboblotter kit (Schleicher & Schuell, Keene, NH), and immobilized by UV cross-linking according to the manufacturer's protocol.
Hybridization probes radiolabeled with 32P-α-dCTP (NEN Life Sciences) were synthesized using the NEBlot random-primed labeling kit (New England Biolabs, Beverly, Mass.). Probes were synthesized from five different templates: Homo CGß5, Aotus CGß, Tarsius LHß, Galago LHß, and Cynocephalus LHß. Template DNA for the labeling reaction was prepared by polymerase chain reaction (PCR) amplification (see later). Hybridization was carried out using PerfectHyb Plus Buffer (Sigma Chemical Co.), supplemented with 100 µg/ml yeast tRNA. Washed membranes were exposed to X-ray film for 2–14 days. Autoradiographs were scanned, and the program NIH Image (Rasband and Bright 1995) was used for densitometric analyses of hybrid-band intensity.
PCR, Cloning, and DNA Sequencing
LHß and CGß genes were amplified using either the Taq Mastermix kit (Qiagen Inc., Valencia, Calif.) or the Platinum Pfx Polymerase kit (Life Technologies, Gaithersburg, Md); the latter is preferable for cloning because of its lower error rate. PCR reactions were run in an Eppendorf MasterCycler Gradient thermocycler. Reactions used one of two different upstream primers (M69F or P06F, table 1) and one of two different downstream primers (1096R or 1105R). With primer M69F, the expected product was only weakly amplified, so the product was gel-purified and reamplified using primer P06F in the place of M69F. The proximal promoter of the GPH gene was amplified using the Qiagen Taq Mastermix kit with primers CGA-M205F and CGA-P63R, which were designed to match regions conserved among human, rat, mouse, cow, and horse (Steger et al. 1991). PCR products were sequenced directly using the same primers. PCR was also used in some species to amplify the noncoding space between adjacent CGß-LHß gene copies in order to test for the number of gene copies. For these reactions, the Failsafe PCR kit (Epicentre Technologies, Inc.) was used with primers 1080F and P25R; buffer G was found to be the optimal reaction buffer.
Sequencing reactions used the Prism DyeTerminator Cycle Sequencing kit (Applied Biosystems, Inc.). All clones were sequenced in both directions, starting with the vector primers and then using four to six additional internal primers (2–3 in each direction) designed to match the sequence differences found in each species' clones. The complete list of universal and species-specific primers used in sequencing is given in table 1. This sequencing strategy ensured that every base was covered by at least two sequencing reactions, and most were covered by four.
DNA Sequence Analysis
The full-length nucleotide sequence for each clone was assembled from the individual sequencing reactions using the program AutoAssembler (Applied Biosystems, Inc.). Clones were then aligned to each other using the program Clustal X (Thompson et al. 1997), first for all of the clones from one species only, then later to align sequences from different species. Treating LHß and CGß clones separately, all the clones from a single species were first aligned together, and any single nucleotide variants that occurred uniquely in single clones were assumed to be PCR errors and changed to match the consensus nucleotide found in every other clone at that site. The same was done for uniquely occurring clones which appeared to be PCR mosaics (artifactual recombinant clone sequences composed of pieces amplified from two different CGß gene copies). In cases where PCR recombination was suspected to occur uniquely in a single clone, the mosaic clone was removed from consideration. Duplicate clones were then removed from the data set until a minimum number of clones remained which represented the total amount of nonunique sequence diversity found in the clones of a given species. Although this strategy to remove PCR errors and PCR mosaics may also remove some allelic variation from the data set, we were more interested in retaining sequences from each different locus in a species, rather than all polymorphic variants. Alignments were then inspected by eye and adjusted where needed. Phylogenetic analyses on the sequences were performed using PAUP* (Swofford 1998).
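The clone-cleaning rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure actually run for the study: the function names and the toy clones are hypothetical, the clones are assumed to be pre-aligned strings of equal length, and mosaic detection is not attempted.

```python
from collections import Counter


def correct_singletons(clones):
    """Replace variants unique to one clone with the base shared by all others.

    `clones` is a list of aligned, equal-length sequences from one species
    (hypothetical input format). A variant is treated as a PCR error only
    when exactly one clone carries it and every other clone agrees.
    """
    if len({len(c) for c in clones}) != 1:
        raise ValueError("clones must be aligned to the same length")
    corrected = [list(c) for c in clones]
    for site in range(len(clones[0])):
        column = [c[site] for c in clones]
        counts = Counter(column)
        consensus, consensus_count = counts.most_common(1)[0]
        # Need one dissenting clone and at least two agreeing clones.
        if len(counts) != 2 or consensus_count != len(clones) - 1 or consensus_count < 2:
            continue
        for i, base in enumerate(column):
            if base != consensus:
                corrected[i][site] = consensus
    return ["".join(c) for c in corrected]


def remove_duplicates(clones):
    """Keep one copy of each distinct sequence, preserving first-seen order."""
    return list(dict.fromkeys(clones))


# Toy example: the T at site 2 of the third clone is a singleton and is
# corrected; the A at the last site occurs in two clones and is retained.
toy = ["ACGT", "ACGT", "ATGT", "ACGA", "ACGA"]
cleaned = remove_duplicates(correct_singletons(toy))
```

As the text notes, a rule of this kind may also discard genuine rare alleles, which is acceptable when the aim is to keep one sequence per locus.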
Maximum likelihood analyses of substitution rates were performed using the program codeml in the PAML software package (Yang 2000), which uses a codon-based substitution model (Goldman and Yang 1994). The following parameter settings were used: codon frequencies were estimated from the average nucleotide frequency at each codon position; the transition-transversion ratio was estimated from the data; rates were assumed to be equal for all sites (no gamma correction); the correlation coefficient was assumed to be zero; and a molecular clock was not assumed. The user-input tree for each analysis was previously determined by maximum likelihood search using PAUP*; these trees always agreed with well-established phylogenetic relationships of primates.
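For readers who want to set up a comparable run, the settings listed above map roughly onto a codeml control file as sketched below. This is an assumption-laden reconstruction, not the authors' actual control file: the option names follow the codeml control-file format as commonly documented for PAML, the file names are hypothetical, and the starting values for kappa and omega are arbitrary.

```python
def codeml_control(seqfile, treefile, outfile, model):
    """Build codeml control-file text for the settings described in the text.

    model: 0 = one ratio for the whole tree, 1 = free-ratio (one ratio per
    branch), 2 = two or more ratios (branches must be labeled in the tree).
    File names are placeholders supplied by the caller.
    """
    if model not in (0, 1, 2):
        raise ValueError("model must be 0, 1, or 2")
    settings = {
        "seqfile": seqfile,
        "treefile": treefile,
        "outfile": outfile,
        "runmode": 0,      # user-supplied tree
        "seqtype": 1,      # codon sequences
        "CodonFreq": 2,    # frequencies from nucleotide frequencies at each codon position
        "clock": 0,        # no molecular clock
        "model": model,    # branch model being tested
        "NSsites": 0,      # same ratio at all sites
        "fix_kappa": 0,    # estimate the transition-transversion ratio
        "kappa": 2,        # arbitrary starting value
        "fix_omega": 0,    # estimate the rate ratio
        "omega": 0.4,      # arbitrary starting value
        "fix_alpha": 1,    # alpha fixed at 0: equal rates, no gamma correction
        "alpha": 0,
        "fix_rho": 1,      # correlation coefficient fixed at zero
        "rho": 0,
    }
    return "\n".join(f"{key:>10} = {value}" for key, value in settings.items()) + "\n"


# Hypothetical file names for a free-ratio run.
control_text = codeml_control("cgb.phy", "cgb.tre", "cgb_free.out", model=1)
```

Running the one-ratio, free-ratio, and branch-labeled versions on the same tree gives the nested log-likelihoods that the likelihood ratio tests compare.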
Statistical tests for gene conversion were performed using the program GENECONV 1.81 (Sawyer 2000). Only the sequences from species for which multiple sequences were cloned (human, orangutan, rhesus, leaf monkey, and guereza) were input as data. By default, the program uses all polymorphic sites in an alignment in scoring the likelihood of conversion between two sequences in a given stretch of DNA. As the sequences in this alignment come from different species, some polymorphic sites vary between species but not within species; using the between-species polymorphic sites would artificially increase the likelihood of finding proposed conversion events between sequences within a species. To avoid this problem, all the sequences from each species were defined as a group for the sake of the analysis, which limits the program to using only sites which are polymorphic within a group (species) when testing for conversion events between sequences of the same group. Analyses were repeated where mismatches were either not allowed, or allowed but given a relative penalty of 1, 2, or 5. These four settings tended to produce similar numbers of significant fragments (likely gene conversions), but they differed in the estimated lengths of the converted fragments.
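The effect of grouping sequences by species can be seen in a small sketch. The code below is not GENECONV and does not reproduce its scoring; it only contrasts the sites that are polymorphic across a whole alignment with those polymorphic within each group. The species labels and toy sequences are hypothetical.

```python
def polymorphic_sites(seqs):
    """Return 0-based alignment columns at which the sequences differ."""
    return [i for i in range(len(seqs[0])) if len({s[i] for s in seqs}) > 1]


def within_group_sites(groups):
    """Map each group (species) to the sites polymorphic among its own sequences."""
    return {name: polymorphic_sites(seqs) for name, seqs in groups.items()}


# Toy alignment: site 0 differs only between species, site 3 only within human.
groups = {
    "human": ["ACGT", "ACGA"],
    "rhesus": ["TCGT", "TCGT"],
}
all_seqs = [s for seqs in groups.values() for s in seqs]
pooled = polymorphic_sites(all_seqs)      # includes the between-species site
per_species = within_group_sites(groups)  # excludes it
```

In the toy data the pooled set contains a site that carries no information about exchanges among the human sequences, which is the inflation the grouping step is meant to avoid.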
Results and Discussion
Tarsier has two hybridizing fragments (fig. 1A, lane 5). The DNA sequence of the tarsier gene has a single BamHI cut site 222 nucleotides (nt) from the 3' end of the gene, thus it would appear that tarsier has just one hybridizing gene sequence, represented by two fragments in the KpnI-BamHI double digest. To confirm that tarsier does not have any CGß genes and has only one LHß gene, a more detailed genomic analysis was conducted using an array of enzyme digests and using tarsier LHß as the probe (fig. 1B). All of the digests produce either one or two hybridizing fragments, consistent with the number of enzyme cut sites predicted from the tarsier DNA sequence. Thus, tarsier has just one cross-hybridizing sequence in its genome, and DNA sequencing indicates it is an LHß gene, not a CGß gene.
For all the strepsirrhine primate species analyzed (lemur, aye-aye, loris, and bushbaby), a single gene copy hybridized to the Galago LHß probe (fig. 1A, lanes 6–9) and to a human CGß probe (not shown). To further confirm these results, a more detailed genomic analysis was conducted for loris. Five different enzyme digests all produce just one hybridizing fragment (fig. 1C), indicating that there is just one gene copy present in the loris genome. DNA sequence analysis shows that the single gene in all of these strepsirrhine species is an LHß gene. Therefore, all of the strepsirrhine primates lack CGß genes. Both Southern blotting and DNA sequencing also found a single LHß gene and no CGß genes for the flying fox (fig. 1A, lane 10), the vampire bat (not shown), and the colugo (not shown).
The New World monkey Aotus has from one to five hybridizing copies, depending on the hybridization probe used (fig. 1D). A human CGß probe only hybridizes to one fragment (fig. 1D, lane 2), whereas a Galago LHß probe hybridizes with 3–4 fragments (fig. 1D, lane 3), and with an Aotus CGß probe (self-hybridization), five dark bands hybridize, as well as five or more lightly hybridizing bands (fig. 1D, lane 4). A single KpnI cut site is predicted from the Aotus CGß DNA sequence (presented below) at 100 bp downstream from the start of the CGß coding sequence, suggesting a single Aotus gene should be represented by two fragments, one about 10 times darker than the other. One way to interpret these results is that Aotus has five LHß-like genes in its genome, some of which are recently evolved pseudogenes of either LHß or CGß. This would explain why probes of different phylogenetic distance from Aotus gave different hybridization patterns. Presumably, at least one of the five genes in Aotus is a functioning LHß gene, but from the blot results alone, it is not clear if any of the other four genes are CGß and, if so, whether any are functional.
In Callicebus, probes from both human (fig. 1D, lane 6) and Aotus (fig. 1D, lane 7) CGß hybridize to three fragments. The Callicebus CGß sequence predicts a KpnI site at 82 bp from the 5' end of the PCR fragment, as well as a BamHI site 910 bp from the 5' end. This indicates that in Callicebus three fragments (one of which would be 800 bp and the other two of unknown length) represent a single gene. The Southern blot results are thus consistent with the presence of only one gene copy in Callicebus, yet PCR amplification and cloning finds only CGß-like sequences for both Callicebus and Aotus.
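The fragment arithmetic behind this inference is simple enough to check directly. The sketch below is illustrative only; the function names are hypothetical, and it assumes that the two cut positions quoted above are the only KpnI and BamHI sites within the probed region.

```python
def internal_fragments(cut_positions):
    """Lengths (bp) of fragments bounded on both sides by cuts inside the gene."""
    cuts = sorted(cut_positions)
    return [right - left for left, right in zip(cuts, cuts[1:])]


def fragments_per_gene(cut_positions):
    """Number of hybridizing fragments one gene copy yields in the digest.

    Each internal cut splits the probed region once, so n cuts give n + 1
    fragments; the two outermost extend into flanking DNA of unknown length.
    """
    return len(cut_positions) + 1


callicebus_cuts = [82, 910]  # KpnI and BamHI positions from the 5' end, as quoted above
inner = internal_fragments(callicebus_cuts)
n_fragments = fragments_per_gene(callicebus_cuts)
```

The internal fragment comes out at 828 bp, in line with the roughly 800-bp band described, and one gene copy accounts for all three hybridizing fragments.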
In order to further characterize the genomes of these New World monkey species, a different experimental approach was used to estimate the number of tandemly arrayed gene copies present. PCR amplification was performed with primers designed to amplify between gene copies if and only if more than one gene is present. In humans, seven linked genes are present; so six PCR products are expected. These six fragments are predicted to form three pairs of similarly sized intergenic spaces based on the previously determined map of the cluster (Policastro et al. 1986). Three amplified products of the expected sizes are clearly obtained using human DNA (fig. 1E, lane 1). Tarsier, with only one gene, should not amplify any product, and none is seen (fig. 1E, lane 4). Both Aotus (lane 2) and Callicebus (lane 3) amplify a single band; from this we can infer that these New World monkeys both have at least two linked LHß-CGß genes; they may possibly have more if the additional genes are either (1) spaced exactly the same distance apart as the first two or (2) too far away to allow PCR amplification between them (the upper limit on amplifiable fragments is around 20 kb; Epicentre Technologies technical staff, personal communication). Both the Aotus and Callicebus sequences found here from cloned PCR amplifications have a fairly low degree of sequence divergence (7.9% and 9.4%, respectively) from the previously published CGß cDNA sequence for marmoset (Callithrix jacchus, Simula et al. 1995), and there are no nonsense mutations in the Aotus and Callicebus CGß sequences to suggest that either is a pseudogene. Comparisons among our Aotus CGß sequences find zero sequence variation in 30 clones from 3 amplifications using different primers, all of which bind in coding regions. As a further measure, all three New World monkey CGß sequences are roughly equidistant from human CGß (19.4%–20.4%) and from tarsier LHß (23.1%–25.0%). These results suggest that both Aotus and Callicebus CGß sequences generated here are likely to be functional genes.
To summarize, all of the anthropoid primate species tested here show evidence of one or more CGß genes in their genome. The tarsier, all of the strepsirrhine primates, and the three nonprimate outgroup species have all been shown to have exactly one gene copy (later shown by sequencing to be LHß genes). Therefore, the most parsimonious reconstruction places the first gene duplication event in the common ancestor of the anthropoid primates, after the divergence of the tarsier lineage (between nodes D and C, fig. 2). From fossil and molecular phylogenetic studies, this places the origin of the CGß gene between 50 and 34 MYA (Bailey et al. 1991). This date is considerably more recent than the 94-MYA origin date proposed previously from calibrated molecular clock calculations (Li and Ford 1998). If we take the average DNA sequence divergence between the introns of the tarsier LHß gene and the anthropoid CGß genes, corrected by the Hasegawa, Kishino, and Yano (1985) model, and divide by twice the estimated age of the tarsier-anthropoid common ancestor (50 Myr), we get a substitution rate of 3.37 × 10⁻⁹ substitutions per site per year, which is in close agreement with previous estimates for noncoding DNA in primates (Bailey et al. 1991). This supports our estimate of the age of the origin of the first CGß gene.
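The rate calculation in this paragraph divides the corrected divergence by twice the divergence time, because substitutions accumulate along both lineages since their common ancestor. A minimal sketch follows; the function name is hypothetical, and the divergence value of 0.337 substitutions per site is not reported in the text but is back-calculated here from the stated rate and the 50-Myr date.

```python
def substitution_rate(divergence_per_site, divergence_time_years):
    """Rate per site per year from pairwise divergence: d / (2 * T)."""
    if divergence_time_years <= 0:
        raise ValueError("divergence time must be positive")
    return divergence_per_site / (2.0 * divergence_time_years)


# Back-calculated (assumed) corrected intron divergence, not a reported figure.
implied_divergence = 3.37e-9 * 2 * 50e6
rate = substitution_rate(implied_divergence, 50e6)
```

The same function shows how sensitive the check is to the assumed date: holding the divergence fixed, an older ancestor would lower the inferred rate in proportion.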
Evolution of CGß-Specific Sequence Characteristics
The primary differences between LHß and CGß are in their gene expression patterns (LHß in pituitary, CGß in placenta), and in the lengths of their coding sequence. Human CGß genes have a single-base deletion relative to the human LHß gene at position +988 (counting from the first translated base of exon 1), which is eight codons before the LHß termination codon. This causes a frameshift which incorporates much of what is the 3'-untranslated region in LHß into the third exon of CGß, in turn adding 24 amino acids to the length of the CGß peptide. A two-base insertion in human CGß (relative to human LHß) at position 1060 again adjusts the reading frame in human CGß, which produces a termination signal eight codons later. Knowing that the CGß gene family first arose in the anthropoid common ancestor, the next step was to reconstruct when these CGß-specific characteristics evolved.
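How a single-base deletion can extend a coding sequence into former 3'-untranslated sequence is easy to demonstrate with a toy example. The sequences below are invented for illustration and are not LHß or CGß sequences; the helper names are likewise hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def orf_codons(seq):
    """Return the codons read from position 0 up to the first in-frame stop."""
    codons = []
    for i in range(0, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if codon in STOP_CODONS:
            break
        codons.append(codon)
    return codons


def delete_base(seq, index):
    """Return seq with the base at 0-based position `index` removed."""
    return seq[:index] + seq[index + 1:]


# Toy "ancestral" gene: three codons, a stop (TAA), then untranslated sequence.
ancestral = "ATGGCTCCGTAAGGCTCCTTGA"
# Deleting one base just upstream of the stop shifts the reading frame, so the
# old stop is read through and translation continues to a downstream TGA.
derived = delete_base(ancestral, 8)

ancestral_length = len(orf_codons(ancestral))
derived_length = len(orf_codons(derived))
```

In this toy case the reading frame grows from three codons to six, the same kind of change (on a smaller scale) as the carboxyl-terminal extension described for CGß.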
CGß and LHß genes were cloned and sequenced from 13 previously uncharacterized species, and also from human, in which only two of six CGß loci had been completely sequenced at the time this study began. The species sequenced and the number of sequences reported for each species are shown in table 2. Genomic DNA sequences produced here start at the 13th base of the first intron and include the rest of intron A (340 nt), exon 2 (168 nt), intron B (234 nt), all of the coding portion of exon 3 for LHß genes, and all of the coding region, except the last amino acid, of exon 3 for the CGß genes. The first exon includes only 15 translated nucleotides, and the amino acids encoded in this region are not part of the mature protein, so their absence does not hinder analyses of the evolution of the protein's function. A total of 36 new LHß and CGß sequences were produced in this study. The aligned sequences are available in GenBank (accession numbers AF397576–AF397611). Given the experimental approach used here, we cannot determine whether clones from a given species are alleles of the same physical locus or whether they are representatives of different loci. Furthermore, we cannot say definitively whether all (or only some) of the LHß and CGß copies present in a given species were amplified and represented among the clones sampled for each species.
Inspection of the aligned sequences reveals that the second frameshift, the two-base deletion at sites 1060–1061 (1211–1212 in the alignment), is found only in the human LHß gene; all other sequences have the same bases at these sites as in the human CGß genes. Thus, this event is most parsimoniously reconstructed as a deletion mutation which occurred uniquely in the 3'-untranslated region of the LHß gene recently, after the orangutan lineage diverged from that leading to humans, chimps, and gorillas. Therefore, this two-base frameshift is not an important event in the deeper phylogenetic evolution of the CGß gene family.
At least one cloned sequence from each anthropoid species studied was found to contain the single-base deletion at position 988 (site 1137 of the aligned sequences). In the absence of data on the actual expression patterns of each gene sequence produced from each species, the only way to define a sequence as being either an LHß or a CGß gene is based on the presence or absence of this single-base deletion. Using this criterion, all of the anthropoid species tested here possess a CGß gene. All of the species possessing only one gene lack the deletion at this site, identifying these unique genes as LHß genes. In all the catarrhines studied here, a single gene lacking this deletion (i.e., LHß) was also found. In both New World monkeys, even though two gene copies were found by intergene PCR, no LHß sequence was found; all sequences had the deletion and thus were CGß using the above criteria. LH has been found in every vertebrate species tested to date, and it plays a vital role in the regulation of reproduction, so it is unlikely that New World monkeys lack an LHß gene. Rather, it is more likely that a sequence mismatch between one of the PCR primers and the New World monkey LHß sequences caused a PCR amplification failure of the LHß gene in these species, thus explaining its absence among the sequences generated in this study. Overall, it is most parsimonious to reconstruct the occurrence of the CGß-specific deletion event after the initial LHß gene duplication, but before the divergence of the catarrhines from the platyrrhines, placing it very early in the evolutionary history of the CGß genes.
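The classification criterion used here (a gap at the diagnostic alignment site marks a CGß sequence, a base marks LHß) can be written as a one-line test. The sketch below assumes aligned sequences in which gaps are written as "-" and uses the gene symbols CGB and LHB as labels; the function name and the short toy alignment are hypothetical.

```python
def classify_by_deletion(aligned_seq, deletion_site=1137, gap="-"):
    """Label an aligned sequence by the diagnostic single-base deletion.

    deletion_site is 1-based; 1137 is the alignment position given in the
    text. A gap at that column is read as the CGB-type frameshift deletion.
    """
    if not 1 <= deletion_site <= len(aligned_seq):
        raise ValueError("deletion site lies outside the aligned sequence")
    return "CGB" if aligned_seq[deletion_site - 1] == gap else "LHB"


# Toy alignment of 8 columns with the diagnostic site placed at column 5.
toy_alignment = {
    "species1_cloneA": "ACGTCGTA",
    "species1_cloneB": "ACGT-GTA",
}
labels = {name: classify_by_deletion(seq, deletion_site=5)
          for name, seq in toy_alignment.items()}
```

A rule like this depends entirely on the alignment being correct at that column and, as the text stresses, says nothing about where a gene is actually expressed.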
Gene Conversion in CGß Genes
Given that the DNA sequences of the (linked) human LHß and CGß genes are more similar to each other than is human CGß to other species' CGß genes (Crawford, Tregear, and Niall 1986; Simula et al. 1995, and this study), it is clear that the CGß and LHß genes are not evolving independently. One of the prominent mechanisms of concerted evolution that could affect the evolution of these genes is gene conversion, in which a nonreciprocal transfer of DNA sequence information occurs, with a portion of one gene acting as the parent, copying its sequence into the daughter or recombinant gene. There are numerous ways to test for gene conversion. If we first reconstruct the phylogenetic tree for LHß sequences alone (fig. 3A), the tree topology matches the well-established phylogenetic relationships of primates. However, when the CGß sequences are added to the LHß set, the phylogenetic results clearly deviate from the pattern expected if the loci were to have evolved independently (fig. 3B). Instead of observing a tree with two halves (one half with LHß sequences, the other with CGß sequences) in which each half reflects the same topology, we observe a tree of intermixed LHß and CGß sequences. The hominoid LHß and CGß sequences form a separate clade, as do the Old World monkey LHß and CGß sequences. Within each clade, the LHß sequences generally group together, sister to the CGß sequences. This pattern suggests significant gene conversion between the LHß and CGß genes in both the common ancestors of the large-bodied hominoid and cercopithecoid clades. The one exception, in which rhesus LHß groups most closely with the rhesus CGß sequences, represents a more recent gene conversion within the genus Macaca.
Functional Amino Acid Changes in the Evolution of CGß
The only appreciable difference in the structure and molecular properties of these two hormones is in the number of sugar chains attached to each. Human LHß has two N-linked glycosylations at amino acid sites 13 and 30 of the mature peptide, whereas human CGß has two N-linked glycosylations at homologous sites and also four O-linked glycosylations at serine residues 121, 127, 132, and 138. The effect of these glycosylations is to slow the clearance of CG molecules from the maternal bloodstream; LH has a circulatory half-life of about 30 min (de Kretser, Atkins, and Paulsen 1973), whereas the half-life for CG is roughly 12 h (Braunstein, Vaitukaitis, and Ross 1972). This property is believed to make CG molecules more effective at establishing pregnancy than the equivalent dose of LH (Matzuk et al. 1990; Hearn and Gomme 2000). Thus, the evolution of these four serine residues in CGß could be regarded as an adaptation, and it is of interest to reconstruct when and how these sites evolved.
A comparison of the predicted amino acid sequences (fig. 4) reveals that two of the serine residues (serines 121 and 132) which are glycosylated in humans are present in all of the anthropoid CGß genes. The two New World monkeys have only these two serines and lack the other two (127, 138) which are glycosylated in humans. Therefore, it is likely that the serine residues at 121 and 132 were present in the ancestral CGß gene around the time when the frameshift mutation was introduced and this region of sequence was translated for the first time. It can be inferred from the amino acid sequences (fig. 4) that there was an asparagine to serine substitution at site 127 in the catarrhine common ancestor and an alanine to serine substitution at site 138 in the large-bodied hominoid ancestor, based on the shared presence or absence of serines in each species in the study. Assuming that the addition of each serine (and thus a sugar chain on that serine) has an incremental effect on the metabolic clearance rate of the molecule, these results suggest that there may have been selection acting to increase the effectiveness of CG at resisting metabolic clearance throughout anthropoid evolution, particularly in the catarrhines.
Looking first at just the LHß genes, there is no evidence of positive selection along any branches (table 4A). Under a null model assuming a uniform ω (the ratio of nonsynonymous to synonymous substitution rates) for the entire tree, the estimated ω for the data set is 0.22 (table 4A, line 1). Although a free-ratio model assuming an independent ω value for each branch (line 2) is a significantly better fit than the null model (2ΔL = 36.68, P < 0.025, df = 20), none of the branches have ω values significantly greater than 1.0. Thus the LHß genes appear to be evolving under a regime of generally negative or purifying selection, as is found in most functioning genes (Li 1997, pp. 179–182).
The likelihood analyses on these CGß sequences (table 4B) show that the free-ratio model fits the CGß sequences best, although it is not a significantly better fit than the single-ratio model (2ΔL = 24.46, 0.05 < P < 0.10, df = 16). The ω values estimated for each branch on the tree under the free-ratio model (not shown) are all less than or near to the neutral expectation of 1.0 except for the terminal Aotus branch, which has an ω of 3.3. To investigate the robustness of this finding, a third model was constructed in which two distinct ω values were hypothesized: one for the Aotus branch and one for the rest of the tree (table 4B, line 3). This model is significantly better than the single-ratio model (2ΔL = 7.08, P < 0.01, df = 1), supporting the conclusion that there has been positive selection in CGß along the Aotus branch.
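The significance levels quoted for these model comparisons follow from comparing twice the log-likelihood difference with a chi-square distribution whose degrees of freedom equal the difference in the number of free parameters. The sketch below is an independent check using only the Python standard library, not the software used in the study; the function name is hypothetical, and the upper-tail probability is computed with the standard finite-series expressions for integer degrees of freedom.

```python
import math


def chi2_sf(statistic, df):
    """Upper-tail probability P(X >= statistic) for a chi-square with integer df."""
    if df < 1:
        raise ValueError("df must be a positive integer")
    if statistic <= 0:
        return 1.0
    x = float(statistic)
    chi = math.sqrt(x)
    density = math.exp(-x / 2.0) / math.sqrt(2.0 * math.pi)  # normal density at chi
    if df % 2 == 1:
        # Odd df: two-sided normal tail plus a finite series in odd powers of chi.
        total, term = 0.0, chi
        for r in range(1, (df - 1) // 2 + 1):
            if r > 1:
                term *= x / (2 * r - 1)
            total += term
        return math.erfc(chi / math.sqrt(2.0)) + 2.0 * density * total
    # Even df: exp(-x/2) times a finite series in powers of x/2.
    total, term = 1.0, 1.0
    for r in range(1, (df - 2) // 2 + 1):
        term *= x / (2 * r)
        total += term
    return math.sqrt(2.0 * math.pi) * density * total


# Test statistics and degrees of freedom as quoted in the text.
p_lhb_free_vs_null = chi2_sf(36.68, 20)
p_cgb_free_vs_null = chi2_sf(24.46, 16)
p_cgb_aotus_vs_null = chi2_sf(7.08, 1)
```

The three values fall in the ranges reported above: below 0.025, between 0.05 and 0.10, and below 0.01, respectively.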
Selection can sometimes act on a specific domain or structural subportion of a protein in remodeling it for a new function (Li 1997, p. 186). In these cases, evidence of positive selection would not be detected if the entire protein sequence were analyzed, because the ω value is averaged across the entire sequence. In the case of the CGß genes, there is reason to believe that different portions of the protein may have been subject to different selection pressures, in particular some portion of the third exon of CGß, which is not involved in gene conversion events with LHß. A logical, biologically defined region to test is the carboxyl-tail created by the frameshift which distinguishes CGß from LHß.
To test this possibility, selection analyses were conducted on just the carboxyl-terminal 33 codons. The free-ratio model (table 4C, line 2) is not a significantly better fit than the uniform-ratio null model (2ΔL = 19.86, P = 0.10, df = 13), although this model is the best fit for the data overall (table 4C). Although not statistically rigorous, it is useful informally to look at the ω values calculated for each branch under the free-ratio model to survey the variability of the ω values and identify branches which appear to be subject to positive selection. Individual ω values and the reconstructed number of changes for each branch calculated under the free-ratio model are shown in figure 5.
Next, we can ask if each of these six branches individually contributes significantly to the robustness of the seven-ω model. To do this, we test a model which assumes only six ω values: ω0 and only five branches which are different from the background. This model is tested six times (table 4C), in turn assuming the ω for one of the branches A–F is the same as ω0. For branches A, B, and D–F, the seven-ω model is not significantly better than the six-ω model (comparing lines 4, 5, 7, 8, and 9 against line 3), yet the six-ω model is significantly better than the null model, so it appears that each of these branches alone does not contribute to the significance of the seven-ω model. Only branch C makes a significant contribution to the seven-ω model, for when branch C is set equal to ω0, the resulting six-ω model (table 4C, line 6) does not reject the null model (2ΔL = 5.80, P > 0.10, df = 5), and the seven-ω model is significantly better than this six-ω model (2ΔL = 8.96, P < 0.01, df = 1).
These numbers indicate that whereas the tail region of the CGß gene has been subject to varying degrees of negative (purifying) selection throughout most of its evolution in primates, there have been periods of positive selection acting on this portion of the CGß gene in the platyrrhines, especially along the lineage leading to Callicebus after it diverged from the ancestor of Callithrix and Aotus. The analyses also find weak positive selection acting along the terminal Aotus and Callicebus lineages, the ancestral New World monkey lineage, and in the hominoid lineages both before and after the Homo lineage diverged from the lineage leading to Pongo; however, the data are not robust. A total of 20.1 amino acid changes in this stretch of just 33 sites are inferred in the Callicebus CGß gene since the common anthropoid ancestor, which translates to a rate of about 0.015 amino acid replacements per site per lineage per Myr, assuming the anthropoid ancestor lived 40 MYA. This rate is approximately five times faster than that seen in the interferons, which are some of the fastest evolving proteins known (Li 1997, pp. 180–181). Therefore, the new amino acids in the 3'-tail region may play an important role in the function of CG in the New World monkey clade represented here by Callicebus. It is noteworthy that the analysis of the CGß-tail did not find strong positive selection in the Aotus lineage, yet the earlier analysis of the entire CGß coding sequence did find positive selection in Aotus. This would suggest that the Aotus and Callicebus CGß genes are evolving under different selection pressures.
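The replacement rate quoted above is the inferred number of changes divided by the number of sites and by the time elapsed along the single lineage. A minimal check, using only the figures given in the text (the function name is hypothetical):

```python
def replacement_rate(n_changes, n_sites, elapsed_myr):
    """Amino acid replacements per site per lineage per Myr."""
    if n_sites <= 0 or elapsed_myr <= 0:
        raise ValueError("number of sites and elapsed time must be positive")
    return n_changes / (n_sites * elapsed_myr)


# 20.1 inferred changes, 33 tail sites, anthropoid ancestor assumed at 40 MYA.
callicebus_tail_rate = replacement_rate(20.1, 33, 40)
```

The result rounds to 0.015 replacements per site per Myr. Unlike the pairwise intron calculation earlier, no factor of two appears here because the changes are counted along one lineage from the ancestor.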
Evolution of Placental Expression of the Alpha Subunit
Two of the significant molecular changes that must have occurred in the evolutionary history of primate CG are expression changes: CGß had to gain expression in the placenta (and, at least in humans, greatly reduce expression in the pituitary), whereas GPH had to gain placental expression, yet retain pituitary expression. The promoter of the CGß genes is not understood well enough to identify specific nucleotide sites critical to placental expression. On the other hand, the expression of the GPH gene is relatively well characterized and, in the human placenta, depends on the presence of two promoter elements. The first, often called the trophoblast-specific element (TSE), spans nucleotides -180 to about -146 from the transcription initiation site (Roberts and Anthony 1994). The second promoter element is a perfect copy of the eight-base long cyclic AMP response element (termed CRE) (Bokar et al. 1989; Nilson et al. 1991). Humans have two copies of the CRE (boxed in figure 6 and labeled below the alignment). It has been shown previously that the TSE is a required element for placental expression in humans, but it alone is not sufficient; at least one copy of the CRE is needed (Nilson et al. 1991). Therefore, this study focused on the CRE sequences. To investigate the evolution of these promoter elements, a 270-nt fragment of the GPH gene was PCR amplified and sequenced from Homo, Pongo, Presbytis, Colobus, Aotus, Callicebus, and Tarsius. These sequences are available in GenBank (accession numbers AF401991–AF401997), and a portion of them is shown in figure 6.
Orangutans are not the only species with this substitution at -139 in the CRE. The T base at this position appears to be the ancestral state, as it is found in horses, cows, mice, rats, and rabbits (Steger et al. 1991), and also in tarsiers (this study; fig. 6). This is not surprising, given that most of these species neither produce CG nor express GPH in the placenta (with the exception of horses, discussed subsequently). In contrast, Homo, Gorilla, Pongo, Macaca, Presbytis, and Colobus all have a C at position -139, matching the consensus CRE sequence; this is consistent with the fact that each of these species would need to express the GPH gene in the placenta in order to produce CG hormone. The most parsimonious reconstruction places the original (activating) T to C substitution at site -139 in the lineage leading to the common catarrhine ancestor. The mutation observed in the orangutan sequence reverting the consensus CRE back to the ancestral state (T) at this position presumably could not have occurred until after the second CRE was inserted, because one functional CRE is essential for placental CG production and the maintenance of pregnancy.
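The comparison of a promoter octamer against the consensus CRE can be sketched in a few lines. This is a minimal illustration, not the procedure used in the study: the consensus string `TGACGTCA` is the canonical palindromic CRE octamer and is an assumption here, since the text does not quote the sequence. The example variant `TGATGTCA` is likewise only illustrative of a single C to T difference, and the index reported is an offset within the octamer, not a promoter coordinate such as -139.

```python
# Canonical palindromic CRE octamer (assumption: not quoted in the text).
CRE_CONSENSUS = "TGACGTCA"


def cre_mismatches(octamer: str, consensus: str = CRE_CONSENSUS):
    """Return (offset, consensus_base, observed_base) for each mismatch."""
    octamer = octamer.upper()
    if len(octamer) != len(consensus):
        raise ValueError("sequence must be the same length as the consensus")
    return [
        (i, c, o)
        for i, (c, o) in enumerate(zip(consensus, octamer))
        if c != o
    ]


def is_consensus_cre(octamer: str) -> bool:
    """True when the octamer is a perfect copy of the consensus CRE."""
    return not cre_mismatches(octamer)


# Hypothetical inputs for illustration only.
print(is_consensus_cre("TGACGTCA"))   # perfect copy
print(cre_mismatches("TGATGTCA"))     # single C -> T difference
```

Applied to an alignment such as the one in figure 6, a check of this kind would flag which taxa carry a perfect CRE copy and which differ by a single base.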
Despite having at least one CGß gene, both Aotus and Callicebus have a GPH promoter with the ancestral T at position -139. They therefore do not have a consensus copy of the CRE known to be necessary for placental expression in humans. Nevertheless, placental expression of CG has clearly been shown in these and other New World monkey species (Hodgen et al. 1976; Hobson and Wide 1981; Crawford, Tregear, and Niall 1986; Einspanier et al. 1999). Therefore, the New World monkeys must have a mechanism of expression control for their GPH gene that is different from the CRE-based mechanism in catarrhines.
Horses have evolved placental LH expression, which is functionally convergent upon the anthropoid CG, but with a different molecular basis (Sherman et al. 1992). GPH is expressed in the horse placenta, but the promoter of the equine GPH gene does not have a CRE: it differs by the same T nucleotide at position -139 found in the other non-CG-producing species (Steger et al. 1991; fig. 6). Rather, DNase-1 protection assays have shown that the regulatory protein α-ACT binds to the horse GPH promoter (Steger et al. 1991). The Aotus sequence perfectly matches the horse sequence in the α-ACT binding site region (sites -161 to -142), and the Callicebus sequence differs by just one base. In fact, the human sequence also matches the horse in this region, yet binding of α-ACT alone to the human GPH promoter does not stimulate expression (Steger et al. 1991), so from the sequence alone we cannot predict whether α-ACT promotes GPH expression in the New World monkeys. Whatever the mechanism of expression control for the GPH gene in New World monkeys, it may have evolved either within the platyrrhine lineage in parallel to the CRE-based control system of the catarrhines, or before the anthropoid common ancestor, later being replaced in the catarrhine lineage by the CRE system.
CG and Models of the Evolution of New Genes
The evolution of CG is a case study of how new genes with new functions arise from existing genes. The classical model (Ohno 1970, p. 71) posits that genes duplicate before new functions evolve; one gene copy retains the required ancestral function, whereas the fate of the second copy is likely to be nonfunctionalization, in which a mutation abolishes its ability to be expressed or to carry out the function of its progenitor. Only in rare cases does a duplicate gene evolve a new, selectively advantageous function under this model. A recent genome-wide survey shows that gene duplications occur at a high rate, and fewer pseudogenes exist than predicted by the classical model (Lynch and Conery 2000). This empirical observation prompted the proposal of an alternative model in which genes evolve multiple functions before duplication. Duplicate genes then evolve by subfunctionalization: the fixation of complementary degenerative mutations in the daughter copies, which requires the presence of both copies to maintain the ancestral functions (Force et al. 1999; Lynch and Force 2000). This model predicts a much higher probability for the preservation of actively expressed duplicate genes than does the classical model.
The subfunctionalization model prompts a reexamination of CG evolution in ways that the classical model does not. It is possible that the ancestral primate LHß gene was expressed in both the pituitary and the placenta, as is the case with horse LHß. Additionally, primate LHß and CGß genes use different upstream regions for their proximal promoter elements, so that the promoters of these two genes do not overlap. It is not known whether the LHß gene is expressed in the placentae of any extant species, but it has been shown that there is transcription of all of the human CGß genes in the pituitary, although at very low levels (Dirnhofer et al. 1996). Finally, it is unclear from the reconstruction of events in this study when CGß first gained placental expression, but it is possible that placental expression could have been gained before gene duplication. If the ancestral LH gene had acquired placental expression before the reconstructed gene duplication event (which could be tested by examining tarsier first-trimester placentae, a rare commodity), this would prove that CG evolved from LH by subfunctionalization.
The Origin of CG as a Functional Signal of Pregnancy
If placental expression of LH evolved before the origin of the CGß gene family, then the origin of the functional activity of establishing pregnancy via a placentally expressed gonadotropic hormone precedes, and is independent of, the gene duplication event leading to the origin of the CGß subunit gene. It may then be a different question to ask when this placental gene expression first evolved. Could placentally expressed LH have been functioning to establish pregnancy as long ago as the origin of placental mammals? The evolution of placental morphology suggests not. LH (or CG) has to move from the placenta into the maternal bloodstream and then be transported to the ovary in order to act on its target, the corpus luteum. Anthropoid primates all have a hemochorial placenta, in which placental tissue is directly bathed in maternal blood, making it easy for placentally derived molecules to enter the maternal bloodstream (King 1993). Strepsirrhine primates and most other mammals have an epitheliochorial placenta, in which both the uterine epithelium and the maternal vascular endothelium remain present during pregnancy. These two additional tissue layers impede the flow of large macromolecules (such as the glycoprotein hormones LH and CG) from the placenta to the maternal bloodstream (Faber, Thornburg, and Binder 1992). Given the widespread distribution of mammals possessing an epitheliochorial placenta, this is thought to be the ancestral state of mammalian placentation (Luckett 1976).
We propose that LH could not have been efficiently delivered to the corpus luteum before hemochorial placentation evolved. In primates, hemochorial placentation first appears in tarsiers, making the haplorhine common ancestor the first primate that would have been capable of making efficient use of LH as a pregnancy-establishing signal. It should be noted that horses, which have evolved CG independently from primates, have also evolved specialized placental structures (endometrial cups) which help in the delivery of equine CG to the mare's bloodstream (Mossman 1987, p. 271). A recent report proposes that guinea pigs (Cavia porcellus) have also independently evolved CG (Sherman et al. 2001); however, the evidence is not convincing (primarily because the authors present Northern blots showing no CG expression in the guinea pig placenta). Nonetheless, guinea pigs have evolved a labyrinthine type of hemochorial placentation (Mossman 1987, p. 230), so if CG were to have evolved in this species, it could be efficiently delivered to the maternal bloodstream. This final discussion shows the power of combining the molecular evolutionary history of a gene with the anatomical evolution of a tissue expressing that gene, in order to build a more complete picture of the evolution of a higher primate adaptation.
Acknowledgements |
---|
Footnotes |
---|
Abbreviations: CG, chorionic gonadotropin; LH, luteinizing hormone; CGß, CG beta subunit; LHß, LH beta subunit; GPH, glycoprotein hormone alpha subunit.
Keywords: chorionic gonadotropin, molecular evolution, positive selection, reproductive hormones, primates, gene expression
Address for correspondence and reprints: Glenn A. Maston, Peabody Museum 56A, 11 Divinity Avenue, Cambridge, Massachusetts 02138. maston{at}fas.harvard.edu
References |
---|
Bailey W. J., D. H. A. Fitch, D. A. Tagle, J. Czelusniak, J. L. Slightom, M. Goodman, 1991 Molecular evolution of the ψη-globin gene locus: gibbon phylogeny and the hominoid slowdown Mol. Biol. Evol 8:155-184
Bokar J. A., R. A. Keri, T. A. Farmerie, R. A. Fenstermaker, B. Andersen, D. L. Hamernik, J. Yun, T. Wagner, J. H. Nilson, 1989 Expression of glycoprotein hormone alpha-subunit gene in the placenta requires a functional cyclic AMP response element, whereas a different cis-acting element mediates pituitary-specific expression Mol. Cell. Biol 9:5113-5122
Braunstein G. D., J. L. Vaitukaitis, G. T. Ross, 1972 The in vivo behavior of human chorionic gonadotropin after dissociation into subunits Endocrinology 91:1030-1036
Brown P., J. R. McNeilly, R. M. Wallace, A. S. McNeilly, A. J. Clark, 1993 Characterization of the ovine LH beta-subunit gene: the promoter directs gonadotrope-specific expression in transgenic mice Mol. Cell. Endocrinol 93:157-165
Carr F. E., W. W. Chin, 1985 Absence of detectable chorionic gonadotropin subunit messenger ribonucleic acids in the rat placenta throughout gestation Endocrinology 116:1151-1157
Courseaux A., J.-L. Nahon, 2001 Birth of two chimeric genes in the Hominidae lineage Science 291:1293-1297
Crawford R. J., G. W. Tregear, H. D. Niall, 1986 The nucleotide sequences of baboon chorionic gonadotropin ß-subunit genes have diverged from the human Gene 46:161-169
de Kretser D. M., R. C. Atkins, C. A. Paulsen, 1973 Role of the kidney in the metabolism of luteinizing hormone J. Endocrinol 58:425-434
Dirnhofer S., M. Hermann, A. Hittmair, 1996 Expression of the human chorionic gonadotropin-ß gene cluster in human pituitaries and alternate use of exon 1 J. Clin. Endocrinol. Metab 81:4212-4217
Drouin G., F. Prat, M. Ell, G. D. P. Clarke, 1999 Detecting and characterizing gene conversions between multigene family members Mol. Biol. Evol 16:1369-1390
Einspanier A., R. Nubbemeyer, S. Schlote, M. Schumacher, R. Ivell, K. Fuhrmann, A. Marten, 1999 Relaxin in the marmoset monkey: secretion pattern in the ovarian cycle and early pregnancy Biol. Reprod 61:512-520
Ezashi T., T. Hirai, T. Kato, K. Wakabayashi, Y. Kato, 1990 The gene for the beta subunit of porcine LH: clusters of GC boxes and CACCC elements J. Mol. Endocrinol 5:137-146
Faber J. J., K. L. Thornburg, N. D. Binder, 1992 Physiology of placental transfer in mammals Am. Zool 32:343-354
Fiddes J. C., H. M. Goodman, 1980 The cDNA for the ß-subunit of human chorionic gonadotropin suggests evolution of a gene by readthrough into the 3'-untranslated region Nature 286:684-687
Force A., M. Lynch, F. B. Pickett, A. Amores, Y. Yan, J. Postlethwait, 1999 Preservation of duplicate genes by complementary, degenerative mutations Genetics 151:1531-1545
Gillespie J., 1991 The causes of molecular evolution Oxford University Press, New York
Goldman N., Z. Yang, 1994 A codon-based model of nucleotide substitution for protein-coding DNA sequences Mol. Biol. Evol 11:725-736
Graham M. Y., T. Otani, I. Boime, M. V. Olson, G. F. Carle, D. D. Chaplin, 1987 Cosmid mapping of the human chorionic gonadotropin ß genes by field-inversion gel electrophoresis Nucleic Acids Res 15:4437-4448
Hasegawa M., H. Kishino, T. A. Yano, 1985 Dating of the human-ape splitting by a molecular clock of mitochondrial DNA J. Mol. Evol 22:160-174
Hearn M. T. W., P. T. Gomme, 2000 Molecular architecture and biorecognition processes of the cystine knot protein superfamily: part I. The glycoprotein hormones J. Mol. Recognit 13:223-278
Hobson B. M., L. Wide, 1981 The similarity of chorionic gonadotrophin and its subunits in term placentae from man, apes, Old and New World monkeys and a prosimian Folia Primatol 35:51-64
Hodgen G. D., L. G. Wolfe, J. D. Ogden, M. R. Adams, C. C. Descalzi, D. F. Hildebrand, 1976 Diagnosis of pregnancy in marmosets: hemagglutination inhibition test and radioimmunoassay for urinary chorionic gonadotropin Lab. Anim. Sci 26:224-229
Jameson L., W. W. Chin, A. N. Hollenberg, A. S. Chang, J. F. Habener, 1984 The gene encoding the ß-subunit of rat luteinizing hormone: analysis of gene structure and evolution of nucleotide sequence J. Biol. Chem 259:15474-15480
King B. F., 1993 Development and structure of the placenta and fetal membranes of nonhuman primates J. Exp. Zool 266:528-540
Kumar T. R., M. M. Matzuk, 1995 Cloning of the mouse gonadotropin ß-subunit encoding genes, II. Structure of the luteinizing hormone ß-subunit-encoding genes Gene 166:335-336
Lapthorn A., D. Harris, A. Littlejohn, 1994 Crystal structure of human chorionic gonadotropin Nature 369:455-461
Li W.-H., 1997 Molecular evolution Sinauer Associates, Sunderland, Mass
Li M. D., J. J. Ford, 1998 A comprehensive evolutionary analysis based on nucleotide and amino acid sequences of the α- and ß-subunits of glycoprotein hormone gene family J. Endocrinol 156:529-542
Luckett W. P., 1976 Cladistic relationships among primate higher categories: evidence of the fetal membranes and placenta Folia Primatol 25:245-276
Lund L. A., G. B. Sherman, 1998 Duplication of the southern white rhinoceros (Ceratherium simum simum) luteinizing hormone ß subunit gene J. Mol. Endocrinol 21:19-30
Lynch M., J. S. Conery, 2000 The evolutionary fate and consequences of duplicate genes Science 290:1151-1155
Lynch M., A. Force, 2000 The probability of duplicate gene preservation by subfunctionalization Genetics 154:459-473
Malik H. S., S. Henikoff, 2001 Adaptive evolution of Cid, a centromere-specific histone in Drosophila Genetics 157:1293-1298
Matzuk M. M., A. J. W. Hsueh, P. Lapolt, A. Tsafriri, J. L. Keene, I. Boime, 1990 The biological role of the carboxyl-terminal extension of human chorionic gonadotropin ß-subunit Endocrinology 126:376-383
Messier W., C.-B. Stewart, 1997 Episodic adaptive evolution of primate lysozymes Nature 385:151-154
Morse J. H., G. Stearns, J. Arden, G. M. Agosta, R. E. Canfield, 1976 The effects of crude and purified human gonadotropin on in vitro stimulated human lymphocyte cultures Cell. Immunol 25:178-188
Mossman H. W., 1987 Vertebrate fetal membranes Rutgers University Press, New Brunswick, NJ
Nilson J. H., J. A. Bokar, C. M. Clay, T. A. Farmerie, R. A. Fenstermaker, D. L. Hamernik, R. A. Keri, 1991 Different combinations of regulatory elements may explain why placenta-specific expression of the glycoprotein hormone α-subunit gene occurs only in primates and horses Biol. Reprod 44:231-237
Nurminsky D. I., M. V. Nurminskaya, D. De Aguiar, D. L. Hartl, 1998 Selective sweep of a newly evolved sperm-specific gene in Drosophila Nature 396:572-575
Ohno S., 1970 Evolution by gene duplication Springer-Verlag, New York
Policastro P. F., S. Daniels-McQueen, G. Carle, I. Boime, 1986 A map of the hCGß-LHß gene cluster J. Biol. Chem 261:5907-5916
Rasband W. S., D. S. Bright, 1995 NIH image: a public domain image processing program for the Macintosh Microbeam Anal. Soc. J 4:137-149
Roberts R. M., R. V. Anthony, 1994 Molecular biology of trophectoderm and placental hormones Pp. 395-440 in J. K. Findlay, ed. Molecular biology of the female reproductive system. Academic Press, San Diego
Sawyer S., 1989 Statistical tests for detecting gene conversion Mol. Biol. Evol 6:526-538
Sawyer S. A., 2000 GENECONV: statistical tests for detecting gene conversion, version 1.81 Department of Mathematics, Washington University, St. Louis, Mo
Sherman G. B., D. F. Heilman, A. J. Hoss, D. Bunick, L. A. Lund, 2001 Messenger RNAs encoding the ß subunits of guinea pig (Cavia porcellus) luteinizing hormone (gpLH) and putative chorionic gonadotropin (gpCG) are transcribed from a single-copy gpLH/CGß gene J. Mol. Endocrinol 26:267-280
Sherman G. B., M. Wolfe, T. Farmerie, C. Clay, D. Threagill, D. Sharp, J. Nilson, 1992 A single gene encodes the ß-subunits of equine luteinizing hormone and chorionic gonadotropin Mol. Endocrinol 6:951-959
Simula A. P., F. Amato, R. Faast, A. Lopata, J. Berka, R. J. Norman, 1995 Luteinizing hormone/chorionic gonadotropin bioactivity in the common marmoset (Callithrix jacchus) is due to a chorionic gonadotropin molecule with a structure intermediate between human chorionic gonadotropin and human luteinizing hormone Biol. Reprod 53:380-389
Steger D. J., J. Altschmied, M. Buscher, P. L. Mellon, 1991 Evolution of placenta-specific gene expression: comparison of the equine and human gonadotropin α-subunit genes Mol. Endocrinol 5:243-255
Swofford D. L., 1998 PAUP*: phylogenetic analysis using parsimony (*and other methods) Version 4.0. Sinauer, Sunderland, Mass
Talmadge K., N. C. Vamvakopoulos, J. C. Fiddes, 1984 Evolution of the genes for the ß subunits of human chorionic gonadotropin and luteinizing hormone Nature 307:37-40
Tepper M. A., J. L. Roberts, 1984 Evidence for only one ß-luteinizing hormone and no ß-chorionic gonadotropin gene in the rat Endocrinology 115:385-391
Thompson J. D., T. J. Gibson, F. Plewniak, F. Jeanmougin, D. G. Higgins, 1997 The Clustal X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools Nucleic Acids Res 25:4876-4882
Tullner W. W., 1974 Comparative aspects of primate chorionic gonadotropins Contrib. Primatol 3:235-257
Virgin J. B., B. J. Silver, A. R. Thomason, J. H. Nilson, 1985 The gene for the ß subunit of bovine luteinizing hormone encodes a gonadotropin mRNA with an unusually short 5'-untranslated region J. Biol. Chem 260:7072-7077
Wang W., J. Zhang, C. Alvarez, A. Llopart, M. Long, 2000 The origin of the Jingwei gene and the complex modular structure of its parental gene, Yellow Emperor, in Drosophila melanogaster Mol. Biol. Evol 17:1294-1301
Yang Z., 1998 Likelihood ratio test for detecting positive selection and application to primate lysozyme evolution Mol. Biol. Evol 15:568-573
Yang Z., 2000 Phylogenetic analysis by maximum likelihood (PAML) Version 3.0. University College London, London, U.K