(Received for publication, January 31, 1997, and in revised form, March 20, 1997)
From the University of Michigan, Department of
Anatomy and Cell Biology, Ann Arbor, Michigan 48109-0616, § Pharmacia and Upjohn Company, Molecular Biology Unit 7242, Kalamazoo, Michigan 49007 and the ¶ Department of Anatomy, Wayne
State School of Medicine, Detroit, Michigan 48201
One basis for the evolution of organisms is the
acquisition of new temporal and spatial domains of gene expression.
Such novel expression domains could be generated either by
cis sequence changes that alter the complement of
trans-acting regulators binding to control elements or by
changes in the expression patterns of one or more of the regulatory
(trans) factors themselves. The γ globin gene is a prime
example of a gene that has undergone a distinct change in temporal
expression at a defined time in evolution. Approximately 35-55 million
years ago, the previously embryonic γ gene acquired a fetal
expression pattern. This change occurred in a simian primate ancestor
after the separation of simian and prosimian primates but before the
further separation of the major simian lineages; thus, the (prosimian)
galago γ gene retains the ancestral embryonic expression pattern,
whereas the (simian) human γ gene is fetal. This analysis of galago γ
and human γ genes in transgenic mice demonstrates that
cis changes in sequences within a 4.0-kilobase region
surrounding the γ gene were responsible for the evolution of a novel
fetal expression pattern in the γ globin genes of simian
primates.
Reconstruction of the evolutionary history of the mammalian β globin gene cluster indicates that in the common ancestor of marsupial
and placental mammals (135 million years ago, MYA), two
β-like globin genes existed, each with a different temporal expression pattern (1). This two-gene cluster, 5′-ε-β-3′, persists in present day marsupials; the ε
gene is expressed in embryonic life,
whereas the β
gene is active postembryonically (1, 2). In early
placental mammals, however, prior to the mammalian radiation (80-100
MYA), a β
duplication produced two postembryonic genes (δ and β),
and ε
duplications produced three embryonic genes (ε, γ, and η)
(3, 4). Although gene duplications, gene inactivations, gene deletions,
and even whole locus duplications or triplications have further
modified the β globin loci of all present day eutherian mammals, a
clear relationship to this ancestral five-member cluster can still
be appreciated (3). Moreover, these five genes have, for the most part,
retained their ancestral programs of stage specificity; the ε gene is
embryonic, and the β gene is postembryonic in all extant eutherian
mammals. However, some lineage-specific alterations in the temporal
expression of globin genes have occurred. An important example is the
γ globin gene; originally embryonic in its expression pattern, the γ
gene was recently recruited to be a fetal gene in anthropoid
(simian) primates (5).
All simian γ genes studied to date are expressed fetally, whereas
prosimian γ genes retain the embryonic pattern characteristic of
other (nonprimate) eutherian mammals (4-7). Therefore, acquisition of
fetal specificity can be traced to a relatively narrow evolutionary window (35-55 MYA), after the separation of simian (catarrhine and
platyrrhine) from prosimian (galagos and lemurs) primates but prior to
the divergence of platyrrhine (New World Monkeys) from catarrhine (Old
World Monkeys, apes, and human) primates (4, 5). Two other events can
be traced to this same evolutionary window: duplication of the γ gene
and a burst of base substitutions that occurred both in the promoter
region of the γ globin gene and in coding regions (5). Amino acid
substitutions resulting from the coding region changes led to loss of
2,3-diphosphoglycerate binding ability, resulting in a fetal hemoglobin
molecule that could bind oxygen with increased affinity and thus
facilitate the transfer of oxygen from mother to fetus. Both the
promoter and coding region base substitutions were subsequently fixed
during further evolution of platyrrhine and catarrhine primates. This pattern of accelerated base substitution followed by decelerated rates
of substitutions in the same regions has been considered indicative of
the spreading and subsequent preservation of adaptive substitutions
(8-10).
Although these three molecular events (fetal recruitment, gene
duplication, and the burst of promoter substitutions) cannot be
temporally ordered based on current phylogenetic evidence, one possible
scenario is that the duplication of the gene provided a redundant
substrate for the accumulation of base changes that altered γ stage
specificity (6, 11). Once such base changes had accumulated, perhaps in
the duplicate γ gene at greater distance from the locus control region
(LCR) (Aγ or γ2), they may have been selected for and subsequently
transferred to the other gene (Gγ or γ1) by gene conversion. Indeed,
evidence exists for gene conversions of this polarity (11, 12). Implicit in this
hypothetical scenario is the assumption that cis mechanisms
were responsible for the fetal recruitment of the γ gene, an
assumption that has not been definitively tested. To examine this
question directly, transgenic mice were generated in which expression
of the galago γ (embryonic) and the human γ (fetal) genes could be
compared. Analysis of the stage-specific expression of these two γ
genes in the mice reveals distinctly different patterns; galago γ
gene expression is embryonic and is silenced in the fetal liver,
whereas human γ gene activity peaks in fetal life. cis
differences in the γ gene fragments must therefore direct these
different expression patterns.
The galago γ fragment used spans sequences 10508-14995 of GenBank™ entry M73981
of the Galago crassicaudatus β globin cluster. The human γ
fragment corresponds to sequences 38084-42140 of the human β globin
cluster (GenBank™ HUMHBB) and includes the Aγ 3′ enhancer region
(13). The galago γ gene contains several small insertions not present
in the human gene, accounting for the slight size difference in these
two fragments (4057 and 4487 bp for the human and galago γ gene,
respectively). Human ε sequences and HS3 sequences used for both
constructs correspond to HUMHBB coordinates 3267-5172 (HS3) and
17841-21241 (ε).
Insert DNA was purified away from vector sequences prior to injection of the constructs for production of transgenic mice. Purified DNA fragments were microinjected into F2 hybrid zygotes from C57BL/6J X SJL/J parents at a concentration of 2-3 ng/µl. Injections were done by the Transgenic Animal Model Core in the University of Michigan Biomedical Core Research Facility. All procedures using mice were approved by the University of Michigan Committee on Use and Care of Animals, and all work was conducted in accord with the principles and procedures outlined in the National Institutes of Health Guidelines for the Use and Care of Experimental Animals. Four founder animals were identified for each construct and were mated to CD-1 females to acquire F1 males that could be used in breeding for all experimental time points. Timed matings were done to obtain F2 (in some cases, F3) conceptuses for S1 analysis.
DNA Analysis: DNA for polymerase chain reaction and Southern
analysis was purified from tails of founders or F1 and from
the heads of F2 or F3 fetal and embryonic
conceptuses. Polymerase chain reaction primers consisted of
5′-AGCTGCTGCAGTCAAAGTCGAATGCAGCTG and
5′-TCCATCCATTTCTACCATTTCTTTCTCCTA and detected the boundary between the ε upstream
region and HS3. For
determination of copy number and transgene integrity, Southern blots
were probed with a 0.4-kb HindIII/BamHI fragment
corresponding to the 5′ end of the 1.9-kb HindIII HS3
fragment used in both constructs.
RNA was extracted from 10.5-day yolk sacs or
from fetal liver of 12.5-, 14.5-, and 16.5-day conceptuses (the morning
on which the plug was detected was considered day 0.5). Tissues were
dissected and immediately frozen in liquid nitrogen prior to
processing. Isolation of RNA was accomplished using Trizol (Life
Technologies, Inc.) according to the manufacturer's directions. RNA
was quantitated spectrophotometrically and was analyzed on agarose gels
to assess integrity. To quantitate mRNA levels, S1 nuclease
protection was used according to published protocols (14). S1 nuclease
probes for the detection of human and mouse mRNAs were kindly
provided by Dr. Timothy Ley and have been described earlier (14, 15). The galago S1 probe corresponded to a 435-bp
XbaI/BamHI genomic fragment labeled at the
BamHI site in exon 2. The protected fragment was 204 bp.
Quantitation of the signals from S1 analysis was accomplished using a
PhosphorImager with ImageQuant software.
Two related constructs (of structure HS3-ε-γ) were introduced
into transgenic mice (Fig. 1A). The degree of
homology of the two γ fragments used is illustrated in Fig.
1B. The native galago β-like globin cluster contains a
single γ gene, whereas the human cluster contains two γ genes (5).
The human Aγ (γ2) gene, including its 3′ enhancer (13), was used in
these constructs. The galago γ gene contains sequences similar to the
human Aγ enhancer as indicated in the matrix plot of homology shown
in Fig. 1B; however, the regulatory function, if any, of
this region of the galago γ gene has never been tested.
LCR sequences are necessary for the high level expression of human
transgenes in the murine background (16), but single DNaseI
hypersensitive sites (HS) within the LCR can also confer this property
(17-20). Because it had been demonstrated that HS3 could impart high
level, copy number dependent expression to a human transgene, a 1.9-kb
HindIII fragment spanning this region was included in both
constructs (17, 19). In addition, earlier data indicated that of all of
the hypersensitive sites, HS3 may be uniquely able to drive expression in the fetal liver (20). To provide a standard against which
to compare expression of the human and galago γ genes, the human ε
gene (−2000 to +1780) was also included in both constructs. Earlier
studies had shown that this ε fragment is expressed in the embryonic
yolk sac and silenced autonomously in the fetal liver (21, 22).
Transgene copy number (Table I) and integrity (not
shown) were assessed in Southern blots of tail DNA. For each construct, transgenic males from four independent lines were bred to obtain embryonic and fetal tissues. Table I summarizes the copy number corrected expression levels for ε and γ
transgenes in all eight lines examined relative to total mouse
chains. S1 nuclease analysis of ε and γ
expression in a representative transgenic line carrying each type of
γ gene is shown in Fig. 2.
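As an illustration of what a copy number corrected expression level involves, the following minimal Python sketch divides a transgene signal by its copy number and expresses the result as a percentage of the mouse signal. The exact normalization used for Table I is not spelled out in the text, so the formula, the function name `copy_corrected_expression`, and the input values are assumptions made only for this example.

```python
def copy_corrected_expression(transgene_signal, copy_number, mouse_signal):
    """Return per-copy transgene expression as a percentage of the mouse signal.

    Assumption: a simple per-copy division. The normalization actually
    used for Table I may differ (for example, it may also account for
    the number of endogenous mouse gene copies).
    """
    if copy_number <= 0:
        raise ValueError("copy_number must be positive")
    if mouse_signal <= 0:
        raise ValueError("mouse_signal must be positive")
    return 100.0 * transgene_signal / copy_number / mouse_signal


# Hypothetical PhosphorImager counts, for illustration only.
example = copy_corrected_expression(
    transgene_signal=200.0, copy_number=4, mouse_signal=1000.0
)
```

Under this assumed formula, two lines with the same raw signal but different copy numbers yield different per-copy values, which is why copy number must be measured before lines are compared.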
All eight transgenic lines expressed both the ε and γ transgenes.
However, line to line variation in transgene expression level was
observed, most likely due to position effects. Thus, expression was not
copy number-dependent despite the fact that both constructs
contained the region of HS3 recently shown to possess dominant
chromatin opening function (23). Significant position effects with
HS3Aγ transgenes (but missing the Aγ enhancer) have also been
observed by others (24). Interestingly, all four HS3-ε-galγ
lines and three of four HS3-ε-humγ
lines exhibit an inverse relationship
between copy number and expression (Table I). This pattern has been
observed previously with HS2-containing constructs (25), but the
significance of this phenomenon is presently unclear.
Although expression levels varied, patterns of transgene expression
during development were highly reproducible for each gene as
illustrated in Fig. 3 where expression at each time
point is plotted relative to the 10.5-day expression level (which is
taken as 100%). In both HS3-ε-humγ and HS3-ε-galγ
lines, the human ε gene was expressed at
high levels in the embryonic yolk sac (day 10.5 and 12.5) and was
significantly repressed in 14.5 and 16.5 day fetal livers (Figs. 2 and
3). The yolk sac portion (10.5 and 12.5 days) of the ε expression
curves in HS3-ε-humγ lines was somewhat more variable
than that seen in HS3-ε-galγ lines. However, the well
known variability in the timing of development of conceptuses even
within the same litter makes it difficult to determine if these
differences are significant. Nevertheless, the fetal portion (14.5 and
16.5 days) of the ε expression curves was identical in mice carrying
either construct; the human ε gene was silenced in fetal life.
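The normalization described for Fig. 3, in which expression at each time point is plotted relative to the 10.5-day level (taken as 100%), can be sketched in Python as follows. The function name `relative_to_reference` and the expression values are hypothetical and are used only to show the arithmetic.

```python
def relative_to_reference(levels_by_day, reference_day=10.5):
    """Express each time point as a percentage of the reference-day level."""
    if reference_day not in levels_by_day:
        raise KeyError(f"no measurement for day {reference_day}")
    reference = levels_by_day[reference_day]
    if reference <= 0:
        raise ValueError("reference level must be positive")
    return {
        day: 100.0 * level / reference
        for day, level in levels_by_day.items()
    }


# Hypothetical copy number corrected levels for one line (not data from Table I).
hypothetical_levels = {10.5: 8.0, 12.5: 6.0, 14.5: 2.0, 16.5: 1.0}
relative = relative_to_reference(hypothetical_levels)
```

Because every curve is scaled to its own 10.5-day value, lines that differ in absolute expression level can be compared by curve shape alone.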
In contrast, the two γ genes exhibited distinctly different
expression patterns in the fetal liver (Figs. 2 and 3). The galago γ
gene was expressed at highest levels in embryonic life and silenced along with the human ε
gene by 14.5 days, mimicking the embryonic pattern characteristic of the galago (5). Interestingly, the developmental expression curves for human ε
and galago γ
in each line were nearly superimposed, suggesting that the two genes were coordinately silenced. In contrast, the human γ
gene was not
coordinately silenced with ε; rather, expression peaked in 14.5 day
fetal livers and declined at 16.5 days. Although expression curves were
somewhat variable in shape, considerable γ expression was still
observed at 16.5 days, a pattern distinctly different from that seen
for the galago γ gene.
Examination of 10.5-day expression levels for all transgenes (Table I)
reveals that in HS3-ε-humγ lines, γ gene expression
was greater than ε gene expression (average 1.4-fold). This pattern
(ε < γ) has been seen by others when larger constructs containing
the human ε and γ genes were studied in the mouse (26, 27). In
contrast, in HS3-ε-galγ lines, human ε expression was
greater than galago γ expression (3.3-, 6.4-, 35.7-, and 2.8-fold for
the four lines).
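To give a sense of the spread among the four fold differences quoted above for the galago construct lines, the short Python sketch below summarizes them. The four values are those reported in the text; the choice of summary statistics and the function name `summarize_folds` are assumptions for illustration and do not represent an analysis performed in this study.

```python
from statistics import mean, median

# Fold differences for the four galago construct lines, as quoted in the text.
reported_folds = [3.3, 6.4, 35.7, 2.8]


def summarize_folds(folds):
    """Return simple descriptive statistics for a list of fold differences."""
    if not folds:
        raise ValueError("at least one fold value is required")
    return {
        "min": min(folds),
        "max": max(folds),
        "mean": mean(folds),
        "median": median(folds),
    }


summary = summarize_folds(reported_folds)
```

The arithmetic mean (about 12) is dominated by the single 35.7-fold line, whereas the median (about 4.9) is closer to the other three lines, so a single average would describe these lines poorly.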
These data indicate that the characteristic embryonic expression
pattern of the galago γ gene can be recapitulated in the transgenic
mouse. It has also been demonstrated that globin genes from the chicken
(28) and frog (29) are expressed in the mouse background in temporal
patterns similar to those expected on the basis of in vivo
patterns. Together, these studies attest to the broad evolutionary
conservation of cis and trans regulators of globin gene expression.
The work presented here demonstrates that human and galago γ transgenes exhibit different developmental expression patterns when
linked to the same portion of the LCR and when placed in the same
microenvironment (mouse fetal liver). The divergent expression patterns
of the two genes must therefore be due to differences in DNA sequence
(cis elements) within the 4.0-kb fragment that contains the γ
gene. Thus, fetal recruitment of the simian γ gene was (at least
in part, see below) a cis-mediated event. Moreover, this
result confirms that cis signals for stage-specific globin gene expression must reside near the genes, not within the LCR. Earlier
studies of human transgene expression in the absence of LCR sequences
also support this conclusion (30).
The data also eliminate distance from the LCR as a determinant
per se of the differences in stage-specific gene expression of these two genes (31, 32). The physical distance between the γ gene(s) and the LCR in the intact
β-like globin loci of human and
galago differs significantly; the single galago γ gene is 13.5 kb from
the 3′ end of the LCR (the HS1 core), whereas the human Gγ and Aγ
genes are 21 and 26 kb away, respectively. In the constructs studied
here, both γ genes were equidistant from HS3; however, their
characteristic expression patterns were preserved. It is nevertheless
possible that the increased distance of the duplicated human γ gene
from the LCR may have played a permissive role in the initial evolution
of a new fetal expression pattern (6, 11).
The conclusion that a fetal liver trans environment that is
permissive for γ expression had already evolved prior to the
mammalian radiation is supported by data presented here and elsewhere
(24, 26, 27, 33). This does not imply that the mouse fetal liver environment is identical to that of the human; differences may exist in
the relative balance of trans factors that would result in
some distinct patterns of regulation in each species. Indeed, when the
human γ gene is placed in the context of the entire β-like globin
locus, it seems to be silenced at an earlier developmental time in the
mouse fetal liver than in the human fetal liver (26, 27). Regardless of
these differences, the data presented here indicate clearly that
cis differences exist between galago and human γ genes
that result in the generation of distinct patterns of expression in the
fetal mouse liver; the galago γ gene is silenced, whereas human γ
gene expression peaks at this stage.
Interestingly, in several independent lines carrying the
HS3-ε-galγ construct, the kinetics of galago γ and
human ε gene silencing after embryonic life were nearly identical.
Such coordinate regulation could be a consequence of lineage
restriction. That is, both genes may be expressed at high levels only
in yolk sac derived "primitive" erythrocytes and not in fetal
liver-derived "definitive" erythrocytes. Whether there are actually
two different stem cell lineages that contribute progeny to primitive
and definitive lineages is still a matter of some debate, but recent
identification of an intraembryonic source of long term repopulating
hematopoietic cells suggests that this is likely (reviewed in Ref. 34).
Coordinate regulation of human ε and galago γ genes could be
achieved by the presence of silencers that act on both genes in
definitive cells or by the absence of primitive activators in
definitive cells. Alternatively, lineage specific changes in chromatin
structure may explain the coordinate silencing of these two genes. The
human γ gene but not the human ε or galago γ genes may contain
elements that allow it to be expressed in the progressively
heterochromatic environment of the definitive cell.
In order for the simian γ gene to complete the transition from an
exclusively embryonically expressed gene (the galago γ pattern) to a
primarily fetally expressed gene (the human γ pattern), a second
anthropoid-specific change is required: reduction of embryonic expression levels. This could have been accomplished by cis
alterations that created binding site(s) for anthropoid-specific
embryonic repressor(s) or by trans changes (loss of an
embryonic activator of the γ gene specifically in anthropoid
primates). Both scenarios imply that the trans environment
of the mouse yolk sac must differ from that of the human and other
anthropoid primates. Because γ globin gene expression has only been
studied in relatively few anthropoid primates, it is possible that
further analysis will reveal a species in which the γ gene is
expressed at high levels in both embryonic and fetal life.
The constructs described here should facilitate the identification of
the specific cis sequence change(s) that mediated fetal expression, information that will likely reveal the molecular mechanisms responsible for acquisition of this new temporal expression domain. Several possible mechanisms exist, and a few candidate cis elements have already been identified. First, nucleotide
changes could have resulted in the loss of fetal-specific repressor
binding site(s) in the ancestral simian γ gene; a region near the
proximal CCAAT box shows anthropoid-specific base changes that reduce
the binding of a complex of putative fetal repressor proteins (35). Second, base changes could have generated simian-specific activator motif(s); anthropoid-specific changes in the −1086 region alter a YY1
binding site that appears to be important for the activation of γ in
the fetal stage.² Third, the gain of a
binding site for a fetal stage selector protein (SSP; Ref. 36) may have
given the γ gene a competitive edge over the β gene in fetal life.
In the −50 region of the human γ promoter, several
anthropoid-specific nucleotides comprise a binding site for SSP; the
SSP site is absent in the galago γ gene (36). Finally, fetal
expression could have arisen via acquisition of a new interaction
between the γ promoter and the LCR that is stable in the fetal stage.
In this regard, it is of interest that in HS3-ε-galγ lines,
ε > γ and the two genes are coordinately silenced, but in
HS3-ε-humγ lines, γ > ε and silencing is not
coordinate. Establishment of a strong LCR contact that is stable in
fetal life would not only accomplish the fetal recruitment of γ but
could conceivably force a delay in the expression of β via
competitive mechanisms. Indeed, it has been demonstrated that the
galago β gene is activated in early fetal life, whereas human β
gene activation occurs at birth (5). Identification of the exact
cis sequences that mediated the different expression
patterns of the γ genes observed in this study is likely to further
our understanding of the molecular mechanisms that control the
evolution of novel stage-specific expression domains as well as the
regulation of hemoglobin switching.
We thank Drs. Sally Camper, Linda Samuelson, and Kevin McDonagh for comments and suggestions, J. Lloyd for the HS3 fragment, T. Ley for clones used as S1 probes, and the University of Michigan Transgenic Mouse Core for the production of transgenic lines.