(Received for publication, July 1, 1994; and in revised form, November 22, 1994)
Extensive analyses of homeobox gene expression and function during murine embryogenesis have demonstrated that homeobox gene products are key components in the establishment of pattern formation and regional identity during development. In this paper we report the molecular characterization and expression of a novel murine homeobox sequence, Hesx1, isolated from pluripotent embryonic stem cells. Hesx1 is expressed as two transcripts of 1.0 and 1.2 kilobases which encode an identical 185 amino acid open reading frame. The transcripts differ in the 3`-untranslated region due to the differential utilization of a weak splice donor site located immediately downstream of the translation termination codon. The Hesx1 homeodomain shared 80% identity with the Xenopus homeoprotein XANF-1 and was less than 50% related to other homeodomain sequences. Hesx1 and XANF-1 therefore constitute the founder members of a new homeodomain class. Hesx1 expression was down-regulated during embryonic stem cell differentiation and was detected in tissue-specific RNA samples derived from the embryonic liver, and at lower levels in viscera, amnion, and yolk sac. Expression in adult mice was not detected. These sites of expression are consistent with a role for Hesx1 in the regulation of developmental decisions in the early mouse embryo and during fetal hematopoiesis.
Homeobox genes encode a 60-amino-acid conserved DNA-binding domain, termed the homeodomain, which mediates specific interaction between homeodomain proteins and DNA. Specific residues within the homeodomain primary sequence are highly conserved and are critical for homeodomain stability and DNA interaction (reviewed in (1) ). Homeodomain sequences are divided into classes on the basis of additional sequence homology across the homeodomain(2, 3) . Homeobox genes within a class are usually greater than 70% homologous across the homeodomain, while sequence identity between classes is generally less than 50%.
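The class rule described above can be illustrated with a small calculation. The following is a minimal sketch, assuming two homeodomain sequences have already been aligned to equal length; the function names and the toy strings are hypothetical and are not taken from the paper, and the thresholds are simply the rule-of-thumb values quoted in the text.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of aligned positions at which two equal-length sequences match."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must be non-empty")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)


def class_relationship(identity: float) -> str:
    """Apply the approximate thresholds quoted in the text (>70% and <50%)."""
    if identity > 70.0:
        return "same class"
    if identity < 50.0:
        return "different class"
    return "ambiguous"


# Toy 10-residue strings for illustration only, NOT real homeodomain sequences.
toy_a = "RRGRTAFTSE"
toy_b = "RRGRTAFTAA"  # the first 8 of 10 positions match

identity = percent_identity(toy_a, toy_b)
print(identity, class_relationship(identity))  # 80.0 same class
```

For a full 60-residue homeodomain, 80% identity corresponds to 48 matching positions.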
Of the 60 or more murine homeobox genes which have been identified, the developmental role of the Hox genes is best understood. The 38 Hox genes, which belong to the Antennapedia sequence class, are arranged into four related clusters each containing about 10 genes which lie in the same transcriptional orientation(4) . Mutational analyses of Hox gene function during development, mainly using gene knockout strategies, have demonstrated that Hox genes coordinate pattern formation and specify regional identity in the paraxial mesoderm, hindbrain, and limbs (reviewed in (4) ). At the cellular level, it has been shown that overexpression of Hox 2.4 (Hoxb-7) in myeloid progenitor cells results in inhibition of differentiation(5) , supporting a role for homeobox genes in the specification of cellular identity. The remaining murine homeobox genes belong to a variety of homeodomain classes and are not located within genomic clusters. In most cases, a developmental role for these homeobox genes is implied by their spatially restricted and transient expression during embryogenesis(6, 7, 8, 9) . This has been supported by mutational analysis(10) .
Relatively little is known about expression and developmental function of homeobox genes in the pluripotent cells of the early murine embryo. Homeobox genes expressed by these cells are of particular interest because they are likely to regulate the complex developmental behavior of these cells. Cellular decisions such as proliferation, differentiation, and lineage specification must be coordinated precisely through the appropriate action of specific transcription factors during embryogenesis. Homeobox genes expressed by pluripotent cells may also provide molecular markers for the recognition of these cells in the murine embryo and in other mammalian species.
Embryonic stem (ES) cells, which are derived from the pluripotent cells of the early murine embryo (11, 12), can be maintained in vitro in the undifferentiated state by culture in the presence of the cytokine DIA/LIF (13, 14, 15), or directed to differentiate along alternative pathways by withdrawal of DIA/LIF or by chemical induction (16). The ability of ES cells to reintegrate into the blastocyst and to contribute to all of the tissues in the developing embryo indicates that ES cells cultured in vitro retain pluripotence. In a previous study of homeobox gene expression in ES cells, we identified a novel homeobox sequence termed Hesx1 (formerly HES-1; 17). Here we report the isolation of Hesx1 cDNA and genomic clones and analyze the expression of this gene during ES cell differentiation and murine development.
Figure 1: Nucleotide sequence of the Hesx1a (A) and Hesx1b cDNA clones. Nucleotide sequence common to the Hesx1a and Hesx1b clones spans nucleotide positions -65 to 560 and encodes a 185 amino acid open reading frame from position 1 to 558. The additional 13 nucleotides at the 5` end of the Hesx1b clone which contain an in frame stop codon (underlined) are shown in bold. Nucleotide sequence within the boxed region corresponds to the homeobox. Positions of introns within the genomic clone are indicated by arrows. The 3` unique sequences of the Hesx1b clone, which correspond to nucleotides downstream of position 560 in Hesx1a, are shown in B. The 12 consecutive CA repeats in the Hesx1b transcript are indicated by a bold underline. The Hesx1a and Hesx1b cDNA clones were sequenced in both directions.
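The coordinates in the Figure 1 legend can be cross-checked with simple codon arithmetic. This is a minimal sketch, assuming that position 1 is the first base of the initiation codon and that the quoted span 1 to 558 includes the termination codon; the helper names are hypothetical. The residue range 108-167 is the homeodomain position given in the text.

```python
CODON_LEN = 3  # nucleotides per codon


def orf_span_nt(n_residues: int, include_stop: bool = True) -> int:
    """Length in nucleotides of an ORF encoding n_residues amino acids."""
    codons = n_residues + (1 if include_stop else 0)
    return codons * CODON_LEN


def residue_to_nt_range(first_res: int, last_res: int) -> tuple[int, int]:
    """1-based inclusive nucleotide range covering residues first_res..last_res."""
    start = (first_res - 1) * CODON_LEN + 1
    end = last_res * CODON_LEN
    return start, end


print(orf_span_nt(185))               # 558, matching the span 1 to 558
print(residue_to_nt_range(108, 167))  # (322, 501), a 180-nucleotide homeobox
```

Under these assumptions, 185 codons plus one stop codon account for all 558 nucleotides, and the 60-residue homeodomain maps to a 180-nucleotide homeobox.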
The generation of radioactive riboprobes for RNase protection and Northern blot analysis was performed using the method of Krieg and Melton (24). RNase protection and Northern blot probe transcription reactions contained 240 µCi of [α-32P]rUTP and 60 µCi of [α-32P]rUTP (Bresatec Ltd.), respectively. Unincorporated radioactive label was separated from reaction products by loading transcription products onto a Sephadex G-50 column and centrifugation at 3,200 revolutions/min for 4 min. RNase protection reactions were carried out using the protocol described by Krieg and Melton (24) except that 100,000 counts/min of single-stranded probe was added to each RNA sample. RNase digestion products were separated on a 7 M urea, 5% polyacrylamide gel and visualized using autoradiography by exposure to Konica medical grade x-ray film with intensifying screens at -80 °C for 4-10 days, or PhosphorImager analysis (Molecular Dynamics, ImageQuant software package). For Northern blot analysis, poly(A)+ RNA was isolated from MBL-5 ES cell cytoplasmic RNA using oligo(dT) cellulose beads (Sigma). Approximately 10 µg of poly(A)+ RNA was electrophoresed on a 1.3% agarose gel containing 1 × MOPS buffer (23 mM MOPS, pH 7.0, 50 mM sodium acetate, 10 mM EDTA) and 1.1% formaldehyde at 6 V/cm gel length in 1 × MOPS buffer. After electrophoresis RNA was transferred to Nytran membrane (Schleicher & Schuell) by capillary blotting and the RNA immobilized by UV cross-linking and baking at 80 °C in vacuo for 2 h. Filters were prehybridized in a solution containing 5 × SSC, 60% formamide, 0.1% bovine serum albumin, 0.1% Ficoll 400, 0.1% polyvinylpyrrolidone, 20 mM sodium phosphate, pH 6.8, 1% SDS, 100 µg/ml sonicated salmon sperm DNA (Sigma), and 100 µg/ml denatured tRNA (Sigma) for 4-24 h at 65 °C. Filters were probed for 16 h at 68 °C and washed in 2 × SSC, 1% SDS at 50 °C for 3 × 15 min and then in 0.2 × SSC, 1% SDS at 75 °C for 1 h. Northern blot filters were exposed to Konica medical grade x-ray film with intensifying screens at -80 °C for 7 days.
Figure 2: The Hesx1a and Hesx1b cDNAs are derived from the 1.0- and 1.2-kb Hesx1 transcripts, respectively. A, Northern blot analysis of approximately 10 µg of poly(A)+ MBL-5 ES cell RNA probed with an antisense Hesx1 riboprobe derived from the common Hesx1a and Hesx1b sequences. B and C, RNase protection analysis of 20 µg of undifferentiated MBL-5 ES cell RNA was carried out using antisense riboprobes spanning positions 26-628 and 26-770 of the Hesx1a (B) and Hesx1b (C) cDNAs, respectively. Hesx1 bands derived from a common transcript are connected by dashed lines. D, molecular origin of the Hesx1a and Hesx1b riboprobes. Wide boxes represent Hesx1 transcripts, and corresponding riboprobes are shown as narrow boxes above. 3` unique sequences are indicated by diagonal shading.
The relative sizes of the Hesx1a and Hesx1b clones suggested that they were derived from the 1.0- and 1.2-kb transcripts, respectively. This was confirmed by comparison of RNase protection products generated by antisense riboprobes derived from the 3` ends of the two cDNAs (Fig. 2). The most abundant Hesx1 transcript expressed by undifferentiated ES cells was the 1.0-kb transcript (Fig. 2A). RNase protection using an antisense riboprobe spanning positions 26-628 of the Hesx1a clone (Fig. 2, B and D) generated a more intense 602 bp band (resulting from protection across the entire probe), and a less intense 532 bp band (resulting from protection across the common cDNA sequence). The greater relative intensity of the 602 bp band indicated that the Hesx1a clone was derived from the more abundant 1.0-kb transcript. The reciprocal experiment, using a Hesx1b antisense probe (Fig. 2D), supported this conclusion (Fig. 2C). The Hesx1a and Hesx1b cDNAs therefore correspond to the 1.0- and 1.2-kb Hesx1 transcripts, respectively.
An identical 185 amino acid open reading frame, containing the Hesx1 homeodomain (amino acids 108-167), was identified in the Hesx1a and Hesx1b nucleotide sequences (boxed in Fig. 1A). The identity of the initiation codon for this open reading frame was confirmed by the presence of an upstream in-frame stop codon at position 7 in the Hesx1b transcript, underlined in Fig. 1A. This termination codon is thought to be common to both Hesx1 transcripts although the 5` end of the Hesx1a transcript was not identified in this analysis. The termination codon for the Hesx1 open reading frame was located immediately upstream of the sequence divergence between the Hesx1 transcripts. The alternative transcripts therefore encode identical proteins and differ only in the 3`-untranslated region. A (CA)n repeat sequence containing 12 direct dinucleotide repeats was located within the unique 3`-untranslated region of the Hesx1b sequence (Fig. 1B, positions 956-979 (bold underline)). (CA)n repeat sequences in which n > 10 are usually polymorphic with respect to the number of repeats and have been used to generate a genetic map of the murine genome on the basis of this variation in repeat sequence length (27).
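A run of dinucleotide repeats of this kind can be located in a nucleotide sequence with a regular expression. The sketch below is illustrative only: the function name and the toy sequence are hypothetical, and the default threshold of 11 repeats is an assumption chosen to mirror the n > 10 criterion mentioned above.

```python
import re


def find_dinucleotide_repeats(seq: str, unit: str = "CA", min_repeats: int = 11):
    """Return (start, end, n_repeats) for each run of `unit` repeated at least
    `min_repeats` times; coordinates are 1-based and inclusive."""
    # Builds a pattern such as (?:CA){11,}
    pattern = re.compile(f"(?:{re.escape(unit)}){{{min_repeats},}}")
    hits = []
    for m in pattern.finditer(seq.upper()):
        n_repeats = (m.end() - m.start()) // len(unit)
        hits.append((m.start() + 1, m.end(), n_repeats))
    return hits


# Toy sequence for illustration only, NOT the Hesx1b 3`-untranslated region.
toy = "GGT" + "CA" * 12 + "TTG"
print(find_dinucleotide_repeats(toy))  # [(4, 27, 12)]
```

Twelve CA units occupy 24 nucleotides, which agrees with the span of positions 956-979 reported for the Hesx1b repeat.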
Sequence comparison showed that the Hesx1 homeodomain shares 80% identity with the Xenopus homeobox gene XANF-1 (28) but is less than 50% related to other homeodomain sequences (Fig. 3A). This identifies Hesx1 and XANF-1 as founding members of a novel homeodomain class, as has previously been suggested from analysis of partial homeodomain sequences (17). Comparison between the entire Hesx1 and partial XANF-1 open reading frames revealed additional structural similarities. Hesx1 and XANF-1 are identical at the 8 upstream and 6 downstream amino acids which flank the homeodomain, and both terminate 15 amino acids downstream of the homeodomain (Fig. 3B). Apart from the sequence identity within and flanking the homeodomain, a stretch of 34 amino acid residues at the amino terminus of the Hesx1 open reading frame shares 56% homology with XANF-1 (Fig. 3B). Little homology was detected in the remaining regions of the Hesx1 and XANF-1 primary sequences. The similarities in primary structure suggest that the Hesx1 and XANF-1 genes may have evolved from a common ancestral gene.
Figure 3: Sequence comparison of the Hesx1 homeodomain. A, the positions of the three α-helices and homeodomain consensus sequence (2) are shown at the top of the figure. Consensus Hesx1 residues are shown in bold. Amino acid residues which are conserved in the Hesx1, XANF-1 (28) and Antennapedia (29, 30) homeodomains are indicated by dashes. The percentage homology with Hesx1 is shown on the right. B, comparison of the Hesx1 and XANF-1 open reading frames. Positions of identity are indicated by vertical lines, and homeodomain sequences are shown in boldface. Numbers on the right refer to amino acid positions in the Hesx1 (Fig. 1) and XANF-1 open reading frames.
Figure 4: Hesx1 genomic Southern blot. Ten µg of murine genomic DNA isolated from MBL-5 ES cells was digested with PstI/BamHI (P/B) and EcoRI/BamHI (E/B) and probed with the Hesx1 partial homeodomain sequence(17) .
Figure 5: A, Hesx1 genomic restriction map and intron/exon structure. The restriction map was deduced from Southern blot analysis of murine genomic DNA and the 16-kb Hesx1 genomic clone. Genomic regions which contained exons were subcloned and sequenced and are indicated by arrows. Boxed regions represent Hesx1 exons. The 5` boundary of exon I has not been identified and is represented by the dashed line. The length of each intron is indicated in italics in parentheses. Light shading represents the Hesx1 open reading frame and darker shading the homeobox region. The alternative splice site, which is located within exon IV, is indicated by an asterisk. The exact positions of intron/exon boundaries are shown in Fig. 1. H = HindIII, E = EcoRI, P = PstI. B, derivation of the 1.0-kb (Hesx1a) and 1.2-kb (Hesx1b) transcripts. The 1.2-kb (Hesx1b) transcript results from splicing between the weak splice donor site (asterisk) in exon IV to the alternative downstream exon V. The 1.0-kb (Hesx1a) transcript terminates within exon IV.
To allow further characterization of the Hesx1 locus, and to identify the genomic origin of the Hesx1 transcripts, a BALB/C murine genomic library was screened and a 16-kb Hesx1 genomic clone was isolated. The genomic structure of the Hesx1 locus deduced by comparison of the Hesx1 cDNA and genomic sequences is shown in Fig. 5A. Three introns common to both the Hesx1a and Hesx1b transcripts were identified within the Hesx1 genomic sequence. Each of these was surrounded by conserved splice acceptor and splice donor sequences (Table 1) as defined by Mount(31) . Two of these introns were located within the homeobox itself. Although the presence of a single intron within a homeobox is not uncommon, few examples of two introns in this region have been reported, and Hesx1 is the first mammalian homeobox gene for which this has been described.
The molecular origin of the alternative Hesx1 transcripts could also be deduced from the genomic sequence. The 3` end of the Hesx1a transcript was contiguous with the Hesx1 open reading frame, while production of Hesx1b required a splicing event between a poor consensus donor splice site (Fig. 5C, Table 1) located at the point of Hesx1a/Hesx1b divergence within exon IV, and an alternative exon V (Fig. 5). Thus, the alternative Hesx1 transcripts are derived from differential utilization of this weak donor splice site. The biological roles of the alternative transcripts, which differ only in the 3`-untranslated regions, are unknown. Southern blot and sequence analysis indicated that exon V was located 663 bp downstream of the alternative splice site and that the entire 3` end of the Hesx1b cDNA was contained within this exon, although the 3` end of this transcript has not yet been identified. The 5` end of the Hesx1 transcripts has not been determined.
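What is meant by a poor consensus donor site can be made concrete by counting matches to the 5` splice site consensus, (C/A)AG|GT(A/G)AGT. The following is a rough sketch under stated assumptions: the scoring is a simple match count rather than the weighted analysis of Mount (31), the function name is hypothetical, and both example sites are invented for illustration and are not the sequences listed in Table 1.

```python
# IUPAC codes needed for the donor consensus: M = A or C, R = A or G.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "M": "AC", "R": "AG"}

# Three exonic bases followed by six intronic bases: (C/A)AG | GT(A/G)AGT
DONOR_CONSENSUS = "MAGGTRAGT"


def donor_matches(site: str) -> int:
    """Count positions of a 9-nucleotide candidate site that fit the consensus."""
    site = site.upper()
    if len(site) != len(DONOR_CONSENSUS):
        raise ValueError("site must be 9 nucleotides (3 exonic + 6 intronic)")
    return sum(base in IUPAC[code] for base, code in zip(site, DONOR_CONSENSUS))


# Hypothetical sites for illustration only.
print(donor_matches("CAGGTAAGT"))  # 9: matches the consensus at every position
print(donor_matches("AAGGTTCTA"))  # 5: keeps the invariant GT but little else
```

A site that retains the invariant GT yet matches few of the flanking positions would be expected to be used inefficiently, which is consistent with the coexistence of spliced and unspliced transcripts described above.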
RNase protection on uninduced and induced ES cell populations (Fig. 6) was carried out using the Hesx1a riboprobe derived from the pHESa construct (see "Materials and Methods"). Hesx1 was expressed at highest levels in uninduced ES cells and was down-regulated during ES cell differentiation. Terminal differentiation of ES cells with retinoic acid resulted in massive down-regulation of Hesx1 expression, while ES cell differentiation induced by Me2SO or 3-methoxybenzamide, or by withdrawal of DIA/LIF, caused a reduction in Hesx1 expression. Residual Hesx1 expression in these cultures is thought to result from incomplete differentiation and the consequent persistence of stem cells within the differentiated population. Some variation in the profile of Hesx1 transcription during ES cell differentiation has been observed (data not shown) which may be due to variable differentiation within different cultures. In each case, however, there is a down-regulation of Hesx1 transcription with cell differentiation. The ratio of the 1.2- and 1.0-kb Hesx1 transcripts was the same in the uninduced and the spontaneous, 3-methoxybenzamide- and Me2SO-induced ES cells, with the 1.0-kb transcript the more abundant mRNA species.
Figure 6: Hesx1 expression in undifferentiated and differentiated ES cells. RNase protection analysis of Hesx1 transcripts expressed by undifferentiated and differentiated ES cells. Protections were carried out on 20 µg of total RNA using the Hesx1a antisense riboprobe (Fig. 2) and a murine glyceraldehyde-3-phosphate dehydrogenase (mGAP) loading control. ES, undifferentiated ES cells; Sp, differentiation induced by LIF withdrawal; low RA and high RA, induction with 2 and 10 µM retinoic acid, respectively; MB, induction with 3-methoxybenzamide; DM, induction with dimethyl sulfoxide.
Figure 7: Hesx1 expression in embryonic and adult tissues. A, embryonic samples were isolated from day 14.5 post coitum MF1 (outbred albino) embryos. Ca, calvaria; Vi, viscera; Am, amnion; Gu, gut; Lu, lung; Li, liver; YS, yolk sac; He, heart; Br, brain. ES, MBL-5 ES cells; A Li, adult liver. Each reaction contained 10 µg of RNA. B, total embryonic RNA was isolated from day 10.5, 12.5, and 16 post coitum CBA embryos. Embryonic samples from day 16 post coitum embryos are abbreviated as follows: Liv, liver; Ki, kidney; He, heart; Lu, lung; Lim, limbs; In, intestine; Br, brain; and Sk, skin. ES, MBL-5 ES cells. Each protection reaction contained 20 µg of RNA. RNase protections were carried out as described in Fig. 6. Hesx1a* indicates the Hesx1a transcript from the polymorphic allele (see text).
In all embryonic tissues in which Hesx1 expression was detected, the lower 532 bp band corresponding to the Hesx1b transcript was absent and a smaller protected product of approximately 490 bp (Hesx1a*) was detected. Expression of the 490 bp band results from a Hesx1 sequence polymorphism in MF1 and CBA strain mice which is not present in the inbred C129 strain mice from which the ES cells are derived. The absence of a 532 bp band in these samples showed that the 1.2-kb transcript (Hesx1b) was not expressed, indicating that expression of the two Hesx1 transcripts is regulated differentially in different cell types.
Analysis of the murine homeobox gene Hesx1 revealed that the gene is expressed as two distinct transcripts of 1.0 and 1.2 kb which can be visualized by Northern blot. cDNA clones corresponding to these transcripts, and a 16-kb Hesx1 genomic clone, were characterized, and the molecular difference between the Hesx1 transcripts was shown to arise from differential usage of a poor consensus splice site located immediately downstream of the open reading frame termination codon. The alternative regulation of these transcripts in murine ES cells and embryonic tissues suggests that there may be alternative biological roles for the different transcripts which encode identical proteins. It is interesting that we were only able to detect expression of the 1.2-kb Hesx1 transcript in undifferentiated ES cells and not in middle to late stages of murine embryogenesis.
Comparison between the Hesx1 and partial XANF-1 open reading frames revealed 47% homology across the entire Hesx1 open reading frame including 80% identity within the homeodomain, suggesting that these genes may be sequence homologues. The regionalized homology between Hesx1 and XANF-1 may reflect the evolutionary conservation of protein sequences which have a functional role, in particular DNA binding activity within the homeodomain. Murine and Xenopus homologues of the goosecoid (6, 33) and Nkx-2.5 (34, 35) homeobox genes have been isolated which have 78 and 62% identity, respectively. Evidence based on gene expression and function suggests that the goosecoid and Nkx-2.5 homologues fulfill similar roles in these different species during development. There are insufficient experimental data regarding Hesx1 and XANF-1 expression and function to determine whether these related genes might play similar developmental roles during murine and Xenopus embryogenesis.
The presence of introns within the homeobox is a feature of many homeobox genes although it is interesting that introns are not found in the numerous homeobox sequences belonging to the Antennapedia class. The Hesx1 genomic structure is unusual in that it contains two introns within the homeobox. This is a relatively rare arrangement which has not been previously described in vertebrates. The position of the first intron is shared with the homeobox gene mec1, while the location of the downstream intron, within helix 3 of the homeobox, is shared with a variety of homeobox genes(36) . The regular occurrence of introns within the homeobox in a variety of species and genes, and the lack of conservation of the intron position, appear to conflict with the domain theory of intron-exon organization(37) .
Analysis of Hesx1 levels in embryonic tissues indicated that Hesx1 was expressed at 14.5 and 16 days post coitum and at highest levels in the embryonic liver. During embryogenesis the liver first becomes apparent around 11.5-12 days post coitum, and replaces the yolk sac as the principal site of hematopoiesis with concomitant establishment of the definitive red blood cell lineage (38). The hematopoietic function of the liver continues until about 18 days post coitum, when the spleen and subsequently the bone marrow take over this role. The restricted Hesx1 expression in embryonic liver, coupled with the absence of Hesx1 expression in adult mouse liver, which does not have a hematopoietic function, suggests that Hesx1 might be expressed by cells of the hematopoietic lineages during embryogenesis. The levels of Hesx1 expression within the day 16.5 post coitum embryonic liver were similar to the levels expressed by undifferentiated ES cells in vitro. This could reflect generalized expression in all liver cells or elevated Hesx1 expression levels in a subpopulation of embryonic liver cells such as specific hematopoietic lineages. It will therefore be of interest to establish the expression pattern and function of Hesx1 during hematopoiesis. The lower levels of Hesx1 expression detected in the amnion and yolk sac may be due to the presence of hematopoietic stem cells which are believed to migrate from the wall of the yolk sac to the liver to establish hematopoietic foci (38). Hesx1 expression was also detected at low levels in the viscera, which is composed of the embryonic tissues that remain after removal of the major organs. The detection of Hesx1 expression in the viscera may be due to the presence of hematopoietic cells within these remaining tissues or may indicate additional sites of Hesx1 expression in the embryo.
The expression pattern of Hesx1 in pre-implantation and early post-implantation murine embryos was not analyzed in this study. However, RNase protection analysis showed that Hesx1 was expressed at highest levels in undifferentiated ES cells and was down-regulated during ES cell differentiation. This is a relatively unusual expression pattern for a homeobox gene during ES cell differentiation in vitro since most homeobox genes that have been identified are either up-regulated during stem cell differentiation or unaffected by differentiation (39, 40). This presumably reflects the isolation of these genes from terminally differentiated cell types or post-implantation embryos. The expression pattern of Hesx1, and related expression patterns such as Oct-4 (41), Oct-6 (42), Hex (43), Hesx4 and MmoxB (44), indicates the existence of a distinct group of homeobox genes that are down-regulated during pluripotent stem cell differentiation in vitro. These genes may play important roles in the early decisions of stem cell renewal, differentiation, and lineage determination in the early embryo.
The restricted expression of Hesx1, and the established role of homeobox genes in the determination of regional identity, indicate that this gene may be of importance in murine embryogenesis through the regulation of developmental decisions during early embryogenesis and fetal hematopoiesis. Resolution of the biological action of Hesx1 awaits detailed analysis of the sites of Hesx1 expression in the embryo and functional genetic analyses in the mouse.
The nucleotide sequence reported in this paper has been submitted to the GenBank(TM)/EMBL Data Bank with accession number X80040.