Department of Cell Biology, The Scripps Research Institute, La Jolla, California 92037
We investigated the requirements for targeting the centromeric histone H3 homologue CENP-A for assembly at centromeres in human cells by transfection of epitope-tagged CENP-A derivatives into HeLa cells. Centromeric targeting is driven solely by the conserved histone fold domain of CENP-A. Using the crystal structure of histone H3 as a guide, a series of CENP-A/histone H3 chimeras was constructed to test the role of discrete structural elements of the histone fold domain. Three elements were identified that are necessary for efficient targeting to centromeres. Two correspond to contact sites between histone H3 and nucleosomal DNA. The third maps to a homotypic H3-H3 interaction site important for assembly of the (H3/H4)₂ heterotetramer. Immunoprecipitation confirms that CENP-A self-associates in vivo. In addition, targeting requires that CENP-A expression is uncoupled from histone H3 synthesis during S phase. CENP-A mRNA accumulates later in the cell cycle than histone H3, peaking in G2. Isolation of the gene for human CENP-A revealed a regulatory motif in the promoter region that directs the late S/G2 expression of other cell cycle-dependent transcripts such as cdc2, cdc25C, and cyclin A. Our data suggest a mechanism for molecular recognition of centromeric DNA at the nucleosomal level mediated by a cooperative series of differentiated CENP-A-DNA contact sites arrayed across the surface of a CENP-A nucleosome and a distinctive assembly pathway occurring late in the cell cycle.
The accurate transmission of replicated eukaryotic chromosomes is mediated by centromeres. Structurally distinct loci present once per chromosome, centromeres provide the essential functions of chromosome segregation. These include specifying the assembly of the kinetochore, a microtubule-dependent motor complex at the surface of the chromosome, and the maintenance of sister chromatid cohesion until their separation at the onset of anaphase (Bloom, 1993; Miyazaki and Orr-Weaver, 1994; Pluta et al., 1995). In addition to these primarily mechanical functions, centromeres act as important regulators of mitosis and meiosis through a mechanism that monitors attachment of chromosomes to the spindle and reports to a spindle assembly checkpoint that regulates progression into anaphase (McIntosh, 1991; Li and Nicklas, 1995; Nicklas et al., 1995; Rieder et al., 1995). Understanding how these functions are specified at a molecular level begins with identification of the molecular recognition events that initiate centromere assembly on the chromosome.
By elegant molecular genetic approaches, it has been possible to identify discrete cis-acting DNA sequences from Saccharomyces cerevisiae (Clarke and Carbon, 1980; Hieter et al., 1985) and Schizosaccharomyces pombe (Hahnenberger et al., 1989; Niwa et al., 1989) that are sufficient to establish centromere function on artificial chromosomes. Dissection of these sequences has revealed that centromere function is established both at the primary structural level of DNA sequence and at higher levels of DNA structure within chromatin. DNA sequence recognition is driven by sequence-specific DNA-protein interactions, exemplified by the essential CDE III element of the S. cerevisiae centromere; single point mutations in this 25-bp DNA sequence can completely abolish centromere activity (McGrew et al., 1986; Hegemann et al., 1988). CDE III plays a primary role in kinetochore assembly on the S. cerevisiae centromere by binding to a 240-kD multiprotein complex, CBF3, that mediates the association of a microtubule-dependent motor activity with the chromosome (Lechner and Carbon, 1991; Hyman et al., 1992; Middleton and Carbon, 1994). Cbf3b, a 60-kD subunit of CBF3, is an essential zinc finger protein that is thought to provide the DNA binding function of CBF3 (Lechner, 1994). Other examples of centromere proteins that directly recognize DNA sequence are the yeast helix-loop-helix protein CBF1 (Cai and Davis, 1990) and the mammalian protein CENP-B, which recognizes a discrete sequence element found in centromeric satellite DNA (Earnshaw et al., 1987; Masumoto et al., 1989; Sullivan and Glass, 1991). Thus, molecular recognition of centromeric loci occurs, at least in part, through direct DNA sequence recognition by proteins, interactions similar to the familiar DNA binding activities observed for transcription factors (Mitchell and Tjian, 1989; Harrison, 1991).
Centromere function is also established through essential interactions that take place at the level of DNA structure within chromatin. From the earliest cytological observations of centromeres as the primary constriction of mitotic chromosomes, it has been understood that centromeres are packaged distinctly as constitutive heterochromatin. In the point centromeres of budding yeast, 150-200 bp of cen DNA sequences are packaged in a core particle flanked on both sides by arrays of highly phased nucleosomes (Bloom and Carbon, 1982), and this specialized chromatin structure is necessary for centromere function (Saunders et al., 1988; Bloom et al., 1989). Sequence element CDE II, which comprises a 78-86-bp AT-rich segment conserved in composition but not in sequence among yeast centromeres, appears to adopt a uniquely folded conformation that plays an important role in providing complete centromere function (Sorger et al., 1994; Tal et al., 1994; Sears et al., 1995). The more complex centromeres of fission yeast exhibit a different type of chromatin structure, with several kilobases of DNA in the central core domain packaged in a highly irregular nucleosomal array that is assembled only in conjunction with functional centromere sequences in S. pombe (Polizzi and Clarke, 1991). The dependence of this structure on sequences distal to the central core DNA suggests that large scale folding of the centromere locus is required for the segregation function (Polizzi and Clarke, 1991; Marschall and Clarke, 1995).
Understanding what constitutes a functional centromere sequence in animal cells has been confounded by their large size, ranging from 500 to 5,000 kb in human chromosomes (Tyler-Smith and Willard, 1993). Nevertheless, it has been possible to map a Drosophila centromere to a 420-kb segment, revealing that both simple satellite sequences and islands of complex sequence are required for complete centromere function (Le et al., 1995; Murphy and Karpen, 1995). In mammalian cells, centromere function is also associated with large blocks of heterochromatin composed of highly repetitive satellite DNA typified by the α satellite of primate chromosomes: extensive tandemly repeated arrays of a 171-bp monomer sequence (Willard, 1991). One of the abiding mysteries of animal centromeres, however, is the lack of sequence conservation of centromeric satellite DNA (Beridze, 1982). With the exception of a small sequence that functions as the binding site for CENP-B, the CENP-B box (Masumoto et al., 1989), no homology is seen in satellite DNA across different classes of vertebrates, and, indeed, satellite DNA is one of the most rapidly evolving compartments of the genome in vertebrates. An important role for CENP-B and the CENP-B box in centromere function is in doubt, however, since its presence is not correlated with centromere function (Earnshaw et al., 1989). Two hypotheses have been suggested to explain this lack of conservation in centromere sequences: either animal cell centromere DNA contains small, as yet unidentified sequence elements similar to yeast centromeres that possess kinetochore-nucleating capabilities, or centromere function is not specified directly by DNA sequence, but rather by higher order DNA or chromatin structure.
One protein situated to play a role in specifying the properties of centromeric chromatin is CENP-A, a centromere-specific homologue of the core nucleosomal protein histone H3 (Palmer et al., 1991; Sullivan et al., 1994). CENP-A was originally identified as a centromere-specific autoantigen that copurified with nucleosomal core particles (Earnshaw and Rothfield, 1985; Palmer and Margolis, 1985; Palmer et al., 1987). A potentially homologous protein has recently been identified in yeast as the product of a gene, CSE4, that is essential for mitotic chromosome segregation (Stoler et al., 1995). Together, CENP-A, CSE4, and histone H3 form a roughly equidistant triangle of homologous proteins linked at the level of ~60% sequence identity, limited to a COOH-terminal domain of ~90 amino acids (Sullivan et al., 1994; Stoler et al., 1995). This region corresponds to the domain in histone H3 that is sufficient for nucleosome assembly in vitro (for review see van Holde, 1989) and in vivo (Mann and Grunstein, 1992), and that is part of the highly ordered core of the histone octamer (Arents et al., 1991). Surprisingly, this conserved histone fold domain of CENP-A is required for targeting to human centromeres, rather than the unique sequences of the NH2 terminus (Sullivan et al., 1994).
In this work we have dissected the molecular features of CENP-A that are required for its assembly at human centromeres. By systematic replacement of structural elements of the CENP-A histone fold domain with the corresponding sequences of histone H3, we have identified three regions of the molecule that are required for targeting CENP-A to centromeres. These correspond to two nucleosomal DNA contact sites of histone H3 and a region that mediates self-association between the two copies of histone H3 within the nucleosome. In addition to these structural features, we show that CENP-A expression is uncoupled from normal histone H3 expression, occurring later in the cell cycle, and that this synthetic timing is important for appropriate targeting of CENP-A. Taken together, these data suggest a mechanism for specific molecular recognition of centromeric DNA at the level of the nucleosome.
Materials and Methods
Cell Culture and Transfection
HeLa (ATCC CCL3) and tTA-HeLa cells (Gossen and Bujard, 1992) were maintained in DME with 10% FCS (GIBCO BRL, Gaithersburg, MD) at 37°C in a 5% CO2 atmosphere. tTA-HeLa cultures were supplemented with 400 µg/ml G418. Stably transformed tTA cell lines (see below) were cultured in the presence of 400 µg/ml G418, 330 ng/ml puromycin, and 1 µg/ml tetracycline. For immunofluorescence experiments, cells were plated on glass coverslips at a density of 2-2.5 × 10⁴ cells per cm² the night before transfection. Transfection was performed in serum-free medium using Lipofectamine (GIBCO BRL) as previously described (Sullivan et al., 1994).
To establish a stable, inducible cell line expressing CENP-A-HA1, the CENP-A insert from pcDL-CAepi was subcloned into plasmid pUHD10.3 (Gossen and Bujard, 1992), forming plasmid pUHD10.3-CAepi. tTA-HeLa cells expressing the tetracycline transactivator were cotransfected with pUHD10.3-CAepi and pBS-PAC, a puromycin resistance marker (de la Luna et al., 1988), at a 10:1 ratio using Lipofectamine. Transformants were selected using 330 ng/ml puromycin in the presence of 1 µg/ml tetracycline, and individual clones were assayed by induction in tetracycline-free medium followed by Western blot analysis.
DNA Constructs
Construction of Segmental Mutants.
General methods were essentially as described by Ausubel et al. (1995) unless specified. Mutations were constructed in plasmid pcDL-CAepi, which is identical to pcDL-CAHA1 (Sullivan et al., 1994) except that three copies of the hemagglutinin (HA) epitope are present at the COOH terminus of the coding region. Histone H3 sequences were obtained from plasmid pMH3.2-614 (Taylor et al., 1986). Segments of CENP-A were replaced with the corresponding H3 sequence using a bimolecular recombinant PCR strategy. A pair of standard 5′ and 3′ oligonucleotide primers (GIBCO BRL) flanking the CENP-A coding region of pcDL-CAepi was prepared and used in all experiments. For each mutant, two divergent overlapping primers were constructed, each containing at least 12-15 bp of CENP-A sequence at its 3′ end and a segment encoding the desired mutations at its 5′ end. Each mutagenic oligonucleotide pair was designed with a 15-17-bp overlap. PCR reactions were performed using each mutagenic primer in conjunction with the appropriate flanking primers, synthesizing two DNA fragments that overlapped by 15-17 bp within the mutated region. PCR reactions (95°C × 90 s; 20 × [95°C × 30 s, 55°C × 60 s, 72°C × 90 s]; 72°C × 10 min) were performed in 50 µl using 5 µg/ml pcDL-CAepi, 1.5 mM Mg, 1 µM each primer, 100 mM dNTPs (Pharmacia Fine Chemicals, Piscataway, NJ), 10% DMSO, and 1.25 U of an 8:1 unit ratio mixture of Taq DNA polymerase (Promega, Madison, WI) and Pfu DNA polymerase (Stratagene, La Jolla, CA) made fresh for each experiment. PCR products were purified via QX Matrix (Qiagen, Chatsworth, CA), combined, and used as template for a second round of PCR using only the standard primers, performed as above except with 10 amplification cycles. Full-length recombinant PCR products were cloned into plasmid pCRII (Invitrogen, San Diego, CA). The sequence of the entire coding region was verified (Sequenase 2.0; United States Biochemical Corp., Cleveland, OH), and inserts from correct clones were isolated as NarI-SacI fragments and cloned into NarI- and SacI-digested pcDL-CAepi. Plasmids for transfections were prepared with Qiagen DNA purification columns.
W86 Mutants.
Codon 86 was randomized by the same bimolecular recombinant PCR strategy described above, using primers possessing the sequence NN(C/T) on the coding strand. Approximately 50 pCRII transformants were picked and colony sequenced with the CircumVent thermal cycle sequencing kit (New England Biolabs Inc., Beverly, MA) using a ³²P-end labeled primer that spanned nucleotides 275-293 in the CENP-A cDNA to determine the sequence of codon 86. Thirteen different mutants were recovered (Y, F, I, L, V, C, N, D, H, R, S, P, and G) and cloned into pcDL-CAepi as above. Constructs that failed to localize were sequenced completely to verify that loss of function was due to mutation at codon 86 (Sequenase 2.0; United States Biochemical Corp.).
Transfer of CENP-A Helix II to Histone H3.
A trimolecular recombinant PCR strategy was used to replace helix II codons 85-112 of histone H3 with the corresponding codons 87-114 of CENP-A. Three fragments were generated in the first round of PCR. Fragment 1, the 5′ fragment, containing codons 1-84 of histone H3, was amplified from plasmid pcDL-H3HA1 using the standard 5′ primer as above and a 30-mer primer at the 3′ end corresponding to codons 80-84 of histone H3 plus codons 87-91 of CENP-A (the insertion of two codons in CENP-A relative to histone H3 accounts for the difference in residue numbering). Fragment 2, the central fragment corresponding to helix II of CENP-A, was amplified from pcDL-CAHA1 using a 5′ primer complementary to the 3′ primer of fragment 1 and a primer at the 3′ end corresponding to the last five codons of CENP-A helix II and codons 113-117 of histone H3. Fragment 3, the 3′ fragment encoding the COOH-terminal portion of histone H3 and the HA-1 epitope, was amplified from pcDL-H3HA1 using a 5′ primer corresponding to the last five codons of CENP-A helix II and codons 113-117 of histone H3 and the standard 3′ primer, an oligo in the 3′ untranslated region of CENP-A. The three fragments were purified, and then combined in equimolar amounts to provide the template for a second round of PCR using the standard 5′ and 3′ primers; the product was isolated and subcloned for expression as described above. The complete coding region sequence of the resulting plasmid was verified by sequencing.
end and a single copy of the HA-1
epitope followed by an AflIII site at the 3′ end. The amplified product was
cloned into NcoI-AflIII-digested pMH3.2-614 and verified by DNA sequencing.
Immunofluorescence
For analysis of protein localization in transfected cells, immunofluorescence microscopy was performed 18-72 h after transfection, as described
previously (Sullivan et al., 1994). Endogenous centromere antigens were
visualized with a human anticentromere antiserum, hACA-M, detected
with a rhodamine-coupled secondary antibody, while HA epitope-tagged
proteins were visualized with mAb 12CA5 (a kind gift from Dr. Ian Wilson, The Scripps Research Institute, La Jolla, CA) and fluorescein-coupled secondary antibody.
Immunochemistry
Immunoblots were performed as described previously (Sullivan et al., 1994)
using human anticentromere serum hACA-M at a dilution of 1:2,000 and
mAb 12CA5 at a concentration of 5 µg/ml. Blots were developed using
HRP-coupled secondary antibodies (Amersham Corp., Arlington Heights,
IL) and a chemiluminescence detection reagent (Pierce Chemical Co.,
Rockford, IL).
For immunoprecipitation analysis, protein expression in a stable
pUHD10.3-CAepi transformant was induced for 3 d. Nuclei from 3-5 × 10⁷ cells were isolated according to Masumoto et al. (1989), washed in
buffer A (5 mM Hepes, pH 7.5, 10 µM leupeptin, 1.5 µM aprotinin, 1 mM
DTT), and centrifuged at 3,000 g. The nuclear pellet was resuspended in
500 µl digestion buffer (buffer A containing 200 U/ml micrococcal nuclease, 1 mM CaCl2) at a concentration of 0.5-1 × 10⁸/ml and incubated at
37°C for 5 min. Digestion was stopped by addition of EDTA to a final
concentration of 10 mM. After centrifugation at 8,000 g, the supernatant
was collected, and the pellet was resuspended in buffer A and subjected to
two additional rounds of extraction by sonication for 10 s followed by centrifugation and collection of the supernatants. Supernatants were pooled
in a siliconized Eppendorf tube, supplemented with 0.1% NP-40 and 25 µg
of mAb 12CA5, and mixed end over end for 2 h at 4°C. A 100-µl aliquot
of protein A-Sepharose (Pharmacia Fine Chemicals) previously equilibrated with buffer A was added and incubated for an additional 2 h at 4°C.
The beads were collected by centrifugation and the supernatant was
saved. Immunoprecipitates were washed five times with buffer A, and
then resuspended in SDS-PAGE sample buffer. Equivalent amounts of all
soluble fractions and one half of the immunoprecipitated proteins were
analyzed by Western blotting.
Isolation of a Human CENP-A Genomic Sequence
A human Caucasian male placental genomic DNA library prepared in
Lambda Fix II (Stratagene; a kind gift from Edward Chan, The Scripps
Research Institute) was screened by PCR (Israel, 1993) using CENP-A
primers that span a small intron. Two phage with overlapping inserts
spanning 20 kb of genomic DNA were isolated and characterized by restriction mapping using a series of probes derived from the CENP-A cDNA
(to be described in detail elsewhere). A 2,878-bp EcoRI fragment containing a 5′
flanking genomic sequence was isolated and sequenced by a combination of manual and automated methods (GenBank accession number
U82609). The 2.9-kb fragment was found to contain 1,101 bp upstream of
the start of our CENP-A cDNA clone, the first 250 bp of the cDNA and
1,527 bp of the first intron in CENP-A.
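As a quick consistency check on the fragment composition reported above, the three segment lengths can be summed and compared with the stated size of the EcoRI fragment. This is only an arithmetic sketch; the dictionary keys are illustrative labels, not names used in the original work.

```python
# Segment lengths (bp) as reported for the sequenced EcoRI fragment.
# The key names are illustrative labels chosen for this sketch.
segments_bp = {
    "upstream_of_cdna_start": 1101,
    "cdna_5prime_portion": 250,
    "first_intron_portion": 1527,
}

reported_fragment_bp = 2878

total_bp = sum(segments_bp.values())
print(f"sum of segments: {total_bp} bp; reported fragment: {reported_fragment_bp} bp")
```

The sum matches the reported 2,878 bp, which is what one would expect if the three segments are contiguous and non-overlapping within the fragment.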
Cell Cycle Analysis
HeLa cells were grown in 10-cm dishes to ~60% confluence. The first
block was initiated by replacing medium with complete DME containing
2 mM thymidine. After 15 h, cells were released by washing twice with
dPBS and adding normal complete DME, and were allowed to grow for 9 h.
Cells were blocked a second time for 15 h as above. After release as
above, samples were collected at 2 h intervals for 16 h by trypsinization
and washed twice with PBS, and pellets were kept at −70°C until preparation of RNA. For time points exhibiting an increased mitotic index (8-12 h after release), cells were also recovered from the media and the washes
before trypsinization.
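The synchronization protocol above can be laid out as a simple timeline. The sketch below assumes that sampling includes the moment of release (0 h) and runs through 16 h inclusive; the text states only "2 h intervals for 16 h", so the inclusion of both endpoints is an assumption, and all function and constant names are hypothetical.

```python
# Durations taken from the protocol described above (hours).
BLOCK_H = 15            # each thymidine block
RELEASE_H = 9           # growth in normal medium between blocks
SAMPLE_INTERVAL_H = 2
SAMPLING_WINDOW_H = 16
MITOTIC_WINDOW_H = (8, 12)  # time points with increased mitotic index


def double_thymidine_timeline():
    """Return (elapsed_hours, event) pairs up to the second release."""
    t = 0
    events = [(t, "start block 1 (2 mM thymidine)")]
    t += BLOCK_H
    events.append((t, "release 1 (wash, normal medium)"))
    t += RELEASE_H
    events.append((t, "start block 2 (2 mM thymidine)"))
    t += BLOCK_H
    events.append((t, "release 2 (sampling begins)"))
    return events


def sample_times():
    """Hours after the second release at which samples are taken.

    Assumption: both 0 h and 16 h are sampled.
    """
    return list(range(0, SAMPLING_WINDOW_H + 1, SAMPLE_INTERVAL_H))


def needs_floating_cell_recovery(hours_after_release):
    """True for time points where cells are also recovered from media and washes."""
    lo, hi = MITOTIC_WINDOW_H
    return lo <= hours_after_release <= hi


for t, event in double_thymidine_timeline():
    print(f"{t:>3} h  {event}")
print("sample times (h after release):", sample_times())
```

Under these assumptions the second release falls 39 h after the start of the first block, and the 8, 10, and 12 h samples are the ones that require recovery of detached mitotic cells.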
RNA was isolated by acidic guanidinium thiocyanate/phenol-chloroform extraction (Xie and Rothblum, 1991). CENP-A mRNA was assayed
by RNase protection using a probe constructed by cloning a 155-bp EcoRI-ApaI fragment containing the 5′ end of the CENP-A cDNA into pBS-SK(+) (Stratagene). Plasmid was linearized with XbaI for transcription by
T7 RNA polymerase (Maxiscript kit; Ambion, Austin, TX) and [α-³²P]UTP
(Amersham Corp.) according to the manufacturer's instructions. The
probe length was 203 bp with a protected fragment length of 153 bp.
RNase protection assays (HybSpeed RPA kit; Ambion) were performed
using 10 µg of total HeLa RNA isolated from synchronized cells. Hybridization of probe (350,000 cpm per reaction) and RNA was carried out for 1 h at 68°C in
siliconized tubes followed by digestion with an RNase A/T1 mixture used
at a dilution of 1:100 from the supplied concentration. End-labeled size
markers were prepared from an HaeIII digest of pBS-SK(+). Reactions
were electrophoresed on 6% sequencing gels and exposed for 2 h on a
phosphor imaging screen from Molecular Dynamics (Sunnyvale, CA). ImageQuant software (Molecular Dynamics) was used to quantitate signal
intensities. Histone H3 mRNA abundance was determined in the same
samples by Northern blot analysis, using the coding region of plasmid
pMH3.2-614 as a probe, and similarly quantitated.
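Because the two transcripts were measured by different assays, comparing their time courses requires putting each series on a common relative scale. One simple option, sketched below, is to express every time point as a fraction of that transcript's own peak signal. Whether the original analysis normalized in exactly this way is not stated; the function name and the example numbers are hypothetical placeholders, not measured data.

```python
def relative_abundance(signals):
    """Scale a series of quantitated signal intensities to its own maximum.

    Assumption: signals are background-corrected, non-negative numbers.
    """
    peak = max(signals)
    if peak <= 0:
        raise ValueError("series has no positive signal to normalize against")
    return [s / peak for s in signals]


# Placeholder intensities (arbitrary units), NOT data from the study.
example_series = [10.0, 40.0, 20.0]
print(relative_abundance(example_series))
```

Normalizing each transcript to its own peak makes the timing of accumulation comparable across assays, at the cost of discarding any information about absolute abundance.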
Structural Determinants of Centromeric Targeting
The histone fold domain consists of a set of three helices (H I, H II, H III) separated by two turn/β-sheet structures (strand A, strand B); histone H3 and, by homology, CENP-A contain an additional α helix at the NH2 terminus of the fold domain (N helix; Fig. 1 A) (Arents et al., 1991). To evaluate CENP-A targeting within the context of this structure, we prepared a set of substitution derivatives by replacing CENP-A sequences within the fold with the homologous histone H3 sequences (Fig. 1 A). Mutations were constructed using an epitope-tagged version of CENP-A carrying three copies of the influenza hemagglutinin HA-1 epitope (Wilson et al., 1984) at the COOH terminus, allowing us to monitor the expression (Fig. 1 B) and localization of CENP-A derivatives in transfected cells (Fig. 2; WT).
We first asked whether the histone fold domain is sufficient to direct centromeric targeting. In previous experiments, the NH2-terminal tail of CENP-A was replaced with
that of histone H3, which, although lacking sequence homology, shares its highly basic character with CENP-A
(Sullivan et al., 1994). To determine if a basic NH2-terminal tail is dispensable for targeting, codons 4-31 of CENP-A
were excised. The resulting protein showed no impairment of targeting to centromeres, demonstrating that a basic
NH2-terminal tail is dispensable for this function (Fig. 2; ΔN).
Thus, the COOH-terminal portion of CENP-A, corresponding to the histone fold homology domain, is both
necessary and sufficient for assembly of CENP-A at centromeres.
Within the histone fold domain, we initially examined
four regions corresponding to secondary structure segments of the domain based on the data of Arents et al.
(1991) and Richmond et al. (1993): helices I and II, strand
A, and strand B (Fig. 1 A). We also tested the COOH terminus, which is longer in CENP-A and divergent from histone H3. Helix III was not tested since only a single conservative (Ile-Val) substitution is found in this segment of CENP-A. The two most conserved regions, helix I (Fig. 2;
hI) and strand B (Fig. 2; sB), could be substituted without
affecting targeting, as could the COOH terminus (Fig. 2;
C). The strand A substitution, residues 75-86, was profoundly deficient in targeting ability (Fig. 2; sA). It retained a small degree of targeting specificity that was observed as a slight increase in centromere staining over an
essentially uniform nuclear incorporation in a minority of
cells (see also Fig. 3 D below). Substitution of helix II of
the histone fold domain resulted in a complete loss of targeting to centromeres (Fig. 2; hII). These data demonstrate that sequences in the central portion of the histone
fold domain are primarily responsible for targeting CENP-A
to centromeres.
The two segments identified in this experiment comprise a large contiguous region at the center of the domain and contain most of the divergent amino acids that distinguish CENP-A from histone H3. To further refine identification of targeting sequences, strand A and helix II were each divided into NH2-terminal, central, and COOH-terminal portions, each containing three to five CENP-A specific residues that were substituted with histone H3 sequences as above (Fig. 1 A). This analysis showed the NH2 and COOH termini of the long central helix II were necessary for targeting CENP-A, but replacement of residues in the central portion of the helix had no effect (Fig. 3 B). In contrast, none of the strand A subregion mutations, including the deletion of two amino acids in the center of this region, showed any significant impairment of targeting (Fig. 3 A). Thus, the CENP-A-specific sequences at the two ends of helix II, residues 88-92 and 109-114, are necessary for assembly at centromeres, while strand A, residues 75-86, can accommodate significant change in amino acid sequence and length without disruption of targeting activity.
One residue in this region, Trp86, was selected for specific mutagenesis. This residue is notable because Trp is
absent in the core histones, but is present at this same position in CSE4 (Stoler et al., 1995). This codon was randomized with a PCR procedure, and 13 mutants encoding different amino acids were recovered and tested for targeting
(data not shown). Replacement of Trp86 with the aromatic
residues Tyr or Phe (the amino acid normally found in this
position in histone H3) had no effect on targeting. Aliphatic residues showed intermediate levels of targeting
roughly correlated with hydrophobicity, while hydrophilic
and charged residues failed to target at all. This experiment rules out a specific role for this Trp residue in centromeric targeting, but indicates that an aromatic amino acid at this position is needed for full targeting activity.
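The set of substitutions that can arise at codon 86 follows directly from the NN(C/T) randomization scheme given in Materials and Methods. The sketch below enumerates the 32 possible codons using the standard genetic code and compares the encodable amino acids with the 13 that were recovered. The variable names are illustrative only.

```python
from itertools import product

# Standard genetic code, with codons ordered by T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# NN(C/T): any base at positions 1 and 2, a pyrimidine at position 3.
nny_codons = [a + b + c for a in BASES for b in BASES for c in "CT"]
encodable = {CODON_TABLE[codon] for codon in nny_codons}

# Amino acids recovered at position 86, as listed in Materials and Methods.
recovered = set("YFILVCNDHRSPG")

print("codons in library:", len(nny_codons))
print("encodable amino acids:", "".join(sorted(encodable)))
print("encodable but not recovered:", "".join(sorted(encodable - recovered)))
print("stop codons possible:", "*" in encodable)
print("wild-type Trp encodable:", "W" in encodable)
```

This scheme yields 15 possible amino acids with no stop codons, and it cannot encode Trp (TGG), so every clone carries a substitution. All 13 recovered residues fall within the encodable set; Ala and Thr are the two encodable residues that were not recovered among the clones sequenced.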
Histone H3 contains an additional helix at the amino
terminus of the histone fold domain, the N-helix. Secondary structure prediction reveals a putative α helix in this
segment of CENP-A, spanning residues 43-55. When this
region was replaced along with the entire NH2-terminal
tail of histone H3, the resulting protein, H3-CA, retained
targeting activity but was less efficient at localizing to centromeres (Sullivan et al., 1994). Two additional replacement mutants were constructed, HN1 and HN2, to ask
whether a secondary targeting element could be identified
in this region (Fig. 1 A). Mutant HN1, replacing the NH2-terminal portion of this helix, had a distribution similar to
H3-CA, with localization at centromeres detected over
variable levels of nonspecific nuclear staining (Fig. 3, C
and D). Mutant HN2, spanning the COOH-terminal portion of the N-helix, targeted normally.
Quantitative assessment of the relative roles of the different targeting elements of CENP-A is complicated by the fact that levels of gene expression vary considerably within the transiently transfected cell population. Even for wild-type CENP-A-HA1, a substantial fraction of cells was observed in which overexpression results in uniform nuclear staining. To compare the targeting defects of the strand A and HN1 mutations, we assayed populations of transfected cells, judging the distribution of epitope-tagged CENP-A as being primarily localized at centromeres (e.g., Fig. 2; WT and C), detectably localized (ranging from Fig. 2; sA, to Fig. 3 C), or unlocalized (e.g., Fig. 2; hII). Data are presented in histogram form in Fig. 3 D. For two control constructs assayed simultaneously, the majority of cells had primarily localized epitope with the remaining cells approximately evenly distributed in the other two classes (Fig. 3 D; WT and HC). For the N-helix mutant, only a small proportion of cells (6%) exhibited primarily localized mutant protein, while most cells (49%) showed detectably localized centromeric CENP-A over varying levels of general nuclear staining (Fig. 3 D; HN1). For the strand A mutant, no cells were observed with staining primarily at centromeres, and only 18% showed any detectable targeting above the general nuclear staining (Fig. 3 D; HSA). These results suggest that the predicted N-helix of CENP-A contains sequences that are required for efficient targeting to centromeres but to a lesser extent than sequences of strand A or helix II.
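Since each cell was scored into exactly one of three classes, the fraction of unlocalized cells for each mutant can be derived from the two percentages quoted in the text. The sketch below does this arithmetic under the assumption that the three classes are exhaustive and sum to 100%; the derived values are not figures reported in the text, and the names are illustrative.

```python
def unlocalized_percent(primarily_localized, detectably_localized):
    """Remainder of cells after the two localized classes, in percent.

    Assumption: the three scoring classes are mutually exclusive and
    together account for 100% of scored cells.
    """
    remainder = 100 - primarily_localized - detectably_localized
    if remainder < 0:
        raise ValueError("class percentages exceed 100")
    return remainder


# (primarily localized %, detectably localized %) as quoted in the text.
quoted = {"HN1": (6, 49), "HSA": (0, 18)}

for construct, (primary, detectable) in quoted.items():
    print(construct, "unlocalized:", unlocalized_percent(primary, detectable), "%")
```

Under this assumption roughly 45% of HN1-expressing cells and 82% of strand A mutant cells fall in the unlocalized class, which is consistent with the conclusion that the strand A substitution is the more severe defect.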
Since the long central helix II was the only region that was absolutely required for targeting to centromeres, we sought to determine whether it could act by itself to direct histone H3 preferentially to centromeres. A derivative of histone H3 was constructed by replacing residues 85-112 of histone H3 with the corresponding residues of CENP-A (87-114). The resulting protein showed no ability to localize to centromeres, not even at the level of the strand A mutant of CENP-A (data not shown). Thus, while helix II sequences are required for targeting CENP-A to centromeres, they function only in conjunction with other components of CENP-A.
Self-association of CENP-A Predicts Formation of Homotypic Nucleosomes
The COOH-terminal segment of histone H3 helix II provides a unique function within the nucleosome, mediating
protein-protein association at the dyad axis that links the
two symmetric halves of the nucleosome (Camerini-Otero
and Felsenfeld, 1977; Xie et al., 1996). A requirement of
this sequence for targeting CENP-A was revealed by mutant HH2.3 (Fig. 3 B), suggesting that protein-protein interactions within the nucleosome are important for
CENP-A function. To ask whether CENP-A exhibits self-association properties, we constructed a stable HeLa cell
line that inducibly expresses the epitope-tagged CENP-A
derivative, CENP-A-HA1. Upon induction, cells accumulate CENP-A-HA1 (Fig. 4 A) at their centromeres (data not shown), allowing us to assay protein interactions under
conditions in which CENP-A was primarily localized at
centromeres. Chromatin was solubilized from isolated nuclei by micrococcal nuclease digestion followed by brief
sonication to release centromeric chromatin. After immunoprecipitation from this soluble chromatin extract using
mAb 12CA5, fractions were analyzed by SDS-PAGE and
immunoblot analysis using human anti-centromere antibodies, allowing detection of both epitope-tagged and endogenous CENP-A (Fig. 4, B and C). Under conditions in
which CENP-A-HA1 was present at a lower abundance than endogenous CENP-A, the immunoprecipitated fraction always contained equimolar quantities of endogenous
CENP-A recovered with CENP-A-HA1, as judged by Western blot signal intensity (Fig. 4 B). When CENP-A-HA1
was overexpressed relative to endogenous CENP-A, the
endogenous protein was still recovered in immunoprecipitates, but in diminished quantities (Fig. 4 C). These data
provide strong evidence for CENP-A self-association in
vivo. A nucleosome containing CENP-A could in principle
be either heterotypic, containing one copy each of CENP-A
and histone H3, or homotypic with two copies of CENP-A.
Recovery of endogenous CENP-A in the presence of a vast amount of the potential competitor histone H3 demonstrates a preference for self-association. The presence of
equimolar amounts of endogenous and epitope-tagged
proteins under the conditions of Fig. 4 B, where the quantitatively minor CENP-A-HA1 is essentially doping the
CENP-A pool, indicates that this association is highly efficient:
essentially all CENP-A-HA1 is present in an equimolar complex. Competition by CENP-A-HA1 when it is
quantitatively overexpressed, as in Fig. 4 C, is further evidence for efficient CENP-A/CENP-A self-association. We
conclude that CENP-A nucleosomes are homotypic for
CENP-A.
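The stoichiometric reasoning above can be made explicit with a simple pairing model. The sketch assumes that every CENP-A nucleosome is homotypic, that tagged and endogenous CENP-A pair at random, and that the antibody recovers every nucleosome containing at least one tagged copy. These are idealizations introduced for illustration, not measured properties, and the function name is hypothetical.

```python
def endogenous_per_tagged(tagged_fraction):
    """Expected ratio of endogenous to tagged CENP-A in the immunoprecipitate.

    tagged_fraction: fraction f of the total CENP-A pool that carries the
    epitope tag (0 < f <= 1). Assumes homotypic nucleosomes with random
    pairing and complete recovery of tagged nucleosomes.
    """
    f = tagged_fraction
    if not 0 < f <= 1:
        raise ValueError("tagged_fraction must be in (0, 1]")
    tagged_tagged = f * f              # nucleosomes with two tagged copies
    tagged_endogenous = 2 * f * (1 - f)  # nucleosomes with one of each
    tagged_molecules = 2 * tagged_tagged + tagged_endogenous
    endogenous_molecules = tagged_endogenous
    return endogenous_molecules / tagged_molecules  # simplifies to 1 - f


for f in (0.05, 0.5, 0.8):
    print(f"tagged fraction {f:.2f}: endogenous/tagged = {endogenous_per_tagged(f):.2f}")
```

In this model the ratio reduces to 1 - f: when the tagged protein is a minor component the precipitate is close to equimolar, and as the tagged protein comes to dominate the pool the endogenous protein is still recovered but in diminishing amounts, matching the qualitative pattern described for Fig. 4, B and C.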
Regulatory Determinants of Centromeric Targeting
The assembly of normal histone H3 into chromatin takes
place concurrently with DNA replication, as histone H3/H4
tetramers are deposited on newly synthesized DNA within
minutes (Worcel et al., 1978). For our initial experiments,
we reasoned that expression of CENP-A during S phase
would be appropriate, and we prepared a construct in
which the coding region of CENP-A was placed under the
regulatory signals of a mouse S-phase-dependent histone
H3 gene (Taylor et al., 1986; Harris et al., 1991) (Fig. 5).
Surprisingly, even at low levels of expression, CENP-A
synthesized from this plasmid failed to accumulate at
centromeres but was distributed throughout the nucleus
(Fig. 5). Since we observe targeting in cells that express
CENP-A-HA1 constitutively, we interpret these results to
show that uncoupling CENP-A expression from normal histone expression in S phase is an important component
of the CENP-A targeting mechanism.
The cell cycle-dependent expression of CENP-A was
examined directly in cells synchronized at the G1/S boundary using a double thymidine block procedure. CENP-A
mRNA was detected using an RNase protection assay
(Fig. 6 A) while histone H3 transcripts were detected by
Northern blot analysis (Fig. 6 B). HeLa cells released from
a block at the G1/S boundary take 7 h to complete S phase, and spend 3.5 h in G2 and ~1 h in mitosis (Rao and
Johnson, 1970). A plot of the relative abundance of each
transcript as a function of time after release (Fig. 6 C)
showed that accumulation of histone H3 mRNA paralleled previously published analyses, peaking in mid S phase
4-5 h after release from the thymidine block, followed by a
rapid decline to baseline levels by 8-10 h (Harris et al., 1991). In contrast, CENP-A mRNA accumulation did not
begin until mid S phase and reached maximal levels 8-10 h
after release. CENP-A mRNA levels also rapidly declined
between 10 and 12 h after release. CENP-A protein was
assayed in parallel by Western blot analysis using a human
autoantiserum, showing a gradual increase in abundance starting 4-6 h after release, consistent with an approximate
doubling of the CENP-A pool (data not shown).
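The timing argument above can be checked with a minimal sketch that maps hours after release onto nominal cell cycle phases, using only the phase durations quoted in the text (7 h S phase, 3.5 h G2, ~1 h mitosis). The function name `phase_at` and the assumption of perfectly synchronous progression are illustrative simplifications, not part of the original analysis.

```python
# Nominal phase durations (hours) after release from a G1/S block,
# as quoted in the text (Rao and Johnson, 1970).
PHASES = [("S", 7.0), ("G2", 3.5), ("M", 1.0)]


def phase_at(hours_after_release: float) -> str:
    """Return the nominal phase at a given time after release.

    Hypothetical helper: assumes every cell progresses in perfect synchrony.
    """
    if hours_after_release < 0:
        raise ValueError("time after release must be non-negative")
    elapsed = 0.0
    for name, duration in PHASES:
        elapsed += duration
        if hours_after_release < elapsed:
            return name
    return "G1 (next cycle)"


# Time windows (hours after release) reported in the text.
windows = {
    "histone H3 mRNA peak": (4, 5),
    "CENP-A mRNA peak": (8, 10),
    "CENP-A mRNA decline": (10, 12),
}

for label, (start, end) in windows.items():
    print(f"{label}: {phase_at(start)} -> {phase_at(end)}")
```

Under these assumptions the histone H3 peak falls within S phase and the CENP-A peak falls within G2, consistent with the interpretation given in the text. Real populations lose synchrony over time, so the phase boundaries are approximate.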
The pattern of mRNA accumulation observed for
CENP-A is similar to that of several cell cycle related gene
products, including cdc2, cdc25C, and cyclin A (Dalton,
1992; Zwicker et al., 1995). A common repressor-mediated
transcriptional control mechanism has recently been identified among these three cell cycle-regulated genes, conferred by a conserved DNA sequence motif that spans 15 bp
located within 20 nucleotides 5′ of the transcription start site (Lucibello et al., 1995; Zwicker et al., 1995). This element contains two conserved segments, seven and five nucleotides in length, separated by a 3-bp linker of unconserved sequence (Fig. 6 D). A genomic clone for human
CENP-A was isolated, and a 2.9-kb fragment containing
the first exon and 1.1 kb of 5′ flanking genomic DNA was
subjected to DNA sequence analysis. A sequence nearly identical to the cell cycle repressor motif was found 11 bp
upstream of the 5′ end of the CENP-A cDNA (Fig. 6 D).
In CENP-A, the two conserved elements of the motif
shared 100% identity with the cell cycle repressor motif.
Curiously, these were separated by 8 bp rather than 3 bp,
precisely an additional half helical turn of the DNA, as
compared with cdc2, cdc25C, and cyclin A. Nevertheless, coupled with the observation that CENP-A mRNA accumulates with a similar kinetic pattern during the cell cycle,
it is reasonable to propose that this motif is involved in
linking CENP-A gene activity to the cell cycle. Taken together, these results strongly suggest that expression late
in the cell cycle is necessary for proper assembly of CENP-A
at centromeres.
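The spacing comparison can be made explicit with a short calculation. The segment and linker lengths come from the text. The helical repeat $h$ of B-form DNA (about 10 to 10.5 bp per turn) is a textbook value assumed here, not a measurement from this study.

```latex
\begin{align*}
L_{\text{cdc2, cdc25C, cyclin A}} &= 7 + 3 + 5 = 15\ \text{bp} \\
L_{\text{CENP-A}} &= 7 + 8 + 5 = 20\ \text{bp} \\
\Delta_{\text{linker}} &= 8 - 3 = 5\ \text{bp} \\
\frac{\Delta_{\text{linker}}}{h} &= \frac{5}{10} = 0.50\ \text{turn}\ (180^\circ)
  \quad\text{or}\quad \frac{5}{10.5} \approx 0.48\ \text{turn}\ (\approx 171^\circ)
\end{align*}
```

On either assumed value of $h$, the extra 5 bp rotates the second conserved segment by close to half a helical turn relative to its position in the other three promoters.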
Three pieces of evidence suggest that CENP-A acts as a
core histone, replacing histone H3 within the histone octamer. The first is the biochemical demonstration that
CENP-A copurifies with nucleosomes and with the histone H3/H4 tetramer during fractionation of chromatin
(Palmer and Margolis, 1985; Palmer et al., 1987). The second is the high degree of amino acid sequence homology shared by CENP-A and histone H3, specifically within the
histone fold domain (Palmer et al., 1991; Sullivan et al., 1994). Finally, association with chromatin and a genetic interaction with normal histone H4 suggest that CSE4, the
putative S. cerevisiae homologue of CENP-A, is a nucleosomal protein (Stoler et al., 1995; Smith et al., 1996). From
these considerations, it is logical to consider the overall
organization of CENP-A as similar to that of histone H3
within the histone octamer (Arents et al., 1991; Arents and Moudrianakis, 1993; Richmond et al., 1993).
Structural Basis for CENP-A Assembly at Centromeres
At a structural level, the ability of CENP-A to assemble
into centromeric chromatin is specified solely by the histone fold domain. As with the other core histones, histone
H3 makes several contacts with nucleosomal DNA as it
winds over the surface of the histone octamer (Mirzabekov et al., 1978; Shick et al., 1980; Richmond et al., 1984, 1993; Hill and Thomas, 1990; Arents and Moudrianakis, 1993). Two of the three targeting elements of CENP-A correspond to histone H3-DNA contact sites. The first of
these is near the site where DNA enters and exits the octamer, corresponding to the position of the N-helix (Fig.
7 A, peach), which acts as a weak targeting element in our
experiments (Richmond et al., 1984, 1993; Hill and Thomas, 1990). A second major H3-DNA contact takes place at the
position of strand A and the NH2 terminus of helix II, one
of the most concentrated sites of divergence between CENP-A and histone H3 (Mirzabekov et al., 1978; Shick et al., 1980; Richmond et al., 1984, 1993). These sequences
form a fairly broad strip on the surface of the nucleosome
lying directly across the DNA path (Fig. 7 A, yellow and
tan). Strand A is a part of a parallel β-sheet structure that has
been proposed to act as a specific DNA binding element
of the histone octamer (Arents and Moudrianakis, 1993),
while the NH2 terminus of helix II is directly adjacent to
this region and exposed on the surface. Since small substitutions in strand A had no significant effect on centromeric targeting, it is unlikely that specific side-chain interactions with DNA are required for this region's contribution
to the targeting function. Rather, it may act by imparting
some general structural features to this portion of the core
particle, perhaps influencing the structure of the NH2 terminus of helix II.
A third region of CENP-A that is necessary for targeting is the COOH-terminal portion of the long central helix
II (Fig. 7, orange). This region, largely buried in the interior of the H3/H4 tetramer, forms an important protein-protein
interaction between the two copies of histone H3,
directly on the dyad axis of the nucleosome (Camerini-Otero
and Felsenfeld, 1977). This is the only homotypic interchain interaction that can be detected by contact site
cross-linking experiments with nucleosome core particles, indicating that the H3/H4 tetramer is held together primarily by this H3-H3 interaction (for review see van Holde,
1989). The role of this region in mediating protein-protein
interaction is quite apparent in the structure of the
dTAFII62/dTAFII42 heterotetramer, a component of TFIID
whose structure is strikingly similar to the heterotetrameric histone H3/H4 core of the nucleosome (Xie et al.,
1996). In this structure, two molecules of dTAFII42, the histone H3 homologue, make an extensive contact at the
COOH terminus of helix II, which links the two symmetric
halves of the heterotetramer. In CENP-A, this region,
109-AYLLTL114, presents more hydrophobic and bulky
side chains than the corresponding region of histone H3,
107-TNLCAI112. Thus, this element is situated to affect the protein-protein interactions across the dyad axis of a
CENP-A nucleosome, differentiating it from histone H3.
DNA Recognition by Specialized Nucleosomes: A Model
Taken together, these structural considerations suggest a model for the selective recognition of centromeric DNA by CENP-A driven by specialized DNA contact surfaces and self-association. We propose that the specific function of the COOH terminus of helix II is to promote CENP-A-CENP-A self-association, presumably in the context of a (CENP-A/H4)2 heterotetramer, to form a homotypic CENP-A nucleosome. A homotypic CENP-A nucleosome will possess a duplicated set of differentiated DNA contact sites arrayed across the nucleosome surface. The repetition and geometry of these sites provide the possibility for cooperative interaction of the specialized DNA binding surfaces of the CENP-A nucleosome, allowing what individually may be only weakly selective binding sites to sum to a significant affinity for centromeric DNA sequence or structure (Fig. 7 B).
Two predictions of this model are that (a) CENP-A
should form homotypic nucleosomes, and (b) target DNA
should have a repeating substructure that matches the specialized surfaces of CENP-A. Experimental support for the
self-association of CENP-A has been obtained by coimmunoprecipitation of endogenous with transfected CENP-A
(Fig. 4). While we have not yet determined experimentally the DNA sequences or structures to which CENP-A is
bound, it is very likely that they include the satellite DNA
component of mammalian chromosomes. Satellite DNA is
unconserved at the level of primary sequence (Beridze,
1982). Theoretical analysis of satellite DNAs, however, reveals a substructure, conserved among satellites from numerous species, that is composed of two 50-60-bp bending elements separated by 20-30 bp of low bending
potential (Fitzgerald et al., 1994). Recognition of such
conserved structural features of DNA by CENP-A might
explain how centromere structure and function are conserved without apparent DNA sequence conservation.
The strong strand A-helix II targeting site corresponds to
a region where nucleosomal DNA is deformed, bending
more sharply across the protein surface than flanking regions (Fig. 7 B, arrows) (Richmond et al., 1984; Wolffe, 1995). DNA bending or curvature is known to be an important determinant of histone octamer positioning on
DNA (Shrader and Crothers, 1989; Sivolob and Khrapunov, 1995). Furthermore, analysis of the nucleosomal
positioning signal in the ribosomal 5S RNA gene suggests
that the regions 2-3 helical turns on either side of the dyad
axis, very close to the predicted site of interaction with
strand A, play a dominant role in specifying octamer position on the DNA (FitzGerald and Simpson, 1985). DNA recognition by CENP-A may therefore be a specialized
implementation of general nucleosomal positioning features. These considerations support the notion that some
of the molecular recognition events that specify centromere formation in higher eukaryotes take place at the
level of DNA structure rather than DNA sequence, per se, and that these occur in the context of a specialized nucleosome.
Chromatin Assembly and the Specification of Centromeres
Structural recognition alone is not sufficient to explain the
specific localization of CENP-A to centromeres, since overexpression of CENP-A results in a distribution throughout
the nucleus. Thus, there does not appear to be an efficient
mechanism to degrade ectopically localized CENP-A as
is observed for CENP-C (Lanini and McKeon, 1995).
Rather, our evidence points to regulation of the timing of
CENP-A synthesis as an important feature of the targeting
mechanism. Restricting expression of CENP-A to S phase abolished targeting, and analysis of steady state CENP-A
mRNA abundance revealed that, indeed, it is uncoupled
from normal histone expression, beginning late in S phase
and extending through G2. Replication of centromeric
chromatin therefore occurs through a process that is at
least partially independent of normal chromatin replication. One reason for this may be simply to couple CENP-A synthesis with centromere DNA replication, which occurs
in mid to late S phase (O'Keefe et al., 1992). A second
possibility is that the temporal offset is required to promote the assembly of homotypic CENP-A nucleosomes,
by expression at a time when concentrations of potentially
competitive histone H3 are diminished. A third explanation for the temporal segregation of CENP-A synthesis from bulk histone synthesis is that a unique replication pathway for centromeric chromatin is part of the process by which cells recognize and propagate centromeres as distinct functional
compartments of the chromosomes. Epigenetic features of
centromere structure and function have been identified
through analysis of position effect variegation in Drosophila (Spradling and Karpen, 1990; Henikoff, 1992) and of activation of deficient centromere sequences in S. pombe (Steiner and Clarke, 1994). Understanding how
CENP-A chromatin replication is linked to the maintenance and function of centromeres on human chromosomes will provide new insight into the question of what constitutes an animal cell centromere.
Why Histone H3?
The heart of the nucleosome is the histone (H3-H4)2 heterotetramer. As discussed above, the heterotetramer possesses most of the DNA binding properties of the nucleosome as well as the information required for positioning
(FitzGerald and Simpson, 1985; Dong and van Holde, 1991; Wolffe, 1995) and is deposited first onto DNA after replication, followed by the slower addition of histone H2A-H2B dimers (Worcel et al., 1978). Histones H3 and H4
are thus uniquely situated to play a primary role in nucleosomal DNA recognition. Of all the four core histones,
only histone H3 has the opportunity to direct its own self-assembly through homotypic interactions (Camerini-Otero and Felsenfeld, 1977; Arents et al., 1991; Xie et al., 1996).
Homotypic H3-H3 interactions are therefore a key to
harnessing the cooperative binding potential inherent in
the dyad symmetry of the nucleosome. Additional histone H3 variants have been identified at the sequence
level in Caenorhabditis elegans (Gown et al., 1996) and as
a mouse cDNA (GenBank accession number AA008158).
The mouse sequence contains a histone fold domain that is
only 80% identical to that of mammalian CENP-A and
may correspond to mouse CENP-A or represent yet another histone H3 homologue. Taken together, these observations reveal that histone H3 occupies a unique niche in
the structure of the nucleosome, one that may provide an
important element of adaptability for the structural differentiation of the chromatin fiber.
In summary, analysis of the histone fold domain structures of CENP-A that are required for its localization into centromeres reveals that this process depends upon the specialization of key elements of the histone H3 molecule: DNA binding surfaces and the unique H3-H3 homotypic dimer interface. Examining these features in the context of a histone octamer reveals how these elements can combine to provide modified DNA binding sites distributed in a cooperative array spanning ~120 bp of nucleosomal DNA. In addition to providing a framework for understanding how centromeric chromatin may be built upon a nucleosomal DNA recognition mechanism, these experiments focus on the unique aspects of histone H3 within the nucleosome. Thus, understanding the relationships between structure and function for the specialized centromeric CENP-A nucleosome may provide new insight into the functions that histone H3 provides for chromatin in general.
Received for publication 11 November 1996 and in revised form 27 November 1996.
1. Abbreviation used in this paper: HA, hemagglutinin.
We thank H. Damke, S. Schmid, and H. Bujard for their gifts of tTA vectors and HeLa cell line, and E. Chan for the human genomic library.
This work was supported by National Institutes of Health grant GM39068 to K.F. Sullivan, and in part by a grant from the Markey Charitable Trust to the Department of Cell Biology.