(Received for publication, April 11, 1997, and in revised form, June 6, 1997)
From the Oncology Center and Department of Medicine, The
Johns Hopkins University, Baltimore, Maryland 21231
Promoter region CpG island methylation is associated with tumor suppressor gene silencing in neoplasia. GenBank sequence analyses revealed that a number of CpG islands are juxtaposed to multiple Alu repeats, which have been proposed as "de novo methylation centers." These islands also contain multiple Sp1 elements located upstream and downstream of transcription start, which have been shown to protect CpG islands from methylation. We mapped the methylation patterns of the E-cadherin (E-cad) and von Hippel-Lindau (VHL) tumor suppressor gene CpG island regions in normal and neoplastic cells. Although unmethylated in normal tissue, these islands were embedded between densely methylated flanking regions containing multiple Alu repeats. These methylated flanks were segregated from the unmethylated, island CpG sites by Sp1-rich boundary regions. Finally, in human fibroblasts overexpressing DNA methyltransferase, de novo methylation of the E-cad CpG island initially involved sequences at both ends of the island and the adjacent, flanking regions and progressed with time to encompass the entire CpG island region. Together, these data suggest that boundaries exist at both ends of a CpG island to maintain the unmethylated state in normal tissue and that these boundaries may be progressively overridden, eliciting the de novo methylation associated with tumor suppressor gene silencing in neoplasia.
The CpG dinucleotide is underrepresented in the mammalian genome except in clusters known as CpG islands. CpG islands are generally unmethylated in normal tissues, irrespective of gene transcription status, while non-island CpG dinucleotides in bulk chromatin are often methylated (1). In normal tissues, extensive methylation of promoter region CpG islands is exclusively associated with transcriptional silencing of imprinted alleles and genes on the inactive X chromosome (reviewed in Ref. 2). In neoplasia, these patterns of methylation are often altered. Non-island CpGs in bulk chromatin may become hypomethylated, while CpG islands can become densely methylated (1-4). Indeed, aberrant DNA methylation of promoter region CpG islands can serve as an alternative to coding region mutation for the inactivation of tumor suppressor genes, including the retinoblastoma gene (Rb), the von Hippel-Lindau gene (VHL), p16INK4A, p15INK4B, and E-cadherin (E-cad) (5-14).
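The notion of CpG underrepresentation outside islands can be made concrete with a small calculation. The following is a minimal sketch, not the analysis used in this study: it computes the GC fraction and the observed/expected CpG ratio of a sequence. The function name `cpg_stats` is hypothetical, and the commonly quoted island heuristic (GC fraction above 0.5 and observed/expected ratio above 0.6) is an assumption of this illustration rather than a criterion stated in this report.

```python
def cpg_stats(seq: str) -> tuple[float, float]:
    """Return (GC fraction, observed/expected CpG ratio) for a DNA sequence.

    Expected CpG count is taken as (#C * #G) / length, the usual
    approximation for a sequence with independent base composition.
    """
    s = seq.upper()
    n = len(s)
    c, g = s.count("C"), s.count("G")
    cpg = s.count("CG")  # "CG" cannot overlap itself, so count() is exact
    gc_fraction = (c + g) / n if n else 0.0
    expected = c * g / n if n else 0.0
    obs_exp = cpg / expected if expected else 0.0
    return gc_fraction, obs_exp


if __name__ == "__main__":
    # Toy sequences only; real islands span hundreds of base pairs.
    print(cpg_stats("CGCGCGCG"))
    print(cpg_stats("ATATAT"))
```

A sequence with a ratio near or above 1 has about as many CpGs as its base composition predicts, whereas bulk mammalian DNA falls well below that.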
The establishment of regional methylation patterns in normal and
neoplastic tissue has not been clearly defined. Recent work has
demonstrated that the promoter region CpG island of the adenine phosphoribosyltransferase gene (aprt) is normally protected
from methylation by a cluster of Sp1 elements located upstream of the transcription start site (15-18). Disruption of these Sp1 elements facilitates de novo methylation of the aprt
promoter (15, 16, 18), which may spread from an upstream
"methylation center" comprising B1 repetitive elements (17, 18).
Therefore, the rodent aprt CpG island remains
unmethylated as a function of protective, cis-acting
elements (e.g. Sp1 elements or G/C boxes) despite the influence of normally methylated, repetitive elements immediately 5′ to
the CpG island. It is not clear whether other CpG islands are similarly
juxtaposed to normally methylated, repetitive elements (i.e.
putative methylation centers) and whether such CpG islands may be
protected from methylation by cis-acting elements
(e.g. Sp1 elements).
To gain insight into how CpG islands remain methylation-free in normal tissue and how these islands become aberrantly methylated in neoplasia, we analyzed a number of CpG island sequences and found that proximity to Alu repetitive elements and the position of multiple Sp1 elements both upstream and downstream of transcription start are common features of many CpG islands. We have combined methylation-specific PCR and bisulfite-modified genomic sequencing to provide a detailed map of the methylation patterns in and around the E-cad and VHL tumor suppressor gene CpG islands in normal tissue and tumor cell lines. Finally, we examined the time-dependent, de novo methylation of the endogenous E-cad CpG island in fibroblasts engineered to overexpress human DNA methyltransferase (MTase).
DNA from normal breast epithelial tissue was kindly provided by Drs. Rena G. Lapidus, Nancy E. Davidson, Helene Smith, and Sigmund Weitzman. The renal carcinoma cell line, RFX393, was kindly provided by Dr. Michael Lerman. Genomic DNA was isolated from cell lines as described previously (14, 19). The generation and characterization of SV40-immortalized, IMR90 fetal lung fibroblasts expressing 40-50-fold increased DNA MTase activity (HMT clones) and the neomycin-resistant transfection controls (Neo clones) have been described (19). The DNA used in this study was isolated from the Neo.1 and Neo.20 clones at cell passages 6, 20, and 39, from the HMT.19 clone at cell passages 6, 20, and 37 and from the HMT.1E1 clone at cell passages 6, 27, and 34.
Bisulfite Modification and Methylation-specific PCR Analysis
Genomic DNA was modified by bisulfite treatment as
detailed previously (20, 21). All primer pairs (Table I) for
methylation-specific PCR analysis (MSP) of the E-cad and
VHL CpG island regions were purchased from Life Technologies
Inc. Each unmethylated/methylated primer pair set was routinely
engineered to assess the methylation status of four to six CpG
dinucleotides with at least one CpG dinucleotide positioned at the 3′
end of each primer to facilitate maximal discrimination between
methylated and unmethylated alleles following bisulfite modification
(20). MSP primers for the Alu repeats upstream and downstream of the
VHL CpG island were designed with a common antisense primer
in a region devoid of CpGs to assess the methylation status of CpG
sites located only within the Alu sequences (Table I).
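The basis of primer discrimination in MSP can be illustrated in silico. The sketch below assumes the standard chemistry of bisulfite treatment (unmethylated cytosines read as thymine after amplification, while 5-methylcytosines remain cytosine) and, for simplicity, treats every CpG in the template as either fully methylated or fully unmethylated and every non-CpG cytosine as unmethylated. The function name and the toy sequence are hypothetical and are not taken from Table I.

```python
def bisulfite_convert(seq: str, methylated_cpg: bool) -> str:
    """Simulate the top-strand sequence read after bisulfite treatment and PCR.

    Cytosines outside CpG are converted to T. Cytosines in CpG are kept
    as C only when methylated_cpg is True.
    """
    s = seq.upper()
    out = []
    for i, base in enumerate(s):
        if base == "C":
            in_cpg = i + 1 < len(s) and s[i + 1] == "G"
            out.append("C" if (in_cpg and methylated_cpg) else "T")
        else:
            out.append(base)
    return "".join(out)


if __name__ == "__main__":
    template = "ACGTCCGA"  # toy sequence with two CpG sites
    print(bisulfite_convert(template, methylated_cpg=True))
    print(bisulfite_convert(template, methylated_cpg=False))
```

The two converted templates differ only at CpG cytosines, which is why primers placed over several CpG sites, with one at the 3′ end, can distinguish methylated from unmethylated alleles.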
Genomic sequencing of
bisulfite-modified DNAs (22) was performed by solid phase DNA
sequencing (21). The first amplification of sequences at the 5′ edge of
the E-cad island was performed for 38 cycles with a 57 °C
annealing temperature: sense, 5′-AATAGGTTGAGATAGGAGAGTTTTT-3′
(beginning at nucleotide 209, GenBank sequence accession no. L34545, Ref. 23); antisense, 5′-CTAATTAACTAAAAATTCACCTACC-3′
(beginning at nucleotide 956, accession no. L34545). To obtain products for
sequencing, a second round of PCR was performed for 29 cycles with 5 pmol of nested primers: sense, 5′-ATAGGAGAGTTTTTTGAATTTG-3′
(beginning with nucleotide 220, accession no. L34545); antisense, 5′-ACCACAACCAATCAACAAC-3′
(beginning with nucleotide 939, accession no. L34545). Bisulfite-modified DNA was also used to amplify the region at
the 3′ edge of the island for 35 cycles using the following primers:
sense, 5′-TTTYGGTTTAAGGAAAGTGG-3′ (sequence position 530, accession no.
L36526); antisense, 5′-CCCTCACCTCTACCCAAAAC-3′ (sequence position
967, accession no. L34937). To obtain products for sequencing, a second
round of PCR was performed for 26 cycles using 5 pmol of nested primers:
sense, 5′-GGAAAGTGGGGTTTTGGA-3′ (sequence position 541, accession no.
L36526); antisense, 5′-RCRACCTCTCTCCAAATAAC-3′ (sequence
position 136, accession no. L34937).
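Two of the sequencing primers above contain the degenerate symbols Y and R. Under the standard IUPAC nucleotide code, Y denotes C or T and R denotes A or G, which allows a primer to anneal to bisulfite-converted template regardless of the methylation state of a CpG it overlaps. The sketch below simply enumerates the concrete oligonucleotides implied by a degenerate primer. The function name is hypothetical, and only the symbols that occur in these primers are included in the lookup table.

```python
from itertools import product

# Subset of the IUPAC nucleotide code needed for the primers in the text.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "Y": "CT", "R": "AG"}


def expand_primer(primer: str) -> list[str]:
    """List every concrete sequence represented by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


if __name__ == "__main__":
    for variant in expand_primer("TTTYGGTTTAAGGAAAGTGG"):
        print(variant)
    print(len(expand_primer("RCRACCTCTCTCCAAATAAC")))
```

A primer with k two-fold degenerate positions represents 2^k oligonucleotides, so the sense primer with one Y expands to two sequences and the antisense primer with two R positions expands to four.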
Multiple Sp1 elements at the 5′ edge of the rodent
aprt CpG island protect the island from methylation
(15-18), which spreads from normally methylated, upstream tandem B1
repetitive elements (17, 18). To determine whether other CpG island
regions may be similarly constituted, we examined the GenBank sequences
for a number of 5′ CpG island regions, especially those in the promoter regions of genes involved in neoplasia. Maps of the 5′
CpG island regions of the human APRT (accession no. U09817), E-cadherin (accession
nos. L34545, L36526, and L34937), glutathione S-transferase
(accession no. X08058), tissue inhibitor of metalloprotease II
(TIMP-2; accession no. U44381), neurofibromatosis-1 (NF-1; accession
nos. U17084 and U09106), and von Hippel-Lindau (VHL;
accession nos. U19763, L15409, and U68055) are represented in Fig.
1. For each of these CpG islands,
multiple Alu repeats were located immediately 5′
to the CpG island, with the most proximal Alu located within ~1 kilobase upstream of
transcription start. Furthermore, each CpG island contained multiple
Sp1 elements located both upstream and downstream of transcription
start (Fig. 1). Genomic regions 3′ to these CpG islands did not contain
enough deposited sequence to evaluate the presence of Alu repeats,
except for the VHL CpG island region, which has Alu repeats
both upstream and downstream of the CpG island (Fig. 1). The promoter
region CpG islands of the Rb and estrogen receptor genes are
devoid of any proximal Alu repetitive elements (data not shown). Hence, the proximity of multiple Alu repeats may be common to many CpG islands, but is not universal.
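A sequence survey of this kind can be approximated computationally. The sketch below is an illustration under stated assumptions, not the procedure used to build Fig. 1: it scans both strands for the core G/C box hexamer GGGCGG (CCGCCC on the opposite strand), which is a common simplification of the Sp1 recognition site, and reports each hit relative to a user-supplied transcription start index. The function name, the choice of motif, and the toy sequence are all hypothetical.

```python
import re


def find_gc_boxes(seq: str, tss_index: int) -> list[int]:
    """Return positions of core G/C boxes relative to the transcription start.

    Negative values are upstream of the start site, non-negative values
    are at or downstream of it. Both strand orientations are searched.
    """
    s = seq.upper()
    hits = []
    for motif in ("GGGCGG", "CCGCCC"):
        # A lookahead pattern reports overlapping occurrences as well.
        for match in re.finditer(f"(?={motif})", s):
            hits.append(match.start() - tss_index)
    return sorted(hits)


if __name__ == "__main__":
    toy = "AAGGGCGGAAAAACCGCCCAA"  # invented sequence, start site at index 10
    positions = find_gc_boxes(toy, tss_index=10)
    upstream = [p for p in positions if p < 0]
    downstream = [p for p in positions if p >= 0]
    print(upstream, downstream)
```

Separating hits by sign gives a quick view of whether candidate sites lie on both sides of transcription start, the arrangement described for the islands in Fig. 1.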
Methylation Patterns of the E-cadherin 5′
To understand how these sequences may be related to
patterns of methylation in normal tissue and in neoplasia, we mapped
the methylation status of the 2.2-kilobase region encompassing the entire E-cad CpG island and flanking, non-island sequences
(Fig. 2A) in normal breast
epithelia and breast tumor cell lines. To identify the critical areas
within this region that may participate in establishing normal and
aberrant patterns of methylation, we employed the recently developed
MSP, which can readily identify methylated alleles comprising as little
as 0.1% of the total sample (20). Once identified, these targeted
areas were examined in greater detail by bisulfite-modified, genomic
sequencing in select normal and tumor cell samples.
The primer sets used to examine the E-cad CpG island region
covered 33 of 138 CpG sites throughout the region (listed in Table I, depicted in Fig. 2A). Since
unmethylated and methylated primers for these MSP primer sets were
directed to the same region of the genome, amplify similarly sized
products, and, in most cases, were designed with identical annealing
temperatures, the unmethylated and methylated primer sets of each pair
amplify with similar efficiency (20). To demonstrate this, we assayed
mixtures of methylated and unmethylated DNA with various primer sets
directed to the area of highest CpG density within the E-cad
CpG island (Fig. 1). The results for island set 3 (Fig.
3A) exemplify that MSP has the
resolution to define a population of alleles as completely methylated
or unmethylated, predominantly methylated or unmethylated, or
comprising both methylated and unmethylated alleles.
In normal breast epithelia and E-cad-expressing breast
cancer cell lines (MCF-7, T47D, and ZR-75; Ref. 24), CpG sites within the island were completely unmethylated (island primer sets 1-4, Fig.
3B), while both upstream Alu repeats were extensively
methylated (primer sets Alu1 and Alu2, Fig.
3C). The flanking, non-island CpG sites in exon 2 were also
extensively methylated in the E-cad-expressing cell lines
and in normal breast epithelia (island primer set 6, Fig.
3B), which also displayed methylation at the 3′ edge of the CpG island (island set 5, Fig. 3B). By contrast, the
E-cad-negative breast tumor cell lines (MDA-MB-231, Hs578t,
MDA-MB-435, MCF7ADR, and HBL100; Ref. 24) showed extensive
methylation of the upstream Alu repeats (Fig. 3C), and
virtually all CpG sites examined within the island (representative data
for these cell lines are depicted in Fig. 3B), particularly
those in the region of highest CpG density near the transcription start
site (island set 3, Fig. 3B). Therefore, in normal tissue
and E-cad-positive tumor cells, the CpG island is
unmethylated but is embedded between regions of dense methylation, while the entire region is densely methylated in the
E-cad-negative breast tumor cell lines (summarized in Fig.
2B).
In normal breast epithelia and
E-cad-expressing tumor cell lines, MSP analyses indicated
that the E-cad 5′ CpG island has both 5′ and 3′ boundaries
delineating methylated, flanking region CpG sites from the unmethylated
CpG sites within the island (summarized in Fig. 2B). To
define these borders precisely, we used bisulfite-modified genomic
sequencing to determine the methylation status of 21 CpG sites within
the putative 5′ border region and 24 CpG sites within the 3′ border
region of the E-cad CpG island. A sharp boundary between
unmethylated CpG sites and methylated, or partially methylated, CpG
sites exists at both the 5′ and 3′ regions of the island coincident with clusters of Sp1 elements, in the regions of declining CpG density
(data from normal breast epithelia are depicted in Fig. 4).
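The logic of reading methylation status from bisulfite-modified sequence can be sketched as follows. This is a simplified illustration and not the solid phase sequencing procedure used here: it assumes a set of already aligned, top-strand, bisulfite-converted sequences of the same length as the unconverted reference, and scores each reference CpG as the fraction of sequences retaining C at that position. The function name and the toy data are hypothetical.

```python
from typing import Optional


def cpg_methylation(reference: str, reads: list[str]) -> dict[int, Optional[float]]:
    """Estimate the methylated fraction at each CpG cytosine of the reference.

    A retained C in a converted read is scored as methylated and a T as
    unmethylated. Other bases at the site are ignored. Sites with no
    informative read map to None.
    """
    ref = reference.upper()
    sites = [i for i in range(len(ref) - 1) if ref[i:i + 2] == "CG"]
    result: dict[int, Optional[float]] = {}
    for i in sites:
        calls = [r[i].upper() for r in reads if len(r) > i and r[i].upper() in ("C", "T")]
        result[i] = sum(c == "C" for c in calls) / len(calls) if calls else None
    return result


if __name__ == "__main__":
    reference = "ACGTTCGA"                 # invented reference with CpGs at 1 and 5
    converted = ["ACGTTTGA", "ATGTTTGA"]   # invented converted sequences
    print(cpg_methylation(reference, converted))
```

In such a profile, a boundary of the kind described above would appear as a transition from values near 1 in the flanking region to values near 0 within the island.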
Methylation Patterns of the VHL 5′
To determine whether the patterns of methylation for the E-cad
CpG island were shared by other CpG islands, we examined the VHL CpG island region by MSP in normal kidney tissue and a
renal carcinoma cell line, RFX393, which does not express
VHL (data not shown). The VHL MSP primer sets
(Table I) covered 27 of 120 CpG sites throughout the CpG island region
(Fig. 5A). In RFX393, the
entire VHL CpG island region, including upstream and
downstream Alu repeats, was densely methylated. Likewise, in normal
kidney, the Alu sequences both 5′ and 3′
to the island were extensively methylated, but the CpG island was completely unmethylated (Fig. 5B). Therefore, like the E-cad CpG island region,
the VHL CpG island was unmethylated in normal tissue
(summarized in Fig. 5C) but was flanked by regions of dense
methylation (i.e. Alu repeats). Similarly, the proximal Alu
repeats upstream of the human APRT and TIMP-2
genes were also extensively methylated (illustrated in Fig. 1) while
the central region of each island was unmethylated in breast tumor cell
lines (data not shown).
The Evolution of Aberrant Methylation in the E-cad CpG Island
There is currently little information regarding the evolution of aberrant, de novo methylation of endogenous CpG islands. We have previously demonstrated that overexpression of DNA MTase in SV40-immortalized, IMR-90 fibroblasts can drive the de novo methylation of certain CpG islands, including the E-cad CpG island (19). To examine the time-dependent evolution of aberrant methylation of the E-cad CpG island, we compared the methylation status of the E-cad CpG island and flanking sequences in two Neo control clones and two HMT clones over the course of 40 cell passages.
As in normal breast epithelia, the non-island CpG sites upstream
(Alu2) and downstream of the E-cad CpG island
(island set 6, exon 2) were extensively methylated in the Neo and HMT
clones (Fig. 6). In the Neo clones, the
regions within the CpG island were unmethylated in the early passage
samples (island sets 1-5; Fig. 6). By mid-passage, these regions
generally remained unmethylated in the Neo clones, although some
methylation was evident within the 3′ edge of the island in both clones
(island set 5, Fig. 6) and at the 5′ edge of the island in Neo.20
(island set 1, Fig. 6). These patterns of methylation remained
relatively constant between passages 20 and 39 with the region of
greatest CpG density remaining unmethylated (island sets 2, 3, and 4),
despite methylation at the fringes of the CpG island (island sets 1 and
5, Fig. 6) and dense methylation within the flanking, non-island
sequences (summarized in Fig. 7).
By contrast, the E-cad CpG island became progressively more
methylated with time in two independent cell clones that overexpress human DNA MTase (HMT.19 and HMT.1E1). In the early passage HMT clones,
methylation was evident in all regions examined throughout the island
except in the heart of the island near the transcription start site
(island set 3, Fig. 6). By mid-passage, methylation within the CpG island
had become more prominent (island sets 1-5, Fig. 6) and, for HMT.1E1,
even the area around the transcription start site was almost completely
methylated (island set 3, Fig. 6). By late passage, all of the regions
examined within the island were predominantly methylated in both HMT
clones, including the region spanning the transcription start site
(Fig. 6). These data indicate that aberrant methylation first involved
both the 5′ and 3′ edges of the E-cad CpG island and the
flanking, non-island sequences and, with time, progressively extended
throughout the entire CpG island to include the central area of highest
CpG density near the transcription start site (summarized in Fig.
7).
In the present study, we have sought to understand how CpG islands
may be protected from methylation in normal tissue and how aberrant CpG
island methylation may develop in neoplastic cells. We show that the
CpG islands and flanking sequences of E-cad and
VHL are extensively methylated in tumor cell lines for which
gene expression is extinguished. In normal tissue, both CpG islands are
unmethylated but are immediately flanked by regions of dense
methylation, containing Alu repeats. Additionally, for E-cad, the regions that mark the boundaries between
unmethylated, island CpG sites and the methylated, flanking region CpG
sites contain multiple Sp1 elements located at the 5′ and 3′
edges of the island, where CpG density declines. GenBank sequence analysis revealed that other CpG islands are also positioned immediately downstream from Alu-rich, non-island flanking regions and have multiple
Sp1 elements located both 5′ and 3′ to transcription start (Fig. 1). It
seems plausible that these sequence characteristics may participate in
establishing both normal and aberrant patterns of methylation in these
CpG island regions.
In the rodent aprt CpG island, disruption of a cluster of
Sp1 sites (or G/C boxes) located at the 5′ end of the island elicits de novo methylation of the aprt CpG island (15,
16), even when Sp1-mediated transcription is not disrupted (18). The
de novo methylation of the mouse aprt CpG island
appears to originate within, and spread from, normally methylated B1
repetitive elements located immediately 5′ to the island (17, 18).
Turker and Bestor (26) have proposed that such repetitive sequences,
which are frequently methylated in normal tissue (27, 28), may act as
de novo methylation centers, i.e. cis-acting
elements methylated in normal tissue from which methylation may spread
bidirectionally into adjacent sequences. Such spreading of preimposed
methylation patterns into adjacent sequences has been documented (29).
Furthermore, an Alu element in intron 6 of the human p53 gene has been
shown to act as such a methylation center, directing the ubiquitous methylation of the CpG site in codon 248 (30). B1 and
B2 repetitive elements, the rodent equivalents of the human
Alu family of repetitive sequences (31), may also act as methylation
centers directing the de novo methylation of the rat
α-fetoprotein gene CpG island (32).
Our data show that the Alu repeats within the CpG island regions of
E-cad and VHL are extensively methylated in
normal tissue and tumor cell lines, regardless of gene expression
status or the tissue of origin. Additionally, we show that multiple Sp1 sites exist within both the 5′ and 3′
edges of the E-cad and
VHL CpG islands marking the boundary between the
unmethylated island CpG sites and the methylated, non-island CpG sites.
Furthermore, our multi-gene sequence analysis (Fig. 1) shows that the
Sp1 elements present in most CpG islands (33) are often located both
upstream and downstream of transcription start, perhaps protecting
these islands from the spread of methylation originating in either
flanking region. Consistent with this possibility, we show that
de novo methylation of the E-cad CpG island in
fibroblasts overexpressing DNA MTase begins within both flanking
regions and in sequences at both edges of the island, progressing with
cell passage to include the central region of highest CpG density
within the island (summarized in Fig. 7).
Several scenarios may explain these results. For instance, these data are consistent with the hypothesis that the normally methylated sequences (e.g. Alu repetitive elements) flanking the E-cad CpG island may act as methylation centers directing the spread of methylation toward the island. Alternatively, since sequences with the highest CpG content may be inherently more resistant to the action of DNA MTase (34), it is possible that the lateral regions of the island, which are the least CpG-rich, may be better substrates for DNA MTase than the area of highest CpG density. Therefore, methylation may accumulate more readily in these border regions, independent of the methylation status of the flanking, non-island sequences. While we cannot currently distinguish between these two possibilities, it is clear that the region of highest CpG density remained most resistant to methylation.
In summary, the data in this report, in conjunction with previous
reports on the APRT CpG island (15-19, 26), suggest that CpG islands
may commonly be juxtaposed with densely methylated, Alu-rich regions
and may be protected from the influence of these methylated flanking
sequences by clusters of Sp1 elements at both the 5′ and 3′
sides of
the island. During tumorigenesis, the protection mediated by these
Sp1-rich barrier regions erodes, perhaps subsequent to a decrement in
transcription factor activity (14) and/or dysregulated DNA MTase
activity. Consequently, methylation may progressively spread from
normally methylated, flanking regions (i.e. methylation
centers) into an adjacent CpG island. The data in this report relating
sequence features common to a number of CpG islands to the patterns of
CpG island methylation in normal and neoplastic tissue may provide
insight for elucidating the mechanisms underlying
methylation-associated tumor suppressor gene silencing during tumor
evolution.
We thank Drs. Rena G. Lapidus, Nancy Davidson, Sigmund Weitzman, Helene Smith, and Michael Lerman for providing DNA from normal breast epithelia, some breast cancer cell lines and the renal carcinoma cell line RFX393. We also thank Tammy Means for secretarial assistance. Finally, we thank Dr. M. S. Turker for helpful discussion and critical review of this manuscript.