(Received for publication, April 3, 1997)
From the Departments of Pediatrics, Genetics, and Internal Medicine, Yale University School of Medicine, New Haven, Connecticut 06510 and the Division of Hematology/Oncology, Children's Hospital, and Dana Farber Cancer Institute, Department of Pediatrics, Harvard Medical School, Boston, Massachusetts 02115
Ankyrin-1 (ANK-1) is an erythrocyte membrane protein that is defective in many patients with hereditary spherocytosis, a common hemolytic anemia. In the red cell, ankyrin-1 provides the primary linkage between the membrane skeleton and the plasma membrane. To gain additional insight into the structure and function of this protein and to provide the necessary tools for further genetic studies of hereditary spherocytosis patients, we cloned the human ANK-1 chromosomal gene. Characterization of the ANK-1 gene genomic structure revealed that the erythroid transcript is composed of 42 exons distributed over ~160 kilobase pairs of DNA. Comparison of the genomic structure with the protein domains reveals a near-absolute correlation between the tandem repeats encoding the membrane-binding domain of ankyrin with the location of the intron/exon boundaries in the corresponding part of the gene. Erythroid stage-specific, complex patterns of alternative splicing were identified in the region encoding the regulatory domain of ankyrin-1. Novel brain-specific transcripts were also identified in this region, as well as in the "hinge" region between the membrane-binding and spectrin-binding domains. Utilization of alternative polyadenylation signals was found to be the basis for the previously described, stage-specific 9.0- and 7.2-kilobase pair transcripts of the ANK-1 gene.
Erythrocyte ankyrin, ankyrin-1, is the prototype of a family of homologous proteins that are involved in the local segregation of integral membrane proteins within functional domains on the plasma membrane, linking the cytoplasmic domains of integral membrane proteins to the spectrin-actin based membrane skeleton (1-5). This important cellular localization of membrane proteins may be provided by the relative affinities of different isoforms of ankyrin for target proteins. This specialization appears to have evolved through the tissue-specific, developmentally regulated expression of multiple protein isoforms.
The molecular mechanisms by which ankyrin has acquired distinct isoforms with specialized functions are beginning to be revealed. The isoform diversity of ankyrin arises from different gene products and from differential, alternate splicing of the same gene product (6-19). The cDNAs for three human ankyrin proteins, ankyrin-1 (erythrocyte ankyrin, ANKR) (6, 7), ankyrin-2 (brain ankyrin, ANKB) (8), and ankyrin-3 (general ankyrin, ANKG) (11), have been cloned and their gene products studied. Ankyrins share a common protein structure consisting of an NH2-terminal membrane-binding domain, a spectrin/fodrin-binding domain, and a COOH-terminal regulatory domain (2-4). Ankyrin binding has been described for a variety of proteins including membrane skeleton proteins, ion transport proteins, and cell adhesion molecules (reviewed in Refs. 1-4).
The membrane-binding domain of ankyrin-1 is composed of 24 tandem repeats of approximately 33 amino acids folded into a nearly spherical structure (6, 7, 20). Homologous ankyrin repeats are found in many proteins with diverse functions and in virtually all cellular compartments (4, 21, 22). They have been found in a variety of organisms including yeast, viruses, bacteria, worms, plants, insects, and vertebrates. Ankyrin repeats interact with an assorted array of proteins suggesting that ankyrin repeats function as a general protein-binding motif. In ankyrin-1, the 24 tandem repeats associate into four independently folded subdomains, each comprised of six ankyrin repeats (23). Different combinations of ankyrin repeat subdomains and varying determinants of the same combination of repeat subdomains lead to diversity in mediating ankyrin-membrane protein binding (24, 25).
The region encoding the regulatory domain of ankyrin-1 is subject to alternate splicing (6, 7, 15-18). One of these alternate splices creates the "2.2" isoform of ankyrin-1 due to the deletion of 162 amino acids (6, 7). This 2.2 isoform exhibits greater affinity for spectrin and band 3 (26). These 162 amino acids may act as a repressor by binding back on the rest of the ankyrin-1 molecule, leading to allosteric changes (27). Alternate splicing of the region encoding the very COOH terminus of ankyrin-1 produces several isoforms with varying carboxyl termini (6, 7, 15-18). The functional significance of these different isoforms is unknown. The alternately spliced, extreme COOH-terminal regions are among the few regions of the regulatory domain highly conserved in ankyrin-1 between man and mouse (12). In the erythrocyte, ankyrin-1 provides the primary linkage between the spectrin-actin based membrane skeleton and the plasma membrane by attaching tetramers of spectrin to the cytoplasmic domain of band 3, the anion exchanger (20, 28, 29). Ankyrin-1 has been implicated in many cases of hereditary spherocytosis (HS), a common inherited hemolytic anemia characterized by the presence of spherical red cells on peripheral blood smear with increased red cell osmotic fragility. Evidence for a role of ankyrin in the pathogenesis of HS comes from a variety of sources including biochemical and genetic studies of HS patients (reviewed in Ref. 30). In some cases of HS, mutations in the ankyrin-1 gene (ANK-1) have been identified (31-37). An ankyrin-linked, murine model of HS, the nb/nb mouse, has also been described (38-43).
Ankyrin-1 is expressed not only in erythroid tissue but also in neural (6, 15-18, 43, 44) and skeletal muscle (18, 19) tissue. The primary structure of human ankyrin-1, deduced from sequence of clones obtained from a reticulocyte cDNA library, encodes a mature protein of 1881 amino acids (6, 7). Northern blot analyses demonstrated ankyrin-1 transcripts of 9 and 7.2 kb in erythroid tissues, with the 9-kb transcript predominating early in erythroid differentiation and the 7.2-kb transcript predominating in reticulocytes (6, 7, 40, 45). Only the 9-kb transcript is found in brain (6, 15, 40, 45). Multiple transcripts of varying size are present in muscle (9).
To gain additional insight into the structure and function of ankyrin-1
and to provide the necessary tools for further genetic studies of HS
patients, we cloned the chromosomal gene encoding ankyrin-1 and
characterized its genomic structure. We constructed composite human
erythroid ANK-1 cDNAs, including previously unpublished 3'-untranslated sequences. In the region encoding the regulatory domain, we identified novel erythroid- and brain-specific transcripts of the human ANK-1 gene created by alternate splicing. The molecular basis of the previously identified erythroid 9- and 7.2-kb ANK-1 mRNA transcripts was found to be the developmental stage-specific use of alternate polyadenylation signals.
Overlapping human ANK-1 cDNA fragments isolated and sequenced by Lux et al. (7) that correspond to the entire coding region were used as hybridization probes to screen a human genomic DNA library. The library is a Charon 4A bacteriophage library containing fragments of human genomic DNA partially digested with AluI and HaeIII with EcoRI linkers added (46). A second genomic DNA library, DuPont Merck Pharmaceutical Co. human foreskin fibroblast P1 library (number 1), was screened with two oligonucleotide primers, A and B (Table I), as described (47). These intronic primers flank exon 2 of the erythroid ANK-1 gene (see below) and amplify an ~450-bp fragment from genomic DNA. Selected recombinants or DNA fragments that hybridized to the screening probes were purified and subcloned into pGEM-7Z plasmid vectors (Promega Corp., Madison, WI). Subcloned fragments were analyzed by restriction endonuclease digestion, Southern blotting, and nucleotide sequencing. Prior to sequencing, some plasmid inserts were shortened using the ExoIII unidirectional deletion technique (48).
Nucleotide sequencing was performed using the dideoxy chain termination method of Sanger et al. (49) with T7 DNA polymerase (Sequenase, U.S. Biochemical Corp.). The sequencing primers were the Sp6 or T7 promoter primers of the pGEM-7Z plasmid vector or, for some reactions, synthetic oligonucleotides corresponding to known cDNA sequences. Deoxyinosine triphosphate was substituted for deoxyguanosine triphosphate to resolve band compressions and ambiguities (50).
RNA Isolation
Total RNA was prepared from human fetal liver tissue and human bone marrow using the guanidinium-thiocyanate-chloroform method as described (51). Human reticulocyte RNA was prepared as described using acid precipitation (52).
3' RACE
Total human fetal liver RNA (1 mg) was reverse-transcribed with avian myeloblastosis virus reverse transcriptase and an oligo(dT) adapter primer as described (53). One-tenth of the reverse-transcribed cDNA was amplified by PCR using an adapter primer and one of two gene-specific primers (E or F, Table I). Amplification products were subcloned and nucleotide sequences determined.
Preparation and Amplification of cDNA and cDNA Libraries
cDNA was prepared by reverse transcription of total fetal liver, bone marrow, or reticulocyte RNA using reverse transcriptase of avian myeloblastosis virus. Primers C or D (Table I) were used for reverse transcription. These cDNAs were used as templates in PCR amplification using an automated DNA thermal cycler (Perkin-Elmer) as described previously (52). In some reactions, human cDNA libraries were used as templates in the polymerase chain reaction. These included an oligo(dT)-primed human fetal liver cDNA library in λgt11 (54), an oligo(dT)-primed human bone marrow cDNA library in λgt11 (CLONTECH, Palo Alto, CA), an oligo(dT)-primed human reticulocyte cDNA library in λgt11 (a kind gift of Dr. John Conboy), an oligo(dT)-primed human brain cDNA library in λgt11 (CLONTECH), and an oligo(dT)-primed human cerebellar cDNA library in λgt11 (CLONTECH). Oligonucleotide primers used in PCR reactions are listed in Table I.
Multiple tissue Northern blots containing 2 µg of poly(A)+ mRNA per tissue were obtained from a commercial source (CLONTECH). Three probes were used in Northern blotting. Probe 1 is pAnk15, an ~2.3-kb ankyrin cDNA fragment of Lux et al. (7). Probe 2 is a 1.1-kb ApaI genomic DNA fragment that contains ~0.75 kb of the very 3' end of the ANK-1 cDNA. Probe 3 is a 2.0-kb human β-actin cDNA fragment used as a control for loading in Northern blot analyses (55).
Computer-assisted analyses of derived nucleotide and predicted amino acid sequences were performed utilizing the sequence analysis software package of the University of Wisconsin Genetics Computer Group (UW GCG; Madison, WI) (56) and the BLAST algorithm, National Center for Biotechnology Information (Bethesda, MD) (57).
Primary screening of the human genomic DNA library in bacteriophage λ with cDNA probes yielded a large number of hybridization-positive plaques. Selected recombinants were analyzed. Twelve overlapping clones were identified that spanned over 190 kb of genomic DNA containing the ANK-1 gene (Fig. 1). An EcoRI restriction map of this 190-kb region is shown in Fig. 2. Two additional clones, λ217 and λ261, were isolated that contained ankyrin cDNA fragments but that did not overlap with the other ankyrin clones or each other (Fig. 1). The PCR-based screening of the human genomic DNA P1 library yielded four PCR-positive clones. Of these clones, clone 2032 (DMPC-HFF#1-795 A9) was further analyzed (Fig. 1). This clone contained genomic DNA spanning >80 kb of DNA and covered all the previously nonoverlapping areas. An XbaI restriction map of the 5' end of this clone that contains intron 1 (42 kb) and intron 2 (24 kb) is shown in Fig. 3.
Mapping the Exon/Intron Junctions of the ANK-1 Gene
The erythroid transcript of the human ANK-1 gene is encoded by 42 exons (Table II). Twenty-seven of the 42 exons are relatively short, 132 bp in length. The first and forty-second exons contain untranslated sequences. Comparison of the exon/intron boundaries with reported consensus sequences reveals that the ag:gt rule was not violated in any junction (58, 59). There are AG dinucleotides within the 15 bp upstream of the 3' (acceptor) splice junctions in five exons, 6, 16, 26, 33, and 40.
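The ag:gt check described above can be sketched in a few lines of Python. This is only a minimal illustration, not the procedure used in this study: the function names, the 15-bp default window, and the toy intron sequences are all hypothetical and are not ANK-1 sequence.

```python
def obeys_gt_ag_rule(intron: str) -> bool:
    """True if the intron starts with GT (donor) and ends with AG (acceptor)."""
    seq = intron.lower()
    return len(seq) >= 4 and seq.startswith("gt") and seq.endswith("ag")


def upstream_ag_count(intron: str, window: int = 15) -> int:
    """Count AG dinucleotides in the `window` bp just upstream of the acceptor AG."""
    seq = intron.lower()
    # Drop the terminal acceptor "ag", then keep the last `window` bases.
    upstream = seq[:-2][-window:]
    return upstream.count("ag")


# Hypothetical toy introns (assumptions for illustration only).
toy_introns = {
    "intron_a": "gtaagtcctttctctcttccag",
    "intron_b": "gcaagtcctttctctcttccag",  # violates the donor consensus
}
rule_results = {name: obeys_gt_ag_rule(seq) for name, seq in toy_introns.items()}
```

In this sketch an intron with one or more upstream AG dinucleotides would simply be flagged for closer inspection; the counts carry no functional interpretation by themselves.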
Comparison of the genomic structure with the protein
domains reveals a near absolute correlation between 20 of the 24 tandem repeats encoding the membrane-binding domain of ankyrin with the location of the intron/exon boundaries in the corresponding part of the
gene (Fig. 4). These tandem repeats of
ankyrin are thought to fold into four subdomains of six repeats each
(23). There is no correlation between the genomic structure and the
four subdomains; however, fused exons do not cross any of the four
subdomain boundaries. There is no correlation between genomic structure
and the subdomains of the spectrin-binding domain except for the
beginning of exon 26 and the beginning of the neutral subdomain (codon
913) as delineated by Platt et al. (29).
Polymorphisms of the ANK-1 Gene
A number of polymorphisms are
present in the ANK-1 gene. A highly polymorphic AC
dinucleotide repeat is present in the 3'-untranslated region beginning at
nucleotide 6589 (Fig. 1) (7, 60, 61). To date, we have identified eight
different alleles at this site by PCR typing according to the method of
Weber and May (62). An informative NcoI polymorphism,
initially described by Costa et al. (63, 64), has been
widely used in linkage analysis. We have mapped this polymorphism to a
location in intron 38 (Fig. 1). Differences observed in the nucleotide
sequences of the coding region derived from cDNA and genomic clones
are shown in Table III. Only one of these
polymorphisms, G to C at nucleotide 3049, changes an amino acid,
Val989→Leu. The locations of a number of other
polymorphisms present in the ankyrin cDNA have recently been
reported (31).
Multiple isoforms of the ankyrin-1 protein, designated 2.1, 2.2, 2.3, 2.4, 2.6, and 2.9, have been previously identified (4, 65-67). Band 2.1 is the predominant isoform, and band 2.2 is the most prominent minor species. cDNA cloning of the ANK-1 gene identified several mRNA isoforms that may encode different protein isoforms (6, 7). Two cDNA clones encoding the regulatory domain of ankyrin differed by a 486-bp in-frame deletion resulting in the deletion of 162 highly acidic amino acids. It has been shown that band 2.1 is encoded by the cDNA clone without the deletion, and band 2.2 is encoded by the clone with the deletion. We have previously shown that this "activated" band 2.2 isoform of ankyrin is created by the use of an alternate acceptor site in exon 38 (17).
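The relationship between the 486-bp deletion and the loss of 162 amino acids is simple codon arithmetic, sketched below. The function name is hypothetical; the only figures taken from the text are 486 bp and 162 residues.

```python
def residues_removed(deleted_bp: int) -> int:
    """Amino acids removed by an in-frame deletion of `deleted_bp` nucleotides.

    Raises ValueError when the length is not a multiple of 3, because such a
    deletion would shift the reading frame instead of removing whole codons.
    """
    if deleted_bp % 3 != 0:
        raise ValueError(f"{deleted_bp} bp is not a multiple of 3 (frameshift)")
    return deleted_bp // 3


# 486 bp / 3 nucleotides per codon = 162 codons.
residues_lost = residues_removed(486)
```

Because 486 is divisible by 3, the reading frame downstream of the deletion is preserved, which is what "in-frame" means here.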
Cloning from a human reticulocyte cDNA library also identified three isoforms with differing sequences at the COOH terminus of the protein (6, 7). Isoform 1 encodes an acidic COOH terminus 33 amino acids in length, isoform 2 encodes a basic COOH terminus 32 amino acids long, and isoform 3 encodes a neutral COOH terminus ending in the last 8 amino acids of the acidic isoform 1. Knowledge of the exon/intron organization of the ANK-1 gene allows us to determine the precise molecular basis of these alternate splicing events. The pattern of splicing involves splicing of entire exons, partial splicing of exons, and piecemeal splicing of individual exons. Isoform 1 is the full-length cDNA that encodes the COOH terminus of the major ankyrin 2.1 isoform, isoform 2 lacks 25 amino acids due to the use of an alternative acceptor splice site in exon 41, and isoform 3 is created by the same alternate acceptor splice site in exon 41 used in isoform 2 and the use of additional new alternate splice donor and acceptor sites in exon 41.
Using human fetal liver, bone marrow, and reticulocyte cDNA or cDNA libraries as a template, we amplified the region of the ANK-1 cDNA encoding the regulatory domain with primers G and H (Table I). These primers are placed 5' of the 2.1/2.2 splice (G) and in the 3'-untranslated region (H) and are designed to amplify almost the entire regulatory region including the sites of all previously described splices. Southern blot analysis of amplification products using genomic DNA clones corresponding to the entire regulatory region as hybridization probes is shown in Fig. 5A. This pattern of amplification products is highly reproducible. Nucleotide sequencing of shotgun subcloned amplification products identified 12 separate erythroid isoforms (Isoforms 1-12, Fig. 5B), and 4 of these (1-3 and 5) are the isoforms cloned from reticulocyte cDNA. Isoforms 1-12 were found in amplified fetal liver and bone marrow cDNA; isoforms 9 and 10 were not found in amplified reticulocyte cDNA.
Novel Alternately Spliced Isoforms of ANK-1 mRNA Are Specific to Brain
Additional ANK-1 gene transcripts in the
region encoding the regulatory domain have previously been amplified
from human, murine, and rat brain cDNA (12-15). We looked for
these brain-specific cDNA transcripts with PCR using human brain
and cerebellar cDNA libraries as a template and, as above, primers
G and H (Fig. 5A). In brain, isoforms 5-8 and 13-15 were
found (Fig. 5B). Isoforms 13 and 14 contain a motif (motif
3, shown shaded black in Fig. 5B) present in exon
41 (motif 3, Fig. 6A) that is
expressed in neural but not erythroid tissue. This previously
unidentified motif encodes a peptide with a predicted molecular mass of
8.6 kDa and a pI of 8.3. This motif has a hydrophilic NH2
terminus and a hydrophobic COOH terminus. Data base searching with this motif did not reveal any significant homologies to known genes. At the 5' end of the motif, 66 of 68 nucleotides matched a sequence deposited in the Non-redundant Data base of GenBank EST Division (dbEST T48090).
An Alternately Spliced Exon of the ANK-1 mRNA at the "Hinge" between the Membrane-binding and Spectrin-binding Domains Expressed in Brain
We amplified the "hinge" region between
the membrane-binding and spectrin-binding domain using primers I + J
(Table I). A 24-bp insertion encoding an in-frame, 8-amino acid
sequence was identified. This sequence encodes a neutral peptide with a
predicted molecular mass of 843 Da and a pI of 7.5. Analysis of genomic DNA clones shows that this sequence (Fig.
7) is encoded by its own exon, designated 22a in Table II. The sequence
is nearly identical to that found in mouse and rat at the same location
(Fig. 7) (12, 15). There is no homology of this sequence with either of
the two short sequences that are inserted at the hinge between
membrane-binding and spectrin-binding domains of murine
ank-3 cDNA (13).
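A predicted molecular mass such as the one quoted above for the 8-residue peptide is conventionally obtained by summing average residue masses and adding one water molecule. The sketch below illustrates that calculation only. The peptide shown is hypothetical (the actual sequence appears in Fig. 7 and is not reproduced here), the residue masses are standard average values rounded to two decimals, and the software used in this study may have applied slightly different values.

```python
# Average residue masses in daltons (amino acid minus one water), rounded.
RESIDUE_MASS = {
    "G": 57.05, "A": 71.08, "S": 87.08, "P": 97.12, "V": 99.13,
    "T": 101.10, "C": 103.14, "L": 113.16, "I": 113.16, "N": 114.10,
    "D": 115.09, "Q": 128.13, "K": 128.17, "E": 129.12, "M": 131.19,
    "H": 137.14, "F": 147.18, "R": 156.19, "Y": 163.18, "W": 186.21,
}
WATER = 18.015  # one water is added back for the free NH2 and COOH termini


def average_mass(peptide: str) -> float:
    """Average molecular mass (Da) of a linear, unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide.upper()) + WATER


# Hypothetical 8-residue peptide, used only to exercise the function.
toy_peptide = "GASPVTLN"
toy_mass = average_mass(toy_peptide)
```

An isoelectric point estimate would additionally require pKa values for the termini and ionizable side chains, and is not attempted in this sketch.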
The 3
To identify the 3' end of the human ANK-1 cDNA, 3' RACE experiments were performed. The last 2 nucleotides of the most 3' end of the previously reported sequence, GG (7), are actually AC, and are followed by 1250 bp of additional sequence (Fig. 6B). Thus exon 42 is 2503 bp in length. Two polyadenylation signals are present in this sequence (see below). Remarkably, there was 76% similarity between the human and murine ANK-1 3'-untranslated regions over a 2561-bp region. There was no similarity between the human ANK-1 gene 3'-untranslated region and the 3'-untranslated regions of the human ANK-2 gene, the human ANK-3 gene, or the murine ank-3 gene.
Polyadenylation signals are located at positions 7014-7019 and 8484-8489 bp in the 3'-untranslated region of the cDNA. To determine if both polyadenylation signals are utilized, Northern blots were hybridized to cDNA sequences upstream and downstream of the 5' polyadenylation signal. While Northern blots clearly show the presence of two transcripts of ~9 and ~7.2 kb in length in fetal liver RNA when the upstream probe 1 (Fig. 8A) is used (Fig. 8B), only a transcript of ~9 kb is detected when the downstream probe 2 (Fig. 8A) is hybridized to the same Northern blots (Fig. 8B). Thus, these transcripts are most likely the result of alternate polyadenylation.
To provide additional evidence that these alternate polyadenylation signals are utilized in mRNA, we performed 3' RACE with an oligo(dT) primer (primer D, Table I) for reverse transcription using total human fetal liver RNA as template. Gene-specific primers, E or F, and a linker primer were used in PCR amplification of the cDNA. Poly(A) tails were discovered 17 and 16 nucleotides, respectively, downstream of the 5' and 3' polyadenylation signals (Fig. 9). Thus the molecular basis of the developmental stage-specific 9- and 7.2-kb transcripts is the use of alternate polyadenylation signals in the 3'-untranslated region of the ANK-1 cDNA.
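Locating candidate polyadenylation signals in a cDNA sequence amounts to a hexamer search, which can be sketched as follows. This is an illustration under stated assumptions rather than the analysis performed here: the function name and the toy sequence are hypothetical, and only the canonical AATAAA (AAUAAA in RNA) hexamer is searched, although variant signals exist.

```python
from typing import List


def find_polya_signals(cdna: str, signal: str = "AATAAA") -> List[int]:
    """Return the 1-based start position of every occurrence of `signal`.

    The input may be written as DNA or RNA in either case; U is treated as T.
    Overlapping occurrences are reported because the search resumes one base
    after each hit.
    """
    seq = cdna.upper().replace("U", "T")
    positions = []
    start = seq.find(signal)
    while start != -1:
        positions.append(start + 1)  # convert 0-based index to 1-based
        start = seq.find(signal, start + 1)
    return positions


# Hypothetical toy 3'-untranslated sequence with two signals (not ANK-1).
toy_utr = "ccgg" + "aataaa" + "gtc" * 5 + "aauaaa" + "tt"
toy_positions = find_polya_signals(toy_utr)
```

A hit from such a scan only marks a candidate signal; whether it is actually used still has to be shown experimentally, as was done here with Northern blotting and 3' RACE.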
Genetic analyses of patients with hereditary spherocytosis have been previously hampered by the lack of knowledge of the sequences of the ANK-1 chromosomal gene. A variety of mutations causing human disease have been described that affect RNA processing and translation (68). Many of these mutations are associated with dramatic decreases in steady state mRNA levels. This observation has important implications for the methodologies employed in mutation detection. Reverse-transcriptase PCR-based techniques are unlikely to detect the mutations with decreased mRNA levels, necessitating study of these mutations at the genomic DNA level (69). Characterization of the genomic structure of the erythroid transcript of ANK-1 allows structural studies of the ANK-1 gene in patients with hereditary spherocytosis using genomic DNA.
Ankyrin repeats (also referred to as cdc10 repeats, cdc10/SWI6 repeats, and SWI6/ANK repeats) are found in varying numbers in a large number of functionally distinct proteins involved in assorted molecular associations including protein-protein, intramolecular, and DNA interactions (20, 21). Determination of the crystal structure of 53BP2, an ankyrin repeat-containing protein identified to bind the p53 tumor suppressor in vitro, revealed that individual ankyrin repeats have an L-shaped structure consisting of a β-hairpin followed by 2 α-helices that pack in an antiparallel fashion (70). Adjacent repeats pack via both their β-hairpins, forming a continuous β-sheet, and via their helix pairs, forming helix bundles. Because the folding of an ankyrin repeat appears to depend on the presence of adjacent repeats, it seems unlikely that individual repeats are able to fold normally. This is compatible with the observation that almost all ankyrin repeat-containing proteins have four or more repeats (21). As revealed by the crystal structure, the nonglobular structure of ankyrin repeats offers many possibilities for diverse macromolecular interactions (70). This diversity, as well as specificity, is enhanced by variations in amino acid sequence of individual ankyrin repeats and their flanking sequences.
Prior to the availability of the crystal structure of an ankyrin repeat, attempts to identify the phasing of ankyrin repeats were complicated by the fact that there was no evidence of single repeats encoded by discrete exons. Intron/exon boundaries were found at varying locations within ankyrin repeats of different genes (71-76). The crystal structure suggests that ankyrin repeats begin with the consensus sequence -(N/D)- - - - - G-TPLH-AA (dashes indicate nonconserved amino acids) (70). Several ankyrin repeat-containing proteins, such as forked and plutonium of Drosophila, begin their repeat domains exactly with this phasing (71, 72). All three human ankyrins begin and end their repeat domains (repeats 1 and 24) with "partial" repeats (6-8, 11). Closer inspection suggests that these sequences are likely to be the initial β-strand of the β-hairpin of the first ankyrin repeat. Interestingly, comparison of the repeats in ankyrin-1, ankyrin-2, and ankyrin-3 shows that 10 residues of the repeat, TPLH-AA- - G, are highly conserved. These conserved residues are predicted to be located in the first α-helix with individual residues playing critical roles; e.g. Thr in initiating α-helices, Gly in terminating α-helices, and His in supporting inter-repeat stabilization.
The sequences of ankyrin repeats in members of similar gene families are highly conserved, suggesting a common ancestor prior to divergence of individual members of a gene family. The ankyrin repeats of IKBA, BCL3, and NFKB2, members of the IκB family, are highly conserved, as are the genomic structures of these three genes (74-76). The repeats of human ankyrin-1, ankyrin-2, and ankyrin-3 are also highly conserved. Although the genomic structures of the human ANK-2 and ANK-3 genes have not been reported, the intron/exon boundaries of a single ANK-2 gene exon (77) correspond exactly to the locations of intron/exon boundaries in the ANK-1 gene. Conservation of ankyrin repeats, as well as conservation of regions critical for spectrin binding, together provide strong evidence for a common origin for the different ankyrin genes, diverging after the formation of the membrane-binding and spectrin-binding domains.
The data presented here suggest that there is developmental stage- and
tissue-specific diversity in ANK-1 cDNA transcripts. It
is likely that additional transcripts have yet to be discovered. These
transcripts, if they are translated into functional proteins, may well
play important roles in ankyrin-1 function, either as linker/adapter
molecules or in other yet undiscovered capacities. It will be
interesting to determine if these isoforms encode the previously
described 2.3, 2.4, 2.6, and 2.9 ankyrin isoforms and the other
isoforms detected on Western blots (4, 66). The pattern of multiple
ankyrin isoform expression observed due to alternate splicing is
similar to that observed for other red cell membrane proteins such as
-spectrin, protein 4.1, and tropomyosin (78-81).
Cleavage of primary mRNA transcripts and the addition of poly(A) to the newly formed 3' end of the transcript downstream of the highly conserved polyadenylation signal AAUAAA are features of almost all eukaryotic mRNAs (82). 3'-Untranslated regions can have an important influence on mRNA function including translation, localization, stability, and gene transcription (82, 83). Alternate polyadenylation of 3'-untranslated regions may play a role in determining developmental or tissue-specific preferences for various mRNA transcripts (83, 84). For example, alternative polyadenylation
of the murine β-tropomyosin mRNA determines differentiation-dependent transcripts in BC3H1 muscle cells (85). The role(s) of the two ANK-1 gene transcripts that vary in their 3'-untranslated regions is (are) unknown, but considering the remarkable homology between the 3'-untranslated regions of the human and murine erythrocyte ankyrin genes, it seems likely that this region has an important function.
The nucleotide sequence(s) reported in this paper has been submitted to the GenBankTM/EMBL Data Bank with accession number(s) U49691, U50092-U50133.