(Received for publication, April 24, 1995; and in revised form, September 6, 1995)
cDNA clones encoding proteins related to the aggrecan/versican family of proteoglycan core proteins have been isolated with antisera against rat brain synaptic junctions. Two sets of overlapping cDNAs have been characterized that differ in their 3′-terminal regions. In Northern analyses, probes derived from unique regions of each set hybridized with two brain-specific transcripts of 3.3 and 3.6 kilobases (kb). The 3.6-kb transcript encodes a polypeptide that exhibits 82% sequence identity with bovine brevican and is thought to be the rat ortholog of brevican. Interestingly, the polypeptide deduced from the open reading frame of the 3.3-kb transcript is truncated just carboxyl-terminal of the central domain of brevican and instead contains a putative glypiation signal. Antibodies raised against a bacterially expressed glutathione S-transferase-brevican fusion protein have been used to show that both soluble and membrane-bound brevican isoforms exist. Treatment of the crude membrane fraction and purified synaptic plasma membranes with phosphatidylinositol-specific phospholipase C revealed that isoforms of brevican are indeed glycosylphosphatidylinositol-anchored to the plasma membrane. Moreover, digestions with chondroitinase ABC have indicated that rat brevican, like its bovine ortholog, is a conditional chondroitin sulfate proteoglycan. Immunohistochemical studies have shown that brevican is widely distributed in the brain and is localized extracellularly. During postnatal development, amounts of both soluble and phosphatidylinositol-specific phospholipase C-sensitive isoforms increase, suggesting a role for brevican in the terminally differentiating and the adult nervous system.
Morphogenesis and differentiation as well as functional
plasticity in the brain involve a variety of different interactions
between neurons and their environment. The extracellular matrix (ECM), a complex agglomerate of glycoproteins and proteoglycans
located in the extracellular space, is prominently involved in these
interactions. More than 20 different proteoglycan core proteins have
been reported to occur in the developing and adult rat
brain (1), most of them bearing either chondroitin sulfate or
heparan sulfate as glycosaminoglycan (GAG) moieties. A variety of
different genes encode brain proteoglycan core proteins that can occur
as secreted molecules, such as CAT 301 (2) or the T1
antigen (3), as integral membrane proteins, e.g. NG2 (4) or N-syndecan (5), or as membrane-anchored
forms, such as glypican (6, 7) or
cerebroglycan (8). The diversity of proteoglycan core proteins
is further amplified by alternative processing at the
post-transcriptional and post-translational levels (for review, see (9)).
One of the most intensely studied families of proteoglycans is the aggrecan/versican family. Members of this family include the cartilage proteoglycan aggrecan (10, 11, 12), versican, a molecule originally identified in fibroblasts (13) but also highly expressed in the central nervous system (14), as well as the brain-specific proteoglycans neurocan (15) and brevican (16). They share a number of structural features. The amino-terminal domains mediate the interaction with hyaluronic acid (HA) as well as with the link protein, a relatively small polypeptide involved in the aggregation of proteoglycans (17). The link protein consists basically of an HA binding domain. BEHAB, a protein deduced from its cDNA sequence, was suggested to constitute a brain-specific link protein (18, 19).
The domain structure of the core proteins of aggrecan/versican family members is reflected at the genomic level. The genes for rat (20) and mouse aggrecan (21) as well as human versican (22) have a very similar exon/intron organization that resembles the arrangement of functional domains of the corresponding proteins.
All known members of the aggrecan/versican family are soluble chondroitin sulfate proteoglycans, whereas most of the membrane-spanning or glycosylphosphatidylinositol (GPI)-anchored proteoglycans carry heparan sulfate or keratan sulfate side chains (9). Here we report the identification and characterization of two sets of cDNAs encoding isoforms of a member of the aggrecan/versican family of proteoglycan core proteins expressed in the rat brain. One isoform most likely constitutes the rat homolog of bovine soluble brevican, whereas another group of isoforms represents the first example of GPI-anchored proteins of the aggrecan/versican family. GPI-anchored brevican isoforms are up-regulated late during postnatal development.
Figure 1: Physical map and overlapping cDNA clones encoding brevican isoforms. The structural features of brevican isoforms (boxed) are indicated with respect to their mRNAs. Numbering of amino acids is as in Fig. 2. Abbreviations are as follows: Ig, immunoglobulin; PTR, proteoglycan tandem repeat; CRP, complement regulatory protein. The extension of four fully analyzed cDNAs is shown. Clone 37c2/12 was used to produce MS2 polymerase and glutathione S-transferase fusion proteins of brevican. The sequence deviation of clone 37c2/10 from other brevican cDNAs is indicated by a boldface line. The extension of PCR probes derived from the unique regions of the 3.6-kb mRNA (probe 1) and the 3.3-kb mRNA (probe 2) for the detection of these transcripts on Northern blots (compare Fig. 3) is indicated.
Figure 2: Nucleotide sequence and deduced amino acid sequence of brevican isoforms. A, sequence of the cDNAs encoding the secreted rat brevican isoform. The putative signal peptide is underlined; potential N-glycosylation sites are indicated by an open triangle; Ser-Gly and Gly-Ser dipeptide sequences representing potential chondroitin sulfate attachment sites are double underlined with dashed lines; cysteine residues conserved between rat and bovine brevican (16) are indicated by dots. B, sequence of cDNA clone 37c2/10 spanning the unique 3′ region that encodes the glypiation signal of GPI-anchored brevican. The beginning of the specific sequence at nucleotide 1982 is indicated. The potential splice donor site is underlined with dots. The Ser residue serving as potential site of GPI anchor addition is marked by a filled triangle. The nucleotide sequence of A is available from the GenBank/EMBL/DDBJ data bases under the accession number X79881. The sequence of B has been submitted.
Figure 3: Northern analysis of brevican transcripts. Nylon filters containing 20 µg/lane total RNA from 30-day-old rats were hybridized with radiolabeled 37c2 cDNA (lanes 1-8) or with specific PCR products derived from nucleotides 2326-2618 of the 3.6-kb cDNA (lane 9; probe 1 in Fig. 1) or from nucleotides 1997-2283 of the 3.3-kb cDNA (lane 10; probe 2 in Fig. 1). Tissue distribution was as follows: lane 1, brain stem, striatum, thalamus, and hypothalamus; lane 2, hippocampus; lane 3, cerebellum; lane 4, cerebral cortex; lane 5, liver; lane 6, heart; lane 7, skeletal muscle; lane 8, C6 glioma cells; lanes 9 and 10, cerebral cortex. Note that to detect the 3.3-kb transcript, the filter showing lanes 1-8 had to be overexposed. A shorter exposure visualizes only the 3.6-kb transcript (compare (18)). The hybridization signals to total RNA from cerebral cortex (lanes 9 and 10) demonstrate that cDNA clone 37c2/15 derives from the 3.6-kb mRNA, whereas clone 37c2/10 originates from the 3.3-kb mRNA.
Separation of proteins by SDS-polyacrylamide gel electrophoresis on 5-20% gels under fully reducing conditions and transfer onto nitrocellulose were performed as described previously (24). Western blots were immunodeveloped by overnight incubation with primary antibody and processed employing the ECL detection system (Amersham Corp.).
Phospholipase treatment was carried out using 0.6 units of PI-PLC (Oxford GlycoSystems) and 0.03 units of chondroitinase ABC/30 µg of membrane proteins as described by (29).
Various functional domains can be identified in the protein encoded by the 37c2 cDNAs based on sequence homology to known proteins (Fig. 1). These include an Ig-like fold (amino acids 12-135) and two proteoglycan tandem repeats (amino acids 136-236 and 237-337). All three elements are thought to act together as the HA binding region (for review, see (14)). Amino acid residues 338-601 represent a region unique to the 37c2 protein. This central domain is relatively rich in glutamic acid residues, which account for 16.5% of all amino acid residues in this region. The carboxyl-terminal portion harbors a single EGF-like repeat (amino acids 602-635), a lectin-like domain (amino acids 636-764), and a complement regulatory protein-like domain (amino acids 765-829). Two potential N-glycosylation sites are located at amino acid positions Asn-107 and Asn-314 (Fig. 2A). In total, 13 Ser-Gly or Gly-Ser dipeptides, potentially serving as GAG attachment sites, occur in the deduced amino acid sequence. Five of these dipeptides have at least one acidic amino acid in the vicinity on either side, which is thought to be essential for GAG attachment (13). Four of these five sites are located in the Glu-rich central domain where GAG attachment to members of the aggrecan/versican family of proteoglycans is thought to occur.
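The search for potential GAG attachment sites described above can be sketched in a few lines of Python. This is only an illustration, not the procedure used in this study: the function name `find_gag_candidates`, the three-residue `window` used to define "vicinity", and the toy sequence are all assumptions introduced here, and the real brevican sequence (Fig. 2A) is not reproduced.

```python
# Illustrative sketch: scan a protein sequence for Ser-Gly / Gly-Ser dipeptides
# and flag those with an acidic residue (Asp or Glu) nearby.
# ASSUMPTION: "vicinity" is modeled as `window` residues on either side;
# the text does not define an exact distance.

ACIDIC = set("DE")


def find_gag_candidates(seq, window=3):
    """Return (position, dipeptide, acidic_nearby) tuples.

    position is the 1-based index of the first residue of the dipeptide.
    """
    hits = []
    for i in range(len(seq) - 1):
        dipeptide = seq[i:i + 2]
        if dipeptide not in ("SG", "GS"):
            continue
        upstream = seq[max(0, i - window):i]
        downstream = seq[i + 2:i + 2 + window]
        acidic_nearby = any(res in ACIDIC for res in upstream + downstream)
        hits.append((i + 1, dipeptide, acidic_nearby))
    return hits


# Hypothetical toy sequence, NOT part of rat brevican.
toy_sequence = "MKEESGAAALLGSPPPKKSGDKK"
for position, dipeptide, acidic in find_gag_candidates(toy_sequence):
    print(position, dipeptide, "acidic neighbor" if acidic else "no acidic neighbor")
```

Widening or narrowing `window` changes which dipeptides are flagged, so any count obtained this way depends on that choice.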
Among the members of the aggrecan/versican family, bovine brevican (16) is most closely related to the protein encoded by the 37c2 cDNAs. The overall sequence identity of the two proteins is 82% with the highest degree of identity located in the HA binding domain (90%) and the carboxyl-terminal homology region comprising the EGF-repeat, the lectin-like domain, and the complement regulatory protein-like domain (84%). The central domain is least conserved, both in length (274 amino acids for the 37c2 protein, 298 amino acids for bovine brevican) and sequence identity (72%), although in both proteins this region is relatively rich in glutamic acid residues. The positions of nine out of 13 Ser-Gly or Gly-Ser dipeptides, the two potential N-glycosylation sites, and all cysteine residues that determine the structure of functional domains are conserved between the 37c2 protein and brevican (Fig. 2A). Thus, the two proteins are likely to be species homologs, and we refer to the new protein as rat brevican.
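For readers unfamiliar with how such identity values are obtained, the following Python sketch computes percent identity from a pairwise alignment. It assumes that identity is counted over aligned, non-gap columns; the text does not state which denominator was used, and the function name `percent_identity` and the toy alignment are hypothetical.

```python
# Illustrative sketch: percent identity between two pre-aligned sequences.
# ASSUMPTION: columns containing a gap in either sequence are excluded from
# the denominator; other conventions exist and give different values.


def percent_identity(aligned_a, aligned_b, gap="-"):
    """Return percent identity over ungapped alignment columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip gapped columns
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy alignment, NOT taken from the brevican sequences.
print(round(percent_identity("ACDE-FG", "ACDQWFG"), 1))
```

Applied separately to the alignment columns of each domain, the same function would yield per-domain values of the kind quoted above.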
The nucleotide sequence of one of the isolated cDNAs, clone 37c2/10, differs from that of the other cDNAs starting from nucleotide 1982 and contains 553 additional nucleotides (Figs. 1 and 2B). A nested reverse transcriptase PCR assay confirmed that a corresponding transcript exists in total brain RNA and that the cDNA was not artificially recombined during library construction (data not shown). As the nucleotide sequence around the point of sequence divergence (CAG/GTAATT, Fig. 2B) perfectly resembles a splice donor site (31), we wondered whether the clone simply represents an unspliced transcript. To test this hypothesis, Northern blot hybridization was performed using specific PCR amplification products derived from the unique region of each set of cDNAs. As shown in Fig. 3 (lanes 9 and 10), the hybridizing transcript (3.3 kb) is smaller than the major brevican transcript of 3.6 kb, indicating that the RNA represented by the 37c2/10 cDNA must be an alternatively processed mRNA species.
The
common protein sequence encoded by both transcripts ends exactly after
the central domain (Fig. 1). The specific portion of the 3.3-kb
transcript extends the reading frame by only 21 additional amino acids
beyond the central domain (Fig. 2B). The extreme
carboxyl terminus of the deduced protein is highly hydrophobic and
resembles the signal for the attachment of GPI anchors (32).
Indeed, digests with the enzyme PI-PLC confirmed that GPI-anchored
isoforms of brevican exist (see below). Comparison of the GPI anchor
signal with the sequences signaling the addition of GPI anchors to
other proteins suggests processing and GPI addition to brevican at
Ser-600 (Fig. 2B; (32)). Thus, the putative
mature GPI-anchored isoform consists of a protein moiety of 600 amino
acids with a calculated M_r of 64,405.
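A calculated molecular mass of this kind is obtained by summing the masses of the amino acid residues and adding one water molecule for the free termini. The Python sketch below illustrates the arithmetic with standard average residue masses. It is not the program used in this study; the function name `protein_mass` and the toy peptide are assumptions, and modifications such as glycosylation or the GPI anchor itself are ignored.

```python
# Illustrative sketch: average molecular mass of an unmodified polypeptide.
# Residue masses are standard average values in daltons (amino acid minus water).

RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # one H2O for the free amino and carboxyl termini


def protein_mass(seq):
    """Return the average mass (Da) of a linear, unmodified polypeptide."""
    if not seq:
        return 0.0
    try:
        residues = sum(RESIDUE_MASS[aa] for aa in seq.upper())
    except KeyError as err:
        raise ValueError(f"unknown residue: {err.args[0]}") from None
    return residues + WATER


# Hypothetical toy peptide, NOT part of rat brevican.
print(round(protein_mass("MKEESGAAALLGSPPPKKSGDKK"), 1))
```

As a plausibility check, 64,405 divided by 600 residues gives about 107 per residue, which lies within the range of the residue masses listed above.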
Antisera were produced against a bacterially expressed glutathione S-transferase-brevican fusion protein comprising amino acids 194-631 of rat brevican, which should recognize polypeptides encoded by both classes of transcripts (compare Fig. 1). Consistent with the Northern data, these antibodies detect brevican immunoreactivity on Western blots of brain protein preparations but not in protein extracts of liver or heart tissue (Fig. 4).
Figure 4: Tissue distribution of brevican immunoreactivity. Immunoblots containing 15 µg of protein/lane of the soluble protein fractions (lanes 1-4) or detergent extracts of crude membrane fractions (lanes 5-7) were developed using rabbit anti-brevican antiserum. Lane 1, 100,000 × g supernatant from brain homogenates; lanes 2-4, chondroitinase ABC-treated 100,000 × g supernatant from brain (lane 2), heart (lane 3), and liver (lane 4). Lanes 5-7, Triton X-100 extracts of untreated brain (lane 5), heart (lane 6), and liver (lane 7) membranes. Apparent molecular weights of major protein bands are indicated.
Additional brevican immunoreactivity is associated with the crude membrane fraction of brain extracts (Figs. 4 and 5). Detergent extracts of this fraction contain a diffuse high molecular weight smear and immunoreactive bands of 140, 125, and about 80 kDa (Fig. 4, lane 5). Again, the high molecular weight material is chondroitinase-sensitive (Fig. 5A, lanes 2 and 3).
Figure 5: Identification of GPI-anchored brevican isoforms in the rat brain. A, immunoblots contain chondroitinase ABC-treated soluble protein fraction (lane 1); untreated crude membrane fraction (lane 2); chondroitinase ABC-treated crude membrane fraction (lane 3); supernatant of chondroitinase ABC and PI-PLC-treated membranes (lane 4); pellet of chondroitinase ABC and PI-PLC-treated membranes (lane 5). Lanes 1-3 contain 15 µg of protein/lane; the protein contents of lanes 4 and 5 add up to 15 µg. B, immunoblots of brevican isoforms released by PI-PLC from chondroitinase ABC-treated crude membranes (lane 1) and synaptic membranes (lane 2). The synaptic membrane-enriched isoform is indicated by an asterisk. Immunoblots were developed using rabbit anti-brevican antibodies.
Chondroitinases ABC and AC (not shown) completely eliminate the high molecular weight material, whereas incubation with heparitinase III does not affect these high molecular weight forms, either in the soluble fraction or in membranes (data not shown), indicating that brevican bears principally chondroitin sulfate side chains.
The crude membrane fraction was digested with PI-PLC to prove the existence of GPI-anchored isoforms. As shown in Fig. 5A (lanes 4 and 5), most of the immunoreactivity is released from the membranes by this treatment. Digestion of the crude membrane fraction with GPI-specific phospholipase C produces similar results (not shown). Incubation of the membranes under identical conditions but without enzyme does not release brevican immunoreactivity into the supernatant. Thus, we conclude that GPI-anchored isoforms of brevican do occur. These isoforms are a 140-kDa protein that is similar in size to the major soluble form, a 125-kDa protein that does not occur in the soluble protein fraction, and an 80-kDa isoform that co-migrates with the upper half of the soluble 70-80-kDa material.
To clarify which GPI-anchored brevican isoforms are derived from the 3.3-kb transcript, human embryonic kidney 293 cells were transfected with the 3.3-kb cDNA. On Western blots of isolated crude membranes from untransfected 293 cells, no brevican immunoreactivity is detectable (Fig. 6, lane 1). Stable transfectants express high amounts of brevican. After chondroitinase treatment, three distinct isoforms with apparent molecular masses of 90, 125, and 140 kDa are observed (Fig. 6, lane 2). Crude membrane fractions of these transfectants were treated with PI-PLC to test whether GPI anchoring occurs. Whereas the 140-kDa isoform is not affected by the enzyme, the 90- and 125-kDa isoforms are at least in part released into the supernatant (Fig. 6, lanes 3 and 4). This demonstrates that correct glypiation of the 90- and the 125-kDa isoforms can occur in 293 cells.
Figure 6: Expression of GPI-anchored brevican isoforms in stably transfected HEK 293 cells. 15 µg/lane of crude membrane fractions from untransfected (lane 1) or stably transfected 293 cells (lane 2) were treated with chondroitinase ABC and immunoblotted. Lane 3 contains the chondroitinase ABC-treated PI-PLC-released supernatant, and lane 4 contains the corresponding pellet from 15 µg of crude membrane proteins of transfected cells.
The original rat brevican clone was isolated using antibodies against a synaptic protein preparation. Therefore, we examined the rat brain synaptic membrane fraction for the presence of brevican immunoreactivity. As shown in Fig. 5B, PI-PLC-releasable brevican isoforms are present in this fraction. In addition to the 140-, 125-, and 80-kDa isoforms, a 150-kDa isoform is highly enriched in this membrane preparation. Whether the 150-kDa protein is a synapse-specific brevican isoform or co-purifies with synaptic membranes for other reasons, e.g. because of its biochemical characteristics, remains to be resolved.
Figure 7: Developmental expression of soluble (A) and GPI-anchored (B) brevican isoforms. Immunoblots containing 15 µg of soluble protein/lane (A) or the PI-PLC-released supernatant from 15 µg of crude membrane proteins (B) were developed using rabbit anti-brevican antiserum. All samples were treated with chondroitinase ABC. The days of postnatal development are indicated above.
Figure 8: Localization of brevican immunoreactivity in the hippocampus (A and B) and the cerebellum (C) of 30-day-old rats. A distinct extracellular staining is observed around pyramidal cells (Py) of the hippocampal CA3 region (B). In the radial layer (r) and the mossy fiber layer (stratum lucidum, sl) immunoreactivity coats the dendrites of the pyramidal neurons (B). The cerebellar Purkinje cells (Pu) and their primary dendrites spreading out into the molecular layer (m) are intensely stained at their surfaces (arrowheads in C). The extracellular space in the granule cell layer (g) is filled with reaction product (C). The box in A indicates the hippocampal region detailed in B.
This study describes the cloning of the complete coding sequence for two isoforms of a rat member of the aggrecan/versican family of proteoglycan core proteins. One isoform has the features of ECM core proteins that are secreted into the extracellular space, the other one is likely to be attached to the cell membrane via a GPI anchor. Biochemical data demonstrate that both soluble and GPI-anchored isoforms of this proteoglycan exist. Immunohistochemical studies with antisera that recognize both types of isoforms revealed an extracellular localization throughout the brain, suggesting that the protein is indeed a component of the ECM.
The secreted isoform of the rat protein characterized here has 82% sequence identity with bovine brevican (16). Both proteins are of similar size. Therefore, we assume that the two proteins are orthologs. This assumption is supported by protein data; soluble bovine brevican occurs in two isoforms, a 145-kDa mature form and an 80-kDa amino-terminally truncated form (16). Antisera against rat brevican fusion proteins also recognize proteins of similar sizes in the soluble fraction of rat brain homogenates. Moreover, bovine brevican as well as the rat protein are brain-specific. Both proteins act as conditional proteoglycans, i.e. they can occur as proteoglycans and as free proteins.
The central domains of the two proteins have substantially diverged. The sequence identity is only 72% as compared with 90 and 84% for the HA binding domain and the carboxyl-terminal homology region, respectively. Attachment of chondroitin sulfate is thought to occur in the central domain. Rat brevican contains four typical GAG attachment sites as defined by (13). These are at positions Ser-391, Ser-524, Ser-528, and Ser-539. Bovine brevican contains three putative sites at positions homologous to Ser-391, Ser-528, and Ser-539 (16).
Another putative member of the aggrecan/versican family is BEHAB (18). BEHAB is more than 99% identical with the amino-terminal 363 amino acids of pre-brevican, indicating that both are most likely derived from the same gene and that BEHAB is a carboxyl-terminally truncated form of brevican. However, the existence of BEHAB as an actual protein has not yet been demonstrated. The distribution of BEHAB transcripts in the rat brain as revealed by in situ hybridization (18, 19) is in good agreement with the distribution of brevican immunoreactivity observed in this study.
Differential RNA processing
and the alternate use of exons have been described as mechanisms to
increase the molecular diversity of ECM components (e.g. (33, 34, 35)). Our data show that this
is also the case for brevican, where secreted and GPI-anchored isoforms
are synthesized from alternatively processed transcripts of the same
gene. A potential splice donor site located on the 3.3-kb brevican
transcript at the point of divergence from the 3.6-kb transcript
suggests that the glypiation signal may be encoded by a DNA segment
that is removed as an intron from transcripts for the secreted
brevican isoform. Indeed, the organization of the mouse brevican gene
implies that transcripts for the GPI-anchored isoform are synthesized
by "read through" into a 2-kb intron located between the
exons for the carboxyl terminus of the central domain and the EGF-like
domain.
Besides alternative processing of transcripts,
the heterogeneity among brevican isoforms may reflect differential
modification of primary translation products. After chondroitinase ABC
digestion of the soluble protein fraction, the two major bands of 140
and 70-80 kDa are likely to represent unprocessed and
amino-terminally processed brevican, respectively (16). In the
crude membrane fraction, three major PI-PLC-sensitive protein bands of
140, 125, and 80 kDa are detectable. Human embryonic kidney 293 cells
stably transfected with the cDNA of the 3.3-kb transcript express 140-,
125-, and 90-kDa core proteins with the latter two being sensitive to
treatment with PI-PLC. Thus, 293 cells are capable of correctly
inserting brevican isoforms into the plasma membrane via a GPI-anchor.
The 90-kDa isoform synthesized by transfected cells coincides with the
size of the in vitro translation product obtained from in
vitro transcripts of the 3.3-kb cDNA, suggesting that
it represents the primary glypiated translation product. Glypiation is
thought to be an early event in the cascade of post-translational
modifications that occurs in the endoplasmic reticulum (32). In
the rat brain, only minor amounts of the 90-kDa material are
detectable. The 125-kDa isoforms found in brain membranes and in
transfected cells are both released from membranes by PI-PLC. They may
be identical products. In contrast, the 140-kDa isoforms differ in
their PI-PLC sensitivity; the brain membrane isoform is
PI-PLC-releasable, and the isoform expressed in 293 cells is not. The
reason for this difference is unclear. In the brain, an additional
80-kDa isoform occurs, which is released from membranes by PI-PLC. This
may be a proteolytic degradation product of the larger forms, as the
relative amount of the 80-kDa material varies from preparation to
preparation. Taken together, these data show that multiple isoforms are
derived from the 3.3-kb transcript, although the modifications leading
to the differences in migration in SDS-polyacrylamide gel
electrophoresis are as yet unknown.
The brevican fusion construct used for antibody production contains, in addition to the relatively unique central domain, a part of the HA binding region and the EGF-like repeat, which are reasonably conserved among a variety of ECM proteins and cell surface receptors interacting with HA. Therefore, we cannot completely rule out that some of the immunoreactive bands, e.g. the weakly immunoreactive bands observed in both soluble and membrane fractions, represent other HA-binding proteins cross-reacting with the brevican antisera. However, based on the sizes of the recognized proteins, a cross-reaction with known members of the aggrecan/versican family (the identity of the HA-binding region ranges between 57 and 64% (16)) can be excluded.
Perhaps the most important finding of this study is the identification of isoforms of brevican that are anchored to the membrane via glypiation. This is the first example of GPI anchoring in the aggrecan/versican family. All members of this family of proteoglycan core proteins thus far reported to be expressed in the central nervous system, i.e. neurocan, versican, CAT-301, the S103L PG, and brevican, are secreted into the extracellular space and appear as soluble proteins upon cell fractionation (9). They are all chondroitin sulfate-bearing proteins. On the other hand, the two brain proteoglycan core proteins known to be GPI-anchored, glypican (6, 7) and cerebroglycan (8), also termed M12 and M13 (1), are heparan sulfate proteoglycans.
What could be the biological significance of a glypiated member of the aggrecan/versican family of brain proteoglycans? This mode of anchoring proteins to the cell membrane results in an increased lateral diffusibility along the membrane (36). It enables the protein to be released from the membrane by extracellular phospholipases in a controlled manner. Glypiated proteins have been discussed as markers for physiologically important membrane microdomains (37). In polarized cells, such as neurons, a precise targeting to specific plasma membrane domains caused by the GPI signal has been observed (38). In addition, a possible interaction with signal transducing tyrosine kinases was proposed based on immunoprecipitation assays (for review, see (35)). In essence, GPI anchorage makes proteins good candidates to mediate dynamic remodeling of neuronal membranes (39) receiving signals from the ECM and transducing them along and/or through the membrane. This postulated gain of function for the GPI-anchored as compared with secreted brevican isoforms is accompanied by a loss of possible other functions realized by the specific carboxyl-terminal domains of the extracellular brevican isoforms. This portion of the protein comprising the EGF-repeat, the lectin-like domain, and the complement regulatory protein-like domain is conserved among the other members of the aggrecan/versican family. Possible biological functions of this region may include the recognition of carbohydrate structures and the interaction with other proteins. As a working hypothesis, we can assume that the extracellular brevican isoform is an integral ECM component interacting both with HA via its amino terminus and with other ECM components via its carboxyl terminus, whereas the membrane-linked isoform could function as a chondroitin sulfate-bearing cell surface receptor targeting interaction with HA to specific membrane subdomains.
One such subdomain could be the synapse. Interestingly, the developmental appearance of the glypiated brevican isoforms parallels synaptogenesis in the rat central nervous system. Depending on the brain area, synapse formation starts around birth, culminates between postnatal days 15 and 25, and is basically completed after day P30 (40, 41, 42, 43). A possible role of particular brevican isoforms in synaptic development and function is also suggested by the fact that (i) the original cDNA clone was isolated using antisera against a rat brain synaptic protein preparation and (ii) a glypiated brevican isoform is enriched in the synaptic membrane fraction. Definitive proof of a synaptic localization of glypiated brevican will require antibodies specific for this particular isoform of brevican.
The nucleotide sequence(s) reported in this paper has been submitted to the GenBank(TM)/EMBL Data Bank with accession number(s) X79881.