(Received for publication, October 3, 1995; and in revised form, December 28, 1995)
Using autoantibodies from a Sjögren's
syndrome patient, we have previously identified a 230-kDa peripheral
membrane protein associated with the cytosolic face of the trans-Golgi (Kooy, J., Toh, B. H., Pettitt, J. M., Erlich, R.
and Gleeson, P. A.(1992) J. Biol. Chem. 267,
20255-20263). Here we report the molecular cloning and sequence
analysis of human p230 and the localization of its gene to chromosome
6p12-22. Partial cDNA clones, isolated from a HeLa cell cDNA
library using autoantibodies, were used to obtain additional cDNAs,
which together span 7695 base pairs (bp). The p230 mRNA is 7.7
kilobases. Two alternatively spliced mRNAs for p230 were detected.
These differed by 21- and 63-bp insertions in the 3′-sequence,
resulting in differences in amino acid sequence at the carboxyl
terminus. The predicted 261-kDa protein is highly hydrophilic with
17-20% homology with many proteins containing coiled-coil
domains. Apart from two proline-rich regions (amino acids 1-117
and 239-270), p230 contains a very high frequency of heptad
repeats, characteristic of α-helices that form dimeric coiled-coil
structures. p230 also includes the sequence ESLALEELEL (amino acids
538-546), a motif found in the granin family of acidic proteins
present in secretory granules of neuroendocrine cells. This is the
first report of a cytosolic Golgi protein containing a granin motif.
The structural characteristics of p230 indicate that it may play a role
in vesicular transport from the trans-Golgi.
The Golgi apparatus is a highly complex and dynamic organelle organized into three functionally distinct regions: the cis, medial, and trans cisternae of the Golgi stack and two tubulovesicular networks, namely the cis-Golgi network and the trans-Golgi network(1, 2) . Transport of newly synthesized proteins from the endoplasmic reticulum to Golgi cisternae, between adjacent cisternae, and from the cisternae to various destinations is mediated by vesicles shuttling between donor and recipient compartments(3) . Numerous structural and regulatory proteins have been implicated in the budding, docking, and fusion of vesicles(3, 4, 5) .
Soluble
proteins involved in budding of vesicles include COPI and COPII coat
proteins and the small GTP-binding protein ARF-1. An N-ethylmaleimide-sensitive fusion protein, soluble N-ethylmaleimide-sensitive fusion protein attachment proteins
(SNAPs), and Rabs are involved in either vesicle docking or
membrane fusion(3, 6, 7, 8) . SNAP
receptors (SNAREs), membrane proteins that form oligomeric complexes
with SNAPs and N-ethylmaleimide-sensitive fusion proteins, are
considered to promote fusion of vesicles with target membranes after
specific docking mediated by SNAREs on vesicle and target membranes (3) . However, many facets of the transport process remain
unresolved. For example, the protein coat structures that mediate
forward vesicle transport from the Golgi apparatus have not been fully
characterized.
Additional peripheral membrane Golgi proteins have also been implicated in vesicular transport, for example three high molecular weight proteins in cis to medial Golgi transport(9) . One of these, p115, contains coiled-coil domains and is related to Uso1p required for transport from endoplasmic reticulum to Golgi complex in Saccharomyces cerevisiae(10) . Two peripheral membrane proteins have been implicated in budding of vesicles from the trans-Golgi network, namely p200, which associates with coated vesicles arising from the trans-Golgi network(11) , and p62, which forms a complex with TGN38/41 and Rab6(12) .
Human
autoantibodies are valuable reagents for identification of novel
intracellular proteins. Using anti-Golgi autoantibodies from a patient
with Sjögren's syndrome, we have previously
identified a brefeldin A-sensitive peripheral membrane protein of 230
kDa (p230) localized to the cytosolic face of trans-Golgi(13) . Other novel Golgi proteins have also
been identified using autoantibodies, including golgins-95 and
-160(14) , a protein of 370 kDa(15, 16) ,
which appears to be identical to giantin(17) , and a cis-Golgi network p210 protein(18) . For those Golgi
proteins where sequences are known, a common structural feature is a
high content of predicted coiled-coil domains. Here we have cloned and
sequenced p230. The predicted protein consists of α-helical
coiled-coil domains with abundant heptad repeats and contains a granin
motif shared with proteins found in secretory vesicles. We propose that
p230 has a role in membrane transport of proteins from the Golgi
apparatus.
Figure 1: Intracellular localization of autoantigens by indirect immunofluorescence. Human Hep2 cells were stained by indirect immunofluorescence with autoimmune serum 1 (a) and autoimmune serum 2 (b) or with autoantibodies from serum 1 eluted from clone g5 (c) or from an irrelevant λgt11 clone (d). Magnifications: a, ×1000; b, ×400; c, ×800; d, ×500.
Figure 2: Immunoblot analysis of cell extracts and bacterial fusion protein with autoimmune serum 2. HeLa cell proteins (A) and total proteins from isopropyl-1-thio-β-D-galactopyranoside-treated E. coli DH1 cells transformed with recombinant pGEX-clone g5 autoantigen (B) were separated under reducing conditions on a 5 or 7.5% polyacrylamide gel, respectively, and transferred to nitrocellulose membranes. Membranes were incubated with autoimmune serum 2 (AIS) or normal human serum (NHS) followed by peroxidase-conjugated anti-human immunoglobulin. Bound immunoglobulin was detected by enhanced chemiluminescence.
Figure 3: Map of p230 cDNA clones. Clones g2, g5, g7, and g12 were isolated from a λgt11 HeLa cell cDNA library by immunoscreening with autoimmune serum 1. A λZAP hepatoma cDNA library was screened with g5, yielding clones z2 and z6; screening the same library with z6 resulted in clones z7a and z8a, which also showed positive hybridization with g5. The λZAP hepatoma cDNA library was also screened with a 300-bp fragment of g7, resulting in z7, z8, and z16, while a 45-bp segment of z6 identified z9. Finally, a PCR-generated fragment of the 5′ region of z9 was used to screen a pUEX HeLa cell cDNA plasmid library, identifying clone px1.
The four clones (g2, g5, g7, and g12) showed an overlapping nucleotide sequence spanning approximately 2.0 kb (Fig. 3). To obtain additional sequence, we rescreened a λZAP cDNA hepatoma library using clone g5 as probe. Two further clones, designated z2 and z6, were isolated (Fig. 3). Further screening of the same library with z6 resulted in clones z7a and z8a. These latter clones also gave positive hybridization signals with clone g5, and DNA sequencing confirmed their overlapping sequence. Screening the λZAP hepatoma library with a 300-bp HincII fragment of g7 identified clones z7, z8, and z16, and screening with a 45-bp segment of z6 identified z9 (Fig. 3). The 5′-sequence was obtained from clone px1, isolated by screening a randomly primed HeLa cell plasmid cDNA library with a PCR-generated 240-bp fragment of the 5′-region of z9. Together, all of the clones (Fig. 3) comprise 7.7 kb of cDNA, as determined by nucleotide sequencing. Three of the cDNA clones, namely px1, z9, and z16, collectively span the entire 7.7-kb cDNA. Northern analysis showed that the three clones, px1, z9, and z16, all hybridized with a similarly sized transcript of about 7.7 kb from HeLa cell poly(A)+ RNA (Fig. 4a).
To confirm that the overlapping clones shown in Fig. 3 were derived from the same transcript, reverse transcriptase PCR was carried out using total RNA isolated from HeLa cells and oligonucleotide primers as indicated in Fig. 4b. A 3.2-kb product was obtained using primers P1 and P2, and a 4.7-kb product was generated using primers P3 and P4; the sizes of these PCR products were in accordance with those expected from the nucleotide sequence of the clones. The identity of the PCR products was confirmed by Southern blot analysis using internal probes. The P1/P2 product was probed with a 1.3-kb fragment from clone px1 and the P3/P4 product with the above-mentioned 45-mer (Fig. 4b). Taken together, our data imply that we have isolated a full-length cDNA encoding p230.
Figure 4: A, Northern blot analysis. HeLa cell poly(A)+ RNA was size fractionated by formaldehyde gel electrophoresis, transferred to Hybond N+ membranes, and hybridized with ³²P-labeled cDNA from clone px1. After washing the membrane and visualizing the signal by autoradiography, the ³²P-labeled DNA was stripped from the membrane, as described under "Experimental Procedures," and the membrane was reprobed with ³²P-labeled DNA from clone z9. A separate membrane was probed with ³²P-labeled DNA from clone z16. B, reverse transcriptase PCR and Southern blot analysis. The three clones, px1, z9, and z16, which span the full length of the p230 cDNA, overlap as indicated. To determine if the three clones are derived from the same mRNA, total RNA from HeLa cells was reverse transcribed, and the cDNA was amplified using oligonucleotide primers P1 and P2 or P3 and P4. The expected sizes of the products are indicated. Incubations were also carried out in the absence of either RNA or reverse transcriptase. PCR products were analyzed by Southern blotting using independent internal cDNA probes, as described under "Experimental Procedures." The generation of PCR products of the expected size confirms the relationship of these overlapping clones.
The clones shown in Fig. 3 gave identical sequences in the overlap regions. The nucleotide sequence reported here was verified by sequencing of both strands of a cDNA clone, by identical sequences from at least one other independent clone, or by sequencing of reverse transcriptase PCR products. However, two small differences were detected in the three 3′ clones. These differences comprised a 21-bp stretch (nucleotides 6592-6612) present in clones z7 and z8 but absent in z16, and a 63-bp stretch (nucleotides 6950-7012) present in z7 but absent from clones z16 and z8 (Fig. 5). The absence of the 21 nucleotides does not disrupt the open reading frame but results in the loss of the amino acids VTIMELQ, while the absence of the 63 nucleotides results in insertion of an alternative stop codon and the amino acid sequence of p230 ending in SWLRSSS rather than FTSPRSGIF. Reverse transcriptase PCR using oligonucleotide primers P4 and P5 resulted in two products of 1.5 and 1.6 kb, sizes consistent with the 63- and 21-bp insertion and deletion (Fig. 6). There is a unique HincII site in the 63-bp insertion, and as expected, the 1.6-kb PCR product was susceptible to HincII digestion, giving the expected fragments of 0.98 and 0.64 kb. These results indicate the presence of alternatively spliced mRNAs derived from the same gene. Whether these different mRNAs represent regulated splicing events or products from inaccurate splicing is not known.
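The reading-frame and restriction-digest statements above can be checked with simple arithmetic. The following is a minimal sketch that uses only the coordinates and sizes quoted in the text; the names `SEGMENTS` and `segment_length` are illustrative and not part of the original analysis.

```python
# Sanity checks on the two optional 3' segments, using only figures from the text.

SEGMENTS = {
    "21-bp": (6592, 6612),  # present in z7 and z8, absent from z16
    "63-bp": (6950, 7012),  # present in z7 only
}


def segment_length(start: int, end: int) -> int:
    """Length of an inclusive nucleotide range."""
    return end - start + 1


for name, (start, end) in SEGMENTS.items():
    length = segment_length(start, end)
    codons, remainder = divmod(length, 3)
    # A remainder of 0 means the segment is a whole number of codons,
    # so its presence or absence cannot shift the downstream reading frame.
    print(f"{name}: {length} nt = {codons} codons, remainder {remainder}")

# The 21-bp segment should encode exactly the peptide lost when it is absent.
lost_peptide = "VTIMELQ"
assert len(lost_peptide) * 3 == segment_length(*SEGMENTS["21-bp"])

# HincII digest of the longer RT-PCR product: fragment sizes in kb, as reported.
hincii_fragments_kb = (0.98, 0.64)
digest_total_kb = round(sum(hincii_fragments_kb), 2)
print(f"HincII fragments sum to {digest_total_kb} kb (reported product: about 1.6 kb)")
```

Both segments are multiples of three nucleotides, which is why neither can introduce a frameshift by itself; the change at the carboxyl terminus arises from the stop codon carried within the 63-bp segment.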
Figure 5: Identification of two p230 mRNA species. A, nucleotides 6592-6612 are present in clones z7 and z8, but absent from clone z16. Nucleotides 6950-7012 are present in clone z7, but absent from clones z16 and z8. The sequence from 6950-7012 contains an in-frame TGA stop codon and a HincII restriction site. P5 (18-mer) and P4 (24-mer) are oligonucleotides used for reverse transcriptase PCR to assess the presence of more than one p230 transcript. The expected sizes of the PCR products are indicated. B, total RNA from HeLa cells was reverse transcribed using oligo(dT), and the resulting cDNA was amplified using primers P4 and P5. The PCR product was divided into two, with one-half digested with HincII. Samples were separated by agarose gel electrophoresis and visualized by ethidium bromide staining. Incubations were also carried out in the absence of either RNA or reverse transcriptase, as indicated.
Figure 6: cDNA and predicted amino acid sequence of p230. Nucleotide sequence of p230 is shown, together with the predicted sequence of the encoded polypeptide. Proline-rich regions of the predicted polypeptide (see text; domains 1 and 3) are shaded and the granin signature is boxed.
The open reading frame encodes a putative polypeptide of 2230 amino acids with a predicted molecular weight of 261,126 Da and a pI of 5.16. The amino acid composition of this putative polypeptide is rich in glutamic acid (16.6%), lysine (12.7%), leucine (12.5%), and glutamine (10.1%) residues. Comparison with the GenBank(TM) data base reveals regions of p230 that are identical or nearly identical to 15 previously isolated partial cDNA clones of no known function. A list of these loci together with the corresponding regions of identity in p230 is given in Table 1.
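As a rough consistency check on these figures, the sketch below derives a few quantities from the numbers quoted in the text. It assumes a single stop codon and treats the 7695-bp combined cDNA length from the abstract as the transcript length, so the non-coding estimate is approximate; all variable names are illustrative.

```python
# Figures taken from the text; everything derived from them is simple arithmetic.
RESIDUES = 2230              # length of the predicted polypeptide
MOLECULAR_MASS_DA = 261_126  # predicted molecular weight
CDNA_LENGTH_BP = 7695        # combined length of the overlapping cDNA clones

# Coding sequence: one codon per residue plus one stop codon (assumption).
coding_nt = RESIDUES * 3 + 3
noncoding_nt = CDNA_LENGTH_BP - coding_nt

# Average mass contributed by each residue.
mean_residue_mass = MOLECULAR_MASS_DA / RESIDUES

# Share of the protein accounted for by the four most abundant residues.
composition_percent = {"Glu": 16.6, "Lys": 12.7, "Leu": 12.5, "Gln": 10.1}
top_four_total = round(sum(composition_percent.values()), 1)

print(f"coding: {coding_nt} nt, non-coding (approx.): {noncoding_nt} nt")
print(f"mean residue mass: {mean_residue_mass:.1f} Da")
print(f"Glu + Lys + Leu + Gln: {top_four_total}% of residues")
```

Four residue types account for just over half of the polypeptide, which fits the description of a highly charged, leucine-rich protein.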
Comparison of the translated amino acid sequence of p230 with the translated GenBank(TM) data base reveals significant homology (17-27% identity) with many proteins known or predicted to contain coiled-coil domains. These include various conventional and nonconventional myosins, tropomyosins, cytokeratins, vimentin, neurofilaments, laminin, hemolytic streptococcal M proteins, dystrophin, the Golgi proteins golgin-95, golgin-160, and giantin, the Uso1 protein of S. cerevisiae implicated in endoplasmic reticulum to Golgi transport and its mammalian homologue p115 or TAP, and the early endosome-associated protein EEA-1(35) . In addition, this homology was shared with other structural and motor proteins of the cytoskeleton including human cytoplasmic linker protein-170, kinetochore protein CENP-E, lamins, dynein, and kinesins. In general, this pattern of identity was scattered throughout the length of p230, although similar comparisons with translated GenBank(TM) sequences using a series of overlapping segments of p230 revealed that such homology was less pronounced in the amino-terminal 130 amino acids of p230.
There are two proline-rich domains at the amino terminus of the putative polypeptide; amino acids 1-117 contain 6.8% proline residues, while the segment from amino acids 239-270 contains 18.8% proline residues (Fig. 6), suggesting a compact structure for these domains. The remainder of the protein contains few proline residues, most of which are clustered at the extreme carboxyl terminus. Thus, p230 can be divided into four putative domains on the basis of its proline distribution, i.e. domain 1 (amino acids 1-117), domain 2 (amino acids 118-238), domain 3 (amino acids 239-270), and domain 4 (amino acids 271-2230).
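The domain boundaries and proline percentages quoted above can be cross-checked numerically. This is a minimal sketch using only the coordinates given in the text; the proline counts are back-calculated from the percentages and are therefore an inference, not reported values.

```python
# Domain boundaries (inclusive amino acid positions) as defined in the text.
DOMAINS = {1: (1, 117), 2: (118, 238), 3: (239, 270), 4: (271, 2230)}
# Proline content reported for the two proline-rich domains.
PROLINE_PERCENT = {1: 6.8, 3: 18.8}
TOTAL_RESIDUES = 2230

lengths = {d: end - start + 1 for d, (start, end) in DOMAINS.items()}

# The four domains should tile the whole polypeptide without gaps or overlaps.
assert sum(lengths.values()) == TOTAL_RESIDUES

# Number of prolines implied by each reported percentage (rounded).
implied_prolines = {
    d: round(lengths[d] * pct / 100) for d, pct in PROLINE_PERCENT.items()
}

# Fraction of the protein lying in the proline-poor domains 2 and 4.
proline_poor_fraction = (lengths[2] + lengths[4]) / TOTAL_RESIDUES

print(lengths)
print(implied_prolines)
print(f"domains 2 + 4: {proline_poor_fraction:.1%} of the protein")
```

Under these assumptions the percentages correspond to about 8 prolines in domain 1 and 6 in domain 3, and the two proline-poor domains together make up roughly 93% of the sequence.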
These analyses are consistent with the two proline-rich
domains having a high probability of globular structures and an
elongated structure for the intervening domains 2 and 4(36) .
These features are supported by results of secondary structure
predictions (Fig. 7), which suggest a predominantly α-helical
structure for p230 with the exception of the proline-rich
domains 1 and 3. Analyses of hydrophilicity (Fig. 7) suggest a
predominantly hydrophilic structure, with no evidence for a hydrophobic
transmembrane domain, consistent with biochemical data reported by Kooy et al.(13) . Charge plots (not shown) show no evidence
for discrete acidic or basic domains.
Figure 7: Secondary structure predictions for p230. Top panel, hydrophilicity profile of p230 using Kyte-Doolittle hydropathy scale with a window of 7(49) . Positive values denote hydrophilic regions that may be exposed on the outside of p230. Bottom panel, a summary of secondary structure predictions using the methods of Chou-Fasman (50, 51, 52, 53) (CF, light shaded bars) and Robson-Garnier(54, 55) (RG, dark shaded bars) with a hydrophilicity window size of 11 is shown, together with a composite where both Chou-Fasman and Robson-Garnier predictions are in agreement (CfRg, black shaded bars). The presence of a bar indicates regions of the protein predicted to form α-helices, β-pleated sheet, or reverse turns by each of the methods used. These plots were generated using the MacVector program (International Biotechnologies Inc.).
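For readers unfamiliar with sliding-window hydropathy analysis, the following is a minimal sketch of a Kyte-Doolittle profile with a window of 7. It is not the MacVector implementation used for the figure; the function names are illustrative, and the sign inversion is an assumption made so that positive values denote hydrophilic regions, as in the figure legend. The short test sequence is the granin-like motif of p230 quoted in the text.

```python
# Kyte-Doolittle hydropathy index (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence: str, window: int = 7) -> list[float]:
    """Mean Kyte-Doolittle hydropathy for each full window along the sequence."""
    sequence = sequence.upper()
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[aa] for aa in sequence]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def hydrophilicity_profile(sequence: str, window: int = 7) -> list[float]:
    """Sign-inverted profile, so that positive values denote hydrophilic windows."""
    return [-value for value in hydropathy_profile(sequence, window)]


if __name__ == "__main__":
    motif = "ESLALEELEL"  # granin-like motif of p230, from the text
    for start, value in enumerate(hydrophilicity_profile(motif), start=1):
        print(f"window starting at residue {start}: {value:+.2f}")
```

A transmembrane segment would appear as a run of roughly 20 residues with strongly negative hydrophilicity; the text reports that no such segment is evident in p230.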
The above features raised the possibility that p230 adopts a coiled-coil structure, stabilized by heptad repeats. A search for these structures was performed using the method of Lupas et al.(26) . The results of this analysis (Fig. 8a) reveal an extraordinarily high level of heptad repeats in domains 2 and 4, which predict a coiled-coil structure with a high degree of confidence. Detailed sequence analysis of the longest of these regions is shown in Fig. 8b, which shows a run of 31 heptad repeats extending over 245 amino acids with four heptad frameshifts. In common with various fibrous coiled-coil proteins(37, 38) , this region shows a high frequency of apolar residues in positions a (75%) and d (52%) of the heptads and the absence of acidic residues in these positions. The preference for leucine over isoleucine at the d position suggests that the protein has a dimeric rather than a trimeric quaternary structure(39, 40) . This region also shares with the fibrous proteins a high frequency (54.9%) of charged residues in the outer positions (b, c, e, f, and g), with only 14.3% of residues in these positions being apolar. However, in contrast to the fibrous proteins, this region has a relatively high number of lysine and arginine residues in the d position, which, together with the concomitant reduction in apolar residues at this position and the presence of other discontinuities including frameshifts and "stutter residues," would be expected to confer marginal stability on the coiled-coil(36) .
Figure 8: A, prediction of coiled-coil segments of p230. The lower panel is a histogram of the probability of forming a coiled-coil structure according to the method of Lupas et al.(26) . Values of P > 0.9 are significant. Bars above the histogram indicate which of the possible seven "frames" the heptad repeats follow for each region of the protein with P > 0.5 for formation of a coiled-coil structure. The presence of numerous "frameshifts" provides evidence of discontinuities in the coiled-coil structure (see text). B, primary sequence of the longest uninterrupted region of heptad repeats from p230. The amino acid sequence of p230 from position 1460 to 1704 is plotted to show the position of each residue within a heptad repeat of the form abcdefg(38) . Apolar residues at positions a and d are shown in boldface.
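The per-position frequencies discussed above (for example, the proportion of apolar residues at positions a and d) can be tallied once a heptad register has been assigned. The sketch below illustrates the bookkeeping only; it is not the method of Lupas et al. and does not detect registers or frameshifts. The `APOLAR` residue set and the toy peptide are assumptions for illustration, not p230 sequence.

```python
HEPTAD = "abcdefg"
# Residues treated as apolar here (an assumption; definitions vary between studies).
APOLAR = set("AVLIMFW")


def heptad_positions(sequence: str, first_position: str = "a") -> list[tuple[str, str]]:
    """Pair each residue with its heptad position, assuming one unbroken register."""
    offset = HEPTAD.index(first_position)
    return [
        (aa, HEPTAD[(i + offset) % 7])
        for i, aa in enumerate(sequence.upper())
    ]


def apolar_fraction(sequence: str, position: str, first_position: str = "a") -> float:
    """Fraction of residues at the given heptad position that are apolar."""
    residues = [
        aa for aa, pos in heptad_positions(sequence, first_position)
        if pos == position
    ]
    if not residues:
        raise ValueError(f"no residues at position {position!r}")
    return sum(aa in APOLAR for aa in residues) / len(residues)


if __name__ == "__main__":
    # Hypothetical four-heptad peptide: the last heptad carries a lysine at
    # position d, mimicking the kind of substitution described in the text.
    toy_peptide = "LEELKKK" * 3 + "IEEKKKK"
    for position in ("a", "d"):
        print(position, apolar_fraction(toy_peptide, position))
```

In a real analysis each frameshift would start a new register, so the sequence would be split into segments and each segment tallied with its own `first_position`.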
p230 has multiple consensus motifs for tyrosine phosphorylation,
protein kinase C phosphorylation, casein kinase II phosphorylation,
cAMP/cGMP phosphorylation, N-myristoylation, and N-glycosylation. It is not known at present how many of these
motifs are utilized; however, p230 appears to be devoid of N-glycans(13) , consistent with location of the
protein on the cytoplasmic face of Golgi membranes. p230 also contains
between amino acids 538 and 546 the sequence ESLALEELEL, a motif found
in otherwise diverse members of the granin (chromogranin/secretogranin)
family of acidic proteins found in the secretory granules of
neuroendocrine cells. A search of the translated GenBank(TM) data
base revealed that in addition to chromogranins, this region and its
immediate flanking sequence showed homology with a number of proteins
involved in subcellular compartmentalization and motor functions,
including flagellin, calnexin, the Golgi protein giantin,
8-tubulin, neurofilament L, caldesmon, chromokinesin, and with the
human microtubule-associated protein E-MAP-115. No known microtubule
binding motifs were found in p230.
Figure 9: Chromosomal localization of human p230 gene. A, diagram showing the grain distribution in 70 metaphase spreads on one slide following hybridization with ³H-labeled DNA from clone g5; B, an enlargement of grains recorded on chromosome 6 together with photographs showing silver grains on G-banded chromosome 6.
We conclude that the cDNA clones we have isolated encode the full-length p230 Golgi protein for the following reasons. First, autoantibodies affinity-purified from clones g2, g5, g7, and g12 gave immunofluorescence staining of the Golgi apparatus of Hep2 cells. Second, rabbit antibodies raised against a bacterial fusion protein incorporating clone g5 not only stained the Golgi apparatus by immunofluorescence and immunoelectron microscopy, but also immunoblotted and immunoprecipitated a 230-kDa protein from HeLa cells (13) . Third, cDNAs of clones px1, z9, and z16 together span 7.7 kb, in agreement with the size of the mRNA obtained from Northern blots. Fourth, reverse transcriptase PCR demonstrated that these three clones were derived from the same transcript.
The deduced amino acid sequence of p230 suggests a hydrophilic, modestly acidic protein capable of forming a dimeric coiled coil structure. While most (>90%) of the protein is predicted to form this structure, the extreme amino-terminal 130 amino acids and a segment between amino acids 239 and 270 are predicted to form compact structures consistent with globular regions. While it is well established that stable static coiled coils can serve as multimerization motifs in structural proteins, dynamic coiled-coil formation can play a central role in generation of conformational changes resulting in dramatic movements of one part of a protein relative to another. Such coiled-coil regions have been implicated in the function of the nonclaret disjunctional kinesin-related microtubule motor protein, which translocates on microtubules toward their minus ends and is required for proper chromosome segregation in Drosophila oocytes(41) . This protein has a central stalk region consisting of heptad repeats predicted to form coiled-coils. A mutant that lacks the amino-terminal third of the coiled-coil stalk exhibits partial loss of function, having a translocation velocity and torque generation similar to wild-type protein, but only partially rescues a null mutant for chromosome missegregation. A similar effect has been reported for a cytoplasmic myosin II protein partially deleted for its coiled-coil tail(42) . In this case, the mutant protein is expressed at a level comparable with the wild-type protein and translocates on actin filaments in vitro with the same velocity as wild-type protein, but in spite of this, exhibits frequent failure of cytokinesis in vivo. 
Of particular interest is the loss of function reported for partial deletions of the coiled-coil domain of the yeast Uso1 protein(43) , a protein involved in vesicular transport, where such mutations are temperature-sensitive lethal, resulting in a severe defect of endoplasmic reticulum to Golgi protein transport at the nonpermissive temperature. It has been suggested that a key structural feature of coiled-coils that participate in conformational changes is the presence of regions of marginal stability caused by discontinuities in heptad repeats of the coiled coil as a result of deletions, insertions, or out-of-frame residues(44) . The presence of extensive regions of this type in p230 raises the possibility of dynamic coiled-coil formation, which could play a role in regulation of multimerization or in induction of conformational changes in the protein. Dramatic conformational changes have been found to occur in other coiled-coil proteins involved in membrane fusion events (for review, see (44) ).
p230 also has a short region of homology to the conserved carboxyl-terminal domain of the granin (chromogranin/secretogranin) family of proteins, a diverse group of acidic proteins present in secretory granules of endocrine and neuroendocrine cells (for review, see (45) ). Granins are suggested to be precursors of several peptide hormones and to regulate proteolytic processing and selective aggregation of secretory proteins in the trans-Golgi network of neuroendocrine cells(46) . Only one short region at the carboxyl terminus is shared among the granins. This region bears the consensus sequence E(N/S)LX(A/D)X(D/E)XEL, which is closely related to a sequence found in p230 (Fig. 6). A short region of the carboxyl terminus of chromogranin A, containing the granin motif, may be responsible for pH-regulated multimerization of the protein (47) and, together with the pH-dependent association of chromogranin A with integral membrane proteins of the secretory vesicle, suggests a role for chromogranin A in the sorting of these membrane proteins during vesicle biogenesis in the trans-Golgi network(48) . However, the relevance of this motif in p230 is unclear since p230 is orientated on the cytosolic side of Golgi membranes.
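The relationship between the granin consensus and the p230 sequence can be made explicit by comparing them position by position. The following is a minimal sketch; the list encoding of the consensus and the helper names are illustrative, with `None` standing for the unconstrained position X.

```python
import re

# Granin consensus E(N/S)LX(A/D)X(D/E)XEL; None marks an unconstrained position (X).
GRANIN_CONSENSUS = ["E", "NS", "L", None, "AD", None, "DE", None, "E", "L"]
P230_MOTIF = "ESLALEELEL"  # p230 sequence quoted in the text


def consensus_regex(consensus: list) -> re.Pattern:
    """Compile the consensus into a regular expression for strict matching."""
    parts = ["." if allowed is None else f"[{allowed}]" for allowed in consensus]
    return re.compile("".join(parts))


def mismatched_positions(sequence: str, consensus: list) -> list[int]:
    """1-based positions where the sequence departs from a constrained position."""
    if len(sequence) != len(consensus):
        raise ValueError("sequence and consensus must have the same length")
    return [
        index
        for index, (aa, allowed) in enumerate(zip(sequence, consensus), start=1)
        if allowed is not None and aa not in allowed
    ]


if __name__ == "__main__":
    strict = consensus_regex(GRANIN_CONSENSUS)
    print(strict.pattern)
    print(bool(strict.fullmatch(P230_MOTIF)))                  # False: not a strict match
    print(mismatched_positions(P230_MOTIF, GRANIN_CONSENSUS))  # [5]
```

The p230 sequence satisfies six of the seven constrained positions and differs only at position 5, where leucine replaces alanine or aspartate; this is the sense in which it is closely related to, rather than identical with, the consensus.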
There is an increasing number of proteins implicated in
vesicular transport that have extensive coiled-coil domains. These
domains have potential for dynamic interactions associated with this
highly complex process. The potential dynamic coiled-coil structure of
p230, its localization to the trans-Golgi network, and sensitivity to brefeldin A (13) suggest that
p230 may have a key role in vesicular transport from this distal Golgi
compartment.
The nucleotide sequence reported in this paper has been submitted to the GenBank(TM)/EMBL Data Bank with accession number U41740.