Multiple Isoforms of Heparan Sulfate D-Glucosaminyl 3-O-Sulfotransferase
ISOLATION, CHARACTERIZATION, AND EXPRESSION OF HUMAN cDNAs AND IDENTIFICATION OF DISTINCT GENOMIC LOCI*

Nicholas W. Shworak‡§, Jian Liu‡∥, Lorin M. Petros‡, Lijuan Zhang‡**, Masashi Kobayashi‡, Neal G. Copeland‡‡, Nancy A. Jenkins‡‡, and Robert D. Rosenberg‡§

From the ‡Department of Biology, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, the §Department of Medicine, Harvard Medical School, Beth Israel Deaconess Medical Center, Boston, Massachusetts 02215, and the ‡‡Mammalian Genetics Laboratory, ABL-Basic Research Program, National Cancer Institute, Frederick Cancer Research and Development Center, Frederick, Maryland 21702

    ABSTRACT

3-O-Sulfated glucosaminyl residues are rare constituents of heparan sulfate and are essential for the activity of anticoagulant heparan sulfate. Cellular production of the critical active structure is controlled by the rate-limiting enzyme, heparan sulfate D-glucosaminyl 3-O-sulfotransferase-1 (3-OST-1) (EC 2.8.2.23). We have probed the expressed sequence tag data base with the carboxyl-terminal sulfotransferase domain of 3-OST-1 to reveal three novel, incomplete human cDNAs. These were utilized in library screens to isolate full-length cDNAs. Clones corresponding to predominant transcripts were obtained for the 367-, 406-, and 390-amino acid enzymes 3-OST-2, 3-OST-3A, and 3-OST-3B, respectively. These type II integral membrane proteins are composed of a divergent amino-terminal region and a highly homologous carboxyl-terminal sulfotransferase domain of ~260 residues. Also recovered were partial length clones for 3-OST-4. Expression of the full-length enzymes confirms the 3-O-sulfation of specific glucosaminyl residues within heparan sulfate (Liu, J., Shworak, N. W., Sinay, P., Schwartz, J. J., Zhang, L., Fritze, L. M. S., and Rosenberg, R. D. (1999) J. Biol. Chem. 274, 5185-5192). Southern analyses suggest the human 3OST1, 3OST2, and 3OST4 genes, and the corresponding mouse isologs, are single copy. However, 3OST3A and 3OST3B genes are each duplicated in humans and show at least one copy each in mice. Intriguingly, the entire sulfotransferase domain sequence of the 3-OST-3B cDNA (774 base pairs) was 99.2% identical to the same region of 3-OST-3A. Together, these data argue that the structure of this functionally important region is actively maintained by gene conversion between 3OST3A and 3OST3B loci. Interspecific mouse back-cross analysis identified the loci for mouse 3Ost genes and syntenic assignments of corresponding human isologs were confirmed by the identification of mapped sequence-tagged site markers.
Northern blot analyses indicate brain-exclusive and brain-predominant expression of 3-OST-4 and 3-OST-2 transcripts, respectively, whereas 3-OST-3A and 3-OST-3B isoforms show widespread expression of multiple transcripts. The reiteration and conservation of the 3-OST sulfotransferase domain suggest that this structure is a self-contained functional unit. Moreover, the extensive number of 3OST genes with diverse expression patterns of multiple transcripts suggests that the novel 3-OST enzymes, like 3-OST-1, regulate important biologic properties of heparan sulfate proteoglycans.

    INTRODUCTION

Heparan sulfate proteoglycans are hybrid molecules composed of a protein core to which is attached one or more linear glycosaminoglycan chains of the heparan sulfate variety. Extreme structural diversity of the heparan sulfate side chains enables interactions with a broad array of protein effector molecules that modulate a wide range of biologic processes. The specificity of any given heparan sulfate-protein interaction is largely dictated by placement of sulfate groups along the chain's length. Thus the order and ring position of sulfate substituents create distinct oligosaccharide sequences (fine structure) and define corresponding biologic activities (reviewed in Refs. 1-3).

The profound functional diversity of heparan sulfate proteoglycans necessitates a mechanism that can generate and independently regulate the production of a myriad of fine structures. Such control is predominantly imposed in a cell type-specific fashion by varying the functional status of the Golgi apparatus, with the core proteins potentially contributing only a minor degree of influence (4, 5). Thus, heparan sulfate biosynthetic enzymes are implicated as key components in generating regions of defined monosaccharide sequence. The production of the antithrombin-binding site by the enzyme heparan sulfate D-glucosaminyl 3-O-sulfotransferase-1 (3-OST-1)¹ (EC 2.8.2.23) reveals a mechanism for the independent biosynthesis of a specific heparan sulfate sequence that regulates an important biologic activity.

Antithrombin is a natural anticoagulant that neutralizes serine proteases of the intrinsic blood coagulation cascade through the formation of a 1:1 enzyme-antithrombin covalent complex. The rate of complex formation is dramatically enhanced via interactions with glycosaminoglycans containing the antithrombin-binding site; i.e. pharmaceutical heparin and anticoagulant heparan sulfate. The latter is generated by endothelial cells, which line the blood vessel wall. The importance of the anticoagulant heparan-antithrombin interaction is evidenced by the arterial thrombotic events that occur in patients producing antithrombin variants defective in anticoagulant heparan binding (reviewed in Ref. 6). Given this critical role, it is not surprising that the cellular production of anticoagulant heparan is regulated independently of the general bulk of heparan sulfate (5, 7).

Antithrombin specifically recognizes the structure: -Glc(NS or Ac)6S-GlcA-GlcNS3S±6S-IdoA2S-GlcNS6S-,² which triggers a conformational change that results in the accelerated neutralization of specific coagulation proteases (reviewed in Refs. 3 and 8). The central 3-O-sulfate group is absolutely essential for induction of the conformational change and high affinity antithrombin binding. Binding specificity additionally requires the 6-O-sulfate groups on residues 1 and 5, the amino group at residue 5 and carboxyl groups at other sites (9). The critical role of the 3-O-sulfate group and the extreme paucity of this substituent within heparan sulfate (5, 7) raise the possibility of a key regulatory role. Indeed, we have recently demonstrated that the enzyme 3-OST-1 performs the rate-limiting biosynthetic reaction that determines cellular production of anticoagulant heparan (10, 11). The enzyme recognizes a specific precursor structure, corresponding to the antithrombin-binding site devoid of just the 3-O-sulfate, and adds this rare substituent to complete the formation of anticoagulant heparan (10). Thus, 3-OST-1 activity controls cellular anticoagulant phenotype. This example raises the possibility that additional heparan sulfate biosynthetic enzymes may function in an analogous fashion, controlling production of other important heparan sulfate fine structures.

The molecular cloning of the cDNA for the precursor protein of 3-OST-1 showed that the enzyme undergoes removal of an amino-terminal leader sequence to generate a Golgi intraluminal resident of ~290 amino acids (12). Most importantly, the carboxyl-terminal ~260 residues have striking homology to the comparable region of the bifunctional biosynthetic enzymes heparan sulfate N-deacetylase/N-sulfotransferase-1 and -2 (~50% similarity to both NST-1 and NST-2), and at least 30% similarity to virtually every type of sulfotransferase enzyme previously identified. Consequently, this conserved structure that spans the majority of 3-OST-1 has been presumptively designated as the sulfotransferase domain (12).

In the present article, we have employed this conserved structure to molecularly clone related cDNAs, which encode homologous carboxyl-terminal sulfotransferase domains but distinct amino-terminal structures. Unlike 3-OST-1, the novel 3-OST-2, 3-OST-3A, and 3-OST-3B enzymes are predicted to have type II integral membrane architecture. An incomplete cDNA encoding the sulfotransferase domain portion of the enzyme 3-OST-4 was also obtained. Comparison of the enzyme structures predicts motifs that may govern sequence specific modification. Northern hybridizations show isoform-specific expression patterns, whereas genomic characterizations identified at least 7 human 3-OST genes. Thus, the 3-OST multigene family is exquisitely suited to encode key enzymes that regulate the production of many distinct heparan sulfate fine structures.

    EXPERIMENTAL PROCEDURES

Isolation of 3-OST-2, 3-OST-3, and 3-OST-4 cDNA Clones-- The National Center for Biotechnology Information data bank of I.M.A.G.E. Consortium (Lawrence Livermore National Laboratory) expressed sequence tag cDNA clones (13) was probed with the deduced sulfotransferase domain region of mouse 3-OST-1 (12), which identified partial length clones that were obtained from the TIGR/ATCC Special Collection (ATCC). Complete sequencing of the inserts revealed three clone categories: 3-OST-2, I.M.A.G.E. Consortium (Lawrence Livermore National Laboratory) Clone ID c-20d10 (GenBank™ accession number F07258) (14) from a normalized library generated from total brain of a 3-month muscular atrophy female (15); 3-OST-3ACTF, Clone ID 284542 (GenBank™ accession number N71828), from a library of 4 multiple sclerosis lesions isolated from a 46-year-old male (13); and 3-OST-4, Clone IDs HIBCX69 (GenBank™ accession number T33472) from human brain (16), IB727 (GenBank™ accession number T03677) from infant brain (17, 18), 166466 (GenBank™ accession number R88592) from adult brain (13), 23279 (GenBank™ accession number T75445) from infant brain, and c-3ie01 (GenBank™ accession number F13088) (from the same library as c-20d10) (14). To obtain full-length clones, we first identified cDNA regions, described below, which would function as isoform-specific probes by hybridizing Southern blots of human genomic DNA with expressed sequence tag fragments ³²P-labeled by random priming. The corresponding fragments were used to screen λ TriplEx brain and liver cDNA libraries (), as described previously (12). Positive plaques were purified, TriplEx-based plasmids were excised in vivo according to the manufacturer's protocol, and inserts were sequenced as described below.

Characterization of cDNA Clones-- The 5' and 3' insert regions were enzymatically sequenced from flanking primer sites of the respective cloning vectors. The remaining sequence of both strands was obtained with internally priming oligonucleotides. Primers were spaced no more than 400 base pairs (bp) apart with a 200-bp offset between + and - strands, thus each nucleotide was detected within 200 bp of a primer. Automated fluorescence sequencing was performed with Perkin-Elmer Applied Biosystems Models 373A and 477 DNA Sequencers. Each reaction typically yielded 400 to 600 bases of high quality sequence.

Computer Analysis of Sequence Data-- DNA sequence files were aligned and compiled with the program Sequencher 3.0 (Gene Codes Corp.). Sequence comparison searches were performed with Gapped BLAST (19) on the data bases of GenBank™, EMBL, DDBJ, PDB, SwissProt, PIR, dSTS, htgs, and dbEST. The following protein features were predicted with the corresponding programs: secondary structure, PHDsec () (20); hydrophobicity (Kyte Doolittle), DNA Strider 1.0; membrane spanning segments, PHDhtm () (21); O-glycosylation sites, NetOGlyc 2.0 () (22). Polyadenylation signals were detected with the Genefinder package (). All additional manipulations were performed with the University of Wisconsin Genetics Computer Group (GCG) sequence analysis software package.

Nucleic Acid Probes-- cDNA libraries were screened with probes containing both sulfotransferase domain and 3'-untranslated region sequences as follows: an EcoRI/XbaI 1.6-kb fragment isolated from c-20d10 (nucleotides 385-1952 of 3-OST-2), an EcoRI/XbaI 1.1-kb fragment from clone 284542 (1-1152 of 3-OST-3ACTF), and a 921-bp EcoRI/BamHI fragment from clone 23279 (1-920 of 3-OST-4). Southern analysis was performed with the following probes, which contain only sulfotransferase domain sequences: ST-1, a KpnI/BspHI 558-bp fragment from pJL30 (12) (428-985 of human 3-OST-1); ST-2, a 521-bp polymerase chain reaction product from pJL-2.7 (683-1203 of 3-OST-2); ST-3, a 192-bp polymerase chain reaction product from pJL-3.7 (1712-1903 of 3-OST-3A and 1199-1390 of 3-OST-3B); ST-4, a BstXI 575-bp fragment from clone 23279 (156-730 of 3-OST-4); and mST-1, an EcoRI/SacII 779-bp fragment from pNWS182 (12) (395-1173 of mouse 3-OST-1). Southern analysis also included polymerase chain reaction-generated probes containing predominantly SPLAG domain (Ser, Pro, Leu, Ala, and Gly enriched domain) sequences (SPLAG-A is 385 bp from 823-1207 of 3-OST-3A, and SPLAG-B is 271 bp from 448-718 of 3-OST-3B) as well as probes containing only 3'-untranslated regions (3'A is 290 bp from 2006-2295 of 3-OST-3A, 3'B is 309 bp from 1511-1819 of 3-OST-3B, and 3'BCTF is 364 bp from 718-1081 of 3-OST-3BCTF). Northern analysis for 3-OST-2 and 3-OST-4 was performed with the same probes as were used for library screening. Northern analysis for the 3-OST-3 species included the above described sulfotransferase domain, SPLAG domain, and 3'-untranslated region-specific probes. All samples were random prime labeled with [α-³²P]dCTP for GC-rich probes or [α-³²P]dATP for AT-rich probes, as described previously (12).

Southern Blot Analysis-- Genomic DNA from human endothelial cells was used for the analysis of human 3OST gene copy number and for genomic restriction mapping studies. Genomic DNA was isolated from 76 plates (150 mm) of primary human umbilical vein endothelial cells (Clonetics Corp., San Diego, CA) grown according to the supplier's protocol. Cells were harvested by trypsinization, washed in phosphate-buffered saline, and pelleted by centrifugation at 1000 × g for 3 min. Cells were lysed by vortexing for 5 min in ice-cold 140 mM NaCl, 1.5 mM MgCl₂, 10 mM Tris, pH 8.0, and 0.5% Triton X-100. Nuclei were collected by centrifugation at 1500 × g for 4 min and resuspended in 3 ml of 150 mM NaCl, 25 mM EDTA, and 10 mM Tris, pH 8.0, then combined with 133 µg of RNase A and 333 µg of proteinase K. Nuclei were lysed with the addition of 3 ml of 0.4% SDS and then incubated at 65 °C for 16 h. Samples were extracted 5 times against 8 ml of phenol/chloroform/isoamyl alcohol (25:24:1). The aqueous phase was combined with 15 ml of isopropyl alcohol, and DNA was harvested by spooling, washed with 80% ethanol, and resuspended in 4 ml of 10 mM Tris, pH 8.0, 1 mM EDTA. Copy number determinations for the mouse 3Ost genes were performed on genomic DNA isolated by the above procedure from the previously described clonal mouse L cell line LTA (10, 23, 24). Dr. Chao Sun (Whitehead Institute) generously provided human DNA samples isolated from peripheral leukocytes of 16 unrelated male individuals for the analysis of the BstXI pattern generated by 3-OST probes.

Typically, 10-µg samples were restriction digested, resolved by 0.8% or 1% agarose gel electrophoresis, and transferred to GeneScreen Plus membranes. Membranes were hybridized for 16 h in 1% SDS, 10% dextran sulfate buffer containing SSC and formamide. The concentrations of these latter two components were adjusted so that all homologous hybridizations were incubated at Tm - 25 °C and all heterologous hybridizations were incubated at Tm - 35 °C, where for a DNA:DNA hybrid Tm = 81.5 °C + (16.6 × log([Na+])) + (41 × percentage GC) - (500 ÷ length of probe template in bp) - (62 × percentage formamide) (25). Membranes were washed in 1% SDS and sufficient SSC to generate a final stringency of at least Tm - 25 °C for homologous hybridizations or Tm - 35 °C for heterologous hybridizations.
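The DNA:DNA melting-temperature relation quoted above lends itself to a short calculation. The following is a minimal sketch rather than the authors' procedure. It assumes that "percentage GC" and "percentage formamide" are entered as fractions (0.5 for 50%), that the logarithm is base 10, and that [Na+] is in mol/liter. The function names and the example conditions are hypothetical.

```python
import math


def tm_dna_dna(na_molar, gc_fraction, probe_len_bp, formamide_fraction):
    """Tm (deg C) of a DNA:DNA hybrid, following the formula in the text.

    Assumption: GC content and formamide are given as fractions (0-1),
    and the logarithm is base 10.
    """
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 41.0 * gc_fraction
            - 500.0 / probe_len_bp
            - 62.0 * formamide_fraction)


def incubation_temp(tm, homologous=True):
    """Hybridization temperature: Tm - 25 (homologous) or Tm - 35 (heterologous)."""
    return tm - (25.0 if homologous else 35.0)


# Hypothetical conditions, not taken from the paper:
# 1 M Na+, 50% GC, 500-bp probe, 50% formamide.
tm_example = tm_dna_dna(1.0, 0.50, 500, 0.50)
homologous_temp = incubation_temp(tm_example, homologous=True)
heterologous_temp = incubation_temp(tm_example, homologous=False)
```

Because the salt and formamide terms are the only ones under experimental control, the same function can be evaluated over a grid of SSC and formamide concentrations to find a combination that places a fixed oven temperature at the desired offset below Tm.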

Northern Blot Analysis-- Tissue and cell type-specific expression of 3-OST forms was assessed with human multiple tissue and human cancer cell line Northern blots, respectively (). Endothelial expression of 3-OST forms was tested on 10 µg of total RNA from immortalized rat fat pad endothelial cells and primary mouse cardiac microvascular cells, as well as 5 µg of poly(A)+ prepared from primary human umbilical vein endothelial cells, as described previously (12). Samples were resolved on 1.2% formaldehyde-agarose gels and capillary transferred to GeneScreen Plus membranes. Membranes were hybridized for 16 h in 1% SDS, 10% dextran sulfate buffer containing SSC and formamide. The concentrations of these latter two components were adjusted so that all homologous hybridizations were incubated at Tm - 25 °C and all heterologous hybridizations were incubated at Tm - 35 °C, where for a DNA:RNA hybrid Tm = 79.8 °C + (18.5 × log([Na+])) + (58.4 × percentage GC) + (11.8 × (percentage GC)²) - (820 ÷ length of probe template in bp) - (50 × percentage formamide) (26). Membranes were washed in 1% SDS and sufficient SSC to generate a final stringency of at least Tm - 25 °C for homologous hybridizations or Tm - 35 °C for heterologous hybridizations.
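The statement that formamide and SSC were adjusted to reach a set offset below Tm can be illustrated by inverting the DNA:RNA relation above. This is a sketch under stated assumptions, not the authors' calculation: GC content and formamide are treated as fractions (0-1), the logarithm is taken as base 10, and the incubation temperature, salt concentration, and probe parameters in the example are hypothetical.

```python
import math


def tm_dna_rna(na_molar, gc_fraction, probe_len_bp, formamide_fraction):
    """Tm (deg C) of a DNA:RNA hybrid, following the formula in the text.

    Assumption: GC content and formamide are fractions (0-1); log is base 10.
    """
    return (79.8
            + 18.5 * math.log10(na_molar)
            + 58.4 * gc_fraction
            + 11.8 * gc_fraction ** 2
            - 820.0 / probe_len_bp
            - 50.0 * formamide_fraction)


def formamide_for_incubation(na_molar, gc_fraction, probe_len_bp,
                             incubation_c, offset_c=25.0):
    """Formamide fraction that places incubation_c at Tm - offset_c.

    Solves Tm(formamide) - offset_c = incubation_c for the formamide term,
    which enters the formula linearly with a coefficient of 50.
    """
    tm_without_formamide = tm_dna_rna(na_molar, gc_fraction, probe_len_bp, 0.0)
    return (tm_without_formamide - offset_c - incubation_c) / 50.0


# Hypothetical example: 0.1 M Na+, 50% GC, 820-bp probe, 42 deg C incubation,
# homologous hybridization (Tm - 25).
formamide = formamide_for_incubation(0.1, 0.50, 820, incubation_c=42.0)
tm_check = tm_dna_rna(0.1, 0.50, 820, formamide)
```

Passing `offset_c=35.0` gives the corresponding value for a heterologous hybridization.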

Interspecific Mouse Back-cross Mapping-- Interspecific back-cross progeny were generated by mating (C57BL/6J × Mus spretus)F1 females and C57BL/6J males as described (27). A total of 205 N2 mice were used to map the 3Ost loci, as described under "Results." DNA isolation, restriction enzyme digestion, agarose gel electrophoresis, Southern blot transfer, and hybridization were performed essentially as described (28). All blots were prepared with Hybond-N+ nylon membrane (Amersham). The employed hybridization probes are described above. The 3Ost1 probe, mST-1, was labeled with [α-³²P]dCTP using a random primed labeling kit (Stratagene); washing was done to a final stringency of 1.0 × SSCP, 0.1% SDS, 65 °C. A fragment of 6.3 kb was detected in ScaI-digested C57BL/6J (B) DNA and a fragment of 8.4 kb was detected in ScaI-digested M. spretus (S) DNA. The 3Ost2 probe, ST-2, detected major BglI fragments of 23.0 (B) and 16.0 (S) kb. The 3Ost3a probe, SPLAG-A, detected major ScaI fragments of 16.5 and 5.4 (B) and 16.5 and 7.0 kb (S). The 3Ost3b probe, SPLAG-B, detected major HincII fragments of 18.0 and 5.0 kb (B) and 9.0 and 5.0 kb (S). Finally, the 3Ost4 probe, ST-4, detected HindIII fragments of 1.7 (B) and 2.4 (S) kb. The presence or absence of the M. spretus-specific fragments was followed in back-cross mice.

A description of the probes and restriction fragment length polymorphisms for the loci linked to the 3Ost genes has been reported previously. These include Adra2c, Msx1, and Bst1, chromosome 5 (29)³; Pth, Pkcb, Spn, and Mgmt, chromosome 7 (29, 31); and Adra1a, Csfgm, Myhsf1, and Trp53, chromosome 11 (32, 33). Recombination distances were calculated using Map Manager, version 2.6.5. Gene order was determined by minimizing the number of recombination events required to explain the allele distribution patterns.
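The two calculations named above, a recombination distance and a gene order chosen to minimize recombination events, can be sketched in a few lines. This is an illustration of the general principle only and is not the algorithm implemented in Map Manager. The locus names, the typing data, and the use of a simple binomial standard error are all hypothetical assumptions.

```python
import math
from itertools import permutations


def recombination_distance(recombinants, total):
    """Recombination frequency in cM and its binomial standard error."""
    r = recombinants / total
    se = math.sqrt(r * (1.0 - r) / total)
    return 100.0 * r, 100.0 * se


def crossover_count(order, animals):
    """Total allele switches between adjacent loci, summed over all animals."""
    return sum(
        sum(animal[x] != animal[y] for x, y in zip(order, order[1:]))
        for animal in animals
    )


def best_order(loci, animals):
    """Locus order requiring the fewest recombination events.

    An order and its reverse are equivalent, so only one orientation is kept.
    """
    candidates = [p for p in permutations(loci) if p[0] < p[-1]]
    return min(candidates, key=lambda p: crossover_count(p, animals))


# Hypothetical typing data: each back-cross animal carries either the
# C57BL/6J (B) or M. spretus (S) allele from its F1 parent at each locus.
animals = [
    {"locusA": "B", "locusB": "B", "locusC": "B"},
    {"locusA": "S", "locusB": "S", "locusC": "S"},
    {"locusA": "B", "locusB": "S", "locusC": "S"},  # crossover between A and B
    {"locusA": "S", "locusB": "S", "locusC": "B"},  # crossover between B and C
]
order = best_order(["locusA", "locusB", "locusC"], animals)
distance_cm, se_cm = recombination_distance(recombinants=5, total=200)
```

Exhaustive enumeration of orders is only practical for a handful of loci, which is sufficient for placing one new gene among a few flanking markers.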

Chromosomal Mapping of Human 3OST Genes-- Data base searching identified bacterial artificial chromosome clones containing human 3OST2, 3OST3A1, and 3OST3B1 genes (GenBank™ accession numbers AC003661, AC002287, AC005411, AC005375, AC005224). Data base searching with a combination of genomic and cDNA sequences identified expressed sequence tag and sequence-tagged site markers (GenBank™ accession numbers G24436, T03677, G21216, and G03581). The chromosomal location of these markers was then determined through the Human Genome Sequencing Index ().

    RESULTS

Isolation and Characterization of cDNAs Encoding 3-OST-2, 3-OST-3, and 3-OST-4 Isoforms

We probed the National Center for Biotechnology Information data base of expressed sequence tag cDNA clones (13) with the deduced amino acid sequence of the presumptive sulfotransferase domain from the human 3-OST-1 cDNA to reveal several human partial length cDNAs encoding novel related species, as described under "Experimental Procedures." Sequencing the contained cDNA inserts confirmed three distinct forms designated as 3-OST-2, 3-OST-3ACTF, and 3-OST-4. Isotype-specific probes were generated, ³²P-labeled, and screened against λ TriplEx human cDNA libraries made from brain (3 × 10⁶ plaques for 3-OST-2 and 3-OST-4) and liver (4.5 × 10⁶ plaques for 3-OST-3) to identify 7, 8, and 4 additional clones of 3-OST-2, -3, and -4 groups, respectively. The contained inserts of the corresponding isolates were completely sequenced, which revealed two forms for 3-OST-2 (-2 and -2CTF) and 4 kinds of 3-OST-3 cDNAs (-3A, -3ACTF, -3B, and -3BCTF) (Fig. 1).⁴ The 3-OST-4 clones overlapped with clone 23279, but were all shorter partial-length clones and so are not presented; thus, the longest obtained 3-OST-4 cDNA contains an incomplete coding sequence. The primary structures of 3-OST-2, -3A, and -3B composite cDNAs are presented in Figs. 2, 3, and 4, whereas the sequence data for the incomplete 3-OST-4 cDNA can be obtained from the GenBank™/EMBL Data Bank. The accompanying article (34) describes the analysis of recombinantly expressed 3-OST-2, -3A, and -3B cDNAs, which confirms that the encoded enzymes specifically 3-O-sulfate glucosaminyl residues within heparan sulfate.


Fig. 1.   cDNAs encoding 3-OST isoforms. Schematic representation of four distinct 3-OST composite cDNAs with boxes representing the protein coding region. Protein regions homologous to the 3-OST-1 putative sulfotransferase domain are cross-hatched, whereas the nonhomologous amino-terminal coding regions encompass cytoplasmic (stippled), hydrophobic (inverse stippled), and SPLAG- (Ser, Pro, Leu, Ala, Gly enriched) (wavy) domains. K indicates the position of a conserved lysine of presumed catalytic function. Within nucleic acid sequences, the sites of putative polyadenylation signals are shown by the inverted triangles. The sulfotransferase domains of 3-OST-3A and 3-OST-3B cDNAs (between α and γ) show nearly identical sequences that differ by only 6 point mutations (|) found in all 3-OST-3B clones between points α and β. Indicated in the middle of each cDNA set are the size and location of individual cDNA inserts, and corresponding plasmid designations. Clones obtained by library screening are designated by the pJL- prefix, whereas clone IDs are given for expressed sequence tag clones. • shows positions of point mutations in 3-OST-2 inserts where the sequence differs from clone pJL-2.7. The bottom of each cDNA set shows the size, location, and designation of hybridization probes.


Fig. 2.   Composite nucleotide and predicted amino acid sequences of 3-OST-2. The cDNA sequence was compiled from the individual 3-OST-2 clones displayed in Fig. 1. The presented structure corresponds to the allelic variant represented by clone pJL-2.7; the four point mutations found in clones pJL-2.6 and c-20d10 are indicated (double underline). Also shown within the nucleic acid sequence are the locations of presumptive polyadenylation signals (single underline). Shown within the amino acid sequence are the hydrophobic region (single underline), the start of the putative sulfotransferase domain (↑), and predicted sites for O-linked (*) and N-linked (dot underlined boldface type) glycosylations.


Fig. 3.   Composite nucleotide and predicted amino acid sequences of 3-OST-3A. The cDNA sequence was compiled from the individual 3-OST-3A clones displayed in Fig. 1. The sequence starting at position 2315 was appended from the 3-OST-3ACTF clone (previously described under Footnote 4) given that this splice variant contains the complete 3'-untranslated region. Shown within the nucleic acid sequence are positions of point mutations which differ between 3-OST-3A and 3-OST-3B (double underline) and the location of presumptive polyadenylation signals (single underline). ↓α, ↓β, and ↓γ are described in the text. Shown within the amino acid sequence are the hydrophobic region (single underline), the start of the putative sulfotransferase domain (↑), and predicted sites for O-linked (*) and N-linked (dot underlined boldface type) glycosylations.


Fig. 4.   Composite nucleotide and predicted amino acid sequences of 3-OST-3B. The cDNA sequence was compiled from the individual 3-OST-3B clones displayed in Fig. 1. Shown within the nucleic acid sequence are positions of point mutations which differ between 3-OST-3A and 3-OST-3B (double underline). ↓α, ↓β, and ↓γ are described in the text. Shown within the amino acid sequence are the hydrophobic region (single underline), the start of the putative sulfotransferase domain (↑), and predicted sites for O-linked (*) and N-linked (dot underlined boldface type) glycosylations.

Table I summarizes the major structural features of all composite cDNA forms. The length of the 5'-untranslated region from the full-length cDNAs varies widely (from 72 to 798 bp) and all ATG codons within this region are followed by in-phase termination codons. For each full-length cDNA, the assigned coding region is by far the longest open reading frame and begins with an initiation ATG conforming to Kozak's consensus sequence (a purine at -3 and/or a G at +4) (35). Moreover, each initiator sequence is preceded by one or more in-phase termination codons. A consensus polyadenylation signal (AATAAA) occurs within 20-30 bp of the 3'-untranslated region termini and is followed by a poly(A) tail for all cDNAs except 3-OST-3B (Fig. 1). This distinction indicates that the 3-OST-3B composite cDNA contains only an incomplete 3'-untranslated region, especially since the cDNA is 4.2 kb shorter than its corresponding transcript (Table I). For the 3-OST-2 cDNA, an alternate site for polyadenylation is also predicted by an extra signal occurring ~200 bp from the 3'-untranslated region termini.
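The Kozak criterion cited above (a purine at -3 and/or a G at +4, with the A of the ATG counted as +1) is simple to test programmatically. The sketch below is illustrative only. The function name and the short test sequences are hypothetical and are not taken from the 3-OST cDNAs.

```python
def conforms_to_kozak(seq, atg_index):
    """Return True if the ATG at atg_index has a purine at -3 and/or a G at +4.

    Positions are numbered with the A of ATG as +1 and the preceding base
    as -1 (there is no position 0), so -3 is atg_index - 3 and +4 is
    atg_index + 3 in 0-based string coordinates.
    """
    seq = seq.upper()
    if seq[atg_index:atg_index + 3] != "ATG":
        raise ValueError("no ATG at the given index")
    minus3 = seq[atg_index - 3] if atg_index >= 3 else ""
    plus4 = seq[atg_index + 3] if atg_index + 3 < len(seq) else ""
    return minus3 in ("A", "G") or plus4 == "G"


# Hypothetical sequences, each with the ATG starting at index 6.
strong_context = conforms_to_kozak("GCCACCATGG", 6)   # A at -3 and G at +4
plus4_only = conforms_to_kozak("TTTTTTATGG", 6)       # only G at +4
weak_context = conforms_to_kozak("TTTTTTATGC", 6)     # neither feature
```

Bases that fall outside the supplied sequence are treated as non-conforming, so an ATG too close to either end is judged only on the position that is available.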

Table I
Comparison of structural features of 3-OST isoform cDNAs

The composite 3-OST-2 cDNA sequence presented in Fig. 2 was derived from clones pJL-2.1, pJL-2.2, pJL-2.3, and pJL-2.7 (Fig. 1). However, clones c-20d10 and pJL-2.6 both differ from the composite cDNA sequence at four positions (G804→A, T1249→G, T1350→C, C1507→T) (Figs. 1 and 2). These two clones were isolated from different libraries and so the sequence discrepancies could not have possibly arisen by errors in reverse transcription or cDNA amplification. Given that the human 3OST2 gene is single copy, described below, these differences indicate allelic variation. The G804→A transition is the only coding region variant, but does not alter the amino acid sequence. The remaining mutations are found in the 3'-untranslated region; thus, all mutations may be silent.

Most importantly, significant nucleic acid sequence conservation only occurs for the sulfotransferase domain portion of the cDNAs. Within this span, each of the novel cDNAs exhibits ~55% identity to 3-OST-1. However, sulfotransferase domains share ~72% identity between 3-OST-2, -3, and -4 classes. Conservation is most extreme between 3-OST-3A and 3-OST-3B, with 99.2% identity over 774 bp that encodes the entire sulfotransferase domain region of 3-OST-3B (between α and γ in Figs. 3 and 4). Immediately after this shared region the 3-OST-3A coding sequence extends two codons (Gly and Stop), whereas the 3-OST-3B cDNA just encodes a Stop codon. Thus, the predicted sulfotransferase domain of 3-OST-3A is 1 amino acid longer than that of 3-OST-3B. The nearly identical regions could have resulted from a single genetic locus by alternative splicing, but only if the nonidentical residues stem from allelic variation. However, this possibility is statistically unlikely (p = 0.016).⁵ Alternative splicing is completely excluded by genomic restriction mapping, which reveals separate 3-OST-3A and 3-OST-3B genes, as described below (Fig. 6). The profound conservation of a genomic segment between distinct loci is indicative of gene conversion, as described below.
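The 99.2% figure follows directly from the 6 point differences noted in the legend to Fig. 1 and the 774-bp length of the shared region. The sketch below reproduces that arithmetic on placeholder sequences. The sequences are synthetic stand-ins with mismatches at arbitrary positions, not the actual 3-OST-3A and 3-OST-3B sequences, and the function name is hypothetical.

```python
def percent_identity(seq_a, seq_b):
    """Percent identity between two pre-aligned, ungapped sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be the same length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# Synthetic placeholders: a 774-base sequence and a copy carrying 6
# substitutions at arbitrary positions (the real positions are not used here).
region_a = "A" * 774
region_b = list(region_a)
for position in (10, 100, 200, 300, 400, 500):
    region_b[position] = "C"
region_b = "".join(region_b)

identity = percent_identity(region_a, region_b)   # (774 - 6) / 774 * 100
codons = 774 // 3                                 # codons in the shared region
```

The 774 bp correspond to 258 codons, in line with the ~260-residue sulfotransferase domain described in the abstract.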

Characterization of 3OST Genomic Loci

Four 3OST3 Genes Exist-- The genomic loci of the various 3OST genes were characterized to identify the origins of these structurally related cDNAs. The copy number of all known 3OST genes was assessed by Southern blot analysis of human genomic DNA with gene specific probes (Fig. 5A). This analysis suggests that 3OST1, 3OST2, and 3OST4 only occur as single copy genes. Heterologous hybridization of the same probes to mouse genomic DNA separately digested with the same 5 restriction enzymes, described under Fig. 5, yielded comparable results (data not shown). The combined analyses strongly argue that both humans and mice possess only single copies of 3OST1, 3OST2, and 3OST4 genes.


Fig. 5.   Determination of 3-OST gene copy number. Southern blots of human genomic DNA. A, BstXI or EcoRI digests were sequentially hybridized to gene-specific probes respectively generated from the sulfotransferase domains of 3-OST-1, -2, -3, and -4, as described under "Experimental Procedures" and shown in Fig. 1 (ST probes). Note that gene copy number could be overestimated if the hybridization region is bisected by an unanticipated intron. To minimize this possibility, relatively short probes (192-575 bp) were selected. Moreover, genomic restriction mapping confirms the target regions of all 3OST3 genes are devoid of latent introns (Fig. 6). The 3-OST-3 probe, ST-3, detects both 3OST3A and 3OST3B bands, as sulfotransferase domain sequences are identical for both genes. The origins of these bands were determined by hybridization to 3'A or 3'B probes (Fig. 1), which are specific to the respective 3'-untranslated regions. Similar data were also generated with genomic DNA digested with BamHI, PstI, and PvuII. B, duplication of amino-terminal sequences was shown by hybridizations with SPLAG-A or SPLAG-B probes (Fig. 1). For each band, the gene(s) of origin were determined by comparing the observed size to that expected from the 3OST3A1 or 3OST3B1 gene sequence. Note that BstXI bisects the SPLAG-B probe, consequently the 3OST3B1 gene generates two bands (3,275 and 314 bp). The absence of a BstXI site in 3OST3B2 shifts the expected lower band up to 1967 bp.

Southern analysis targeting distinct gene regions reveals the human 3-OST-3 multigene subfamily. Sulfotransferase domain sequences common to all 3-OST-3 cDNAs were detected with the probe ST-3, whereas 3'-untranslated regions specific to 3-OST-3A or 3-OST-3B cDNAs were detected with probes 3'A or 3'B, respectively (probe locations shown in Fig. 1). The existence of at least two 3OST3 genes was initially suggested by hybridizations to EcoRI-digested genomic DNA. ST-3 revealed two bands, one exclusively detected by 3'A and the other identified only by 3'B (Fig. 5A). Indeed, we have recently identified genomic clones of two distinct genes 3OST3A1 and 3OST3B1, as noted under "Experimental Procedures." However, BstXI digestions suggest greater complexity as ST-3 displayed three bands of about 2.0, 1.1, and 0.5 kb in a 1:2:1 stoichiometry, respectively. 3'A detected both of the weak bands; whereas 3'B identified only the strong band (Fig. 5A). This pattern could only result from just two 3OST3 genes if a single copy 3OST3A gene has an allelic BstXI restriction fragment length polymorphism. Alternatively, if such an allelic polymorphism is not present, then the pattern must result from a minimum of four 3OST3 genes with BstXI sites differing in the two 3OST3A forms but being invariant in the two 3OST3B copies. The possibility of only two 3OST3 genes was excluded by ST-3 probing of BstXI-digested genomic DNA from an additional 16 unrelated individuals. In contrast to an allelic segregation pattern, all samples generated the identical 1:2:1 band pattern described above (data not shown).
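The weight of the 16-individual result can be gauged with a rough bound. The sketch below is not part of the authors' analysis and rests on assumptions that are not in the paper: a two-allele BstXI polymorphism at a single 3OST3A locus, Hardy-Weinberg proportions, and independent sampling of unrelated individuals. Under the two-gene hypothesis the 1:2:1 pattern would require a heterozygote, and the heterozygote frequency 2pq cannot exceed 0.5.

```python
def heterozygote_frequency(p):
    """Hardy-Weinberg heterozygote frequency 2pq for allele frequency p."""
    return 2.0 * p * (1.0 - p)


def max_prob_all_heterozygous(n_individuals):
    """Upper bound on the chance that n unrelated individuals are all heterozygous.

    2pq is maximized at p = 0.5, where it equals 0.5, so the bound is 0.5 ** n.
    """
    return heterozygote_frequency(0.5) ** n_individuals


# 16 unrelated individuals, as in the BstXI survey described in the text.
bound_16 = max_prob_all_heterozygous(16)
```

Even at the most favorable allele frequency, the chance that every sampled individual would be heterozygous is on the order of 1 in 65,000, which is consistent with the conclusion that the invariant pattern reflects separate genes rather than alleles.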

Duplication of the amino-terminal portions of 3OST3A and 3OST3B genes was also documented by Southern analyses with the isoform-specific probes SPLAG-A and SPLAG-B, respectively (Fig. 5B). Only some of the detected fragments were predicted from a computer-generated restriction map of 3OST3A1 and 3OST3B1 gene sequences, described above. Consequently, these fragments (A1 and B1 in Fig. 5B) are derived from 3OST3A1 and 3OST3B1. The additional unanticipated bands (A2 and B2) document the duplicated amino-terminal regions of genes 3OST3A2 and 3OST3B2. Single bands were occasionally detected (A1 & 2 and B1 & 2 in Fig. 5), which indicates conservation of sequence within a gene pair. We conclude from the above data that the human genome contains two 3OST3A genes and two 3OST3B genes.

Inspection of genomic clone sequence also reveals that discriminating 0.5- and 2.0-kb BstXI fragments are derived from 3OST3A1 and 3OST3A2 genes, respectively. Similarly, a BamHI polymorphism between ST-3 and 3'B differentiates 3OST3B1 from 3OST3B2.6 Examination of the individual cDNA inserts that encompass these defining regions shows the 3-OST-3A clone pJL-3.4 derives from 3OST3A1, whereas the 3-OST-3B clones pJL-3.6, -3.7, and -3.9 all originate from 3OST3B1. However, the limited number of analyzed clones is insufficient to exclude functionality of 3OST3A2 and 3OST3B2. Moreover, it remains unclear whether each gene pair produces identical or distinct products.

Southern analysis does not always resolve each member of a human 3OST3 gene pair (e.g. EcoRI of Fig. 5), suggesting a high degree of sequence homology between each pair of copied genes. Accordingly, we assessed the extent of similarity by performing genomic restriction mapping on the 3'-untranslated regions of 3OST3A and 3OST3B forms, given that 3'-untranslated region sequences are typically divergent, even within multigene families (36). The data demonstrate a high degree of identity in the 3'-untranslated regions of each pair of copies; indeed, discrimination between members of each gene pair was not observed with any of the employed enzyme combinations (Fig. 6). This suggests either a very late duplication of 3OST3A and 3OST3B forms, or a concerted mechanism, i.e. gene conversion, to maintain primary structures. We note the murine genome must contain at least one copy of both forms,7 which indicates that human 3OST3A and 3OST3B genes cannot have resulted from late duplication. Accordingly, the human 3OST3 genes have apparently been subjected to gene conversion. At the minimum, gene conversion homogenizes the sulfotransferase domain sequences between human 3OST3A1 and 3OST3B1 loci. It is even possible that conversion maintains the 3' structural similarities between the 3OST3A gene pair and between the 3OST3B gene pair.


Fig. 6.   Genomic restriction mapping of 3'-untranslated regions of 3OST3A and 3OST3B gene pairs. Restriction mapping of genomic DNA by Southern hybridization with isoform-specific probes reveals a conservation in 3'-untranslated region restriction sites for 3OST3A and 3OST3B gene pairs. Blots were consecutively hybridized to 3'-specific probes (3'A and 3'B), and the 3-OST-3 common probe (ST-3). Genomic DNA was double digested with BstEII (sulfotransferase domain cleavage) in conjunction with enzymes that cleave near the known 3' limit of 3-OST-3A (EcoRV, BclI) or 3-OST-3B (BsaI, AvrII) cDNAs. The obtained fragments indicate the distance of each 3' site from BstEII. Restriction site position was confirmed by triple digests supplemented with EcoNI. Each 3'-specific probe detected single fragments for all tested enzyme combinations. Thus, both members of a gene pair have identical restriction site maps for these enzymes. Upper, deduced restriction maps for the 3'-portions for 3OST3A and 3OST3B gene pairs. Regions that are colinear with 3-OST-3A and 3-OST-3B cDNA sequences are schematically represented as described in the legend to Fig. 1. Presented are the size and location of hybridization probes. Lower, detected restriction fragments. Thick lines indicate anticipated fragments predicted from the respective cDNA structures, whereas thin lines represent novel products generated by scission of downstream unknown sequences.

Chromosomal Localization of Mouse 3Ost Loci-- The mouse chromosomal location of each 3Ost locus was determined by interspecific back-cross analysis using progeny derived from matings of [(C57BL/6J × M. spretus)F1 × C57BL/6J] mice. This interspecific back-cross mapping panel has been typed for over 2500 loci that are well distributed among all the autosomes as well as the X chromosome (27). C57BL/6J and M. spretus DNAs were digested with several enzymes and analyzed by Southern blot hybridization for informative restriction fragment length polymorphisms using cDNA probes specific for each gene. The strain distribution pattern of each polymorphism in the interspecific back-cross mice was then determined and used to position the 3Ost loci on the interspecific map (Fig. 7).


Fig. 7.   Murine chromosomal location of the 3Ost loci. The segregation patterns of the 3Ost loci and their flanking genes in back-cross animals that were typed for all loci are shown at the top of each chromosome. For individual pairs of loci, more animals were typed (see text). Each column represents the chromosome identified in the back-cross progeny that was inherited from the (C57BL/6J × M. spretus) F1 parent. The shaded boxes represent the presence of a C57BL/6J allele and white boxes represent the presence of a M. spretus allele. The number of offspring inheriting each type of chromosome is listed at the bottom of each column. Partial chromosome 5, 7, and 11 linkage maps indicating the location of the 3Ost loci in relation to linked genes are shown at the bottom of the figure. The number of recombinant N2 animals over the total number of N2 animals typed plus the recombination frequency, expressed as genetic distance in centimorgans (±S.E.), is shown for each pair of loci to the left of each chromosome map. When no recombination was detected between loci, the upper 95% confidence limit of the recombination distance is given in parentheses. The positions of loci in human chromosomes, where known, are shown to the right. References for the human map positions of loci cited in this study can be obtained from The Genome Data Base, a computerized data base of human linkage information maintained by The William H. Welch Medical Library of The Johns Hopkins University (Baltimore, MD).

3Ost1 mapped to the proximal region of mouse chromosome 5, 0.5 centimorgan distal of Msx1 and 3.7 centimorgan proximal of Bst1. 3Ost2 and 3Ost4 mapped to the distal region of chromosome 7: 3Ost2 did not recombine with Pkcb in 165 animals typed in common, suggesting that the two loci are within 1.8 centimorgans (upper 95% confidence limit), and 3Ost4 mapped 2.3 centimorgans distal of this cluster and 0.7 centimorgans proximal of Spn. Finally, 3Ost3a and 3Ost3b mapped to the central region of mouse chromosome 11 and did not recombine with each other in 141 mice typed in common, suggesting the two loci are within 2.1 centimorgans of each other (upper 95% confidence limit). The cluster of the two murine 3Ost3 genes is 3.8 centimorgans distal of Csfgm and 2.4 centimorgans proximal of Myhsf1 on mouse 11. The very tight linkage between 3Ost3a and 3Ost3b suggests that the genes arose by a tandem duplication event.
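The upper 95% confidence limits quoted above for locus pairs with no observed recombinants are consistent with the standard exact binomial bound, 1 - 0.05^(1/n) for n animals typed in common. The following Python sketch illustrates that calculation. The text does not state which formula was actually used, so the choice of bound is an assumption, and the function name is hypothetical.

```python
def upper_limit_cm(n_typed, confidence=0.95):
    """Upper confidence limit (in centimorgans) on the recombination
    distance when zero recombinants are seen among n_typed animals.

    Assumption: exact binomial bound, i.e. the largest recombination
    fraction r for which (1 - r) ** n_typed >= 1 - confidence.
    """
    if n_typed <= 0:
        raise ValueError("n_typed must be positive")
    r_max = 1.0 - (1.0 - confidence) ** (1.0 / n_typed)
    # Over these short distances, 1% recombination is treated as 1 centimorgan.
    return 100.0 * r_max


if __name__ == "__main__":
    # Animal counts are taken from the text; the pairing labels are for display only.
    for pair, n in (("3Ost2/Pkcb", 165), ("3Ost3a/3Ost3b", 141)):
        print(f"{pair}: {upper_limit_cm(n):.1f} cM")
```

Under this assumption, 165 and 141 animals give limits of about 1.8 and 2.1 centimorgans, in agreement with the values reported for 3Ost2/Pkcb and 3Ost3a/3Ost3b.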

We have compared our interspecific map of chromosomes 5, 7, and 11 with a composite mouse linkage map that reports the map location of many uncloned mouse mutations (provided from Mouse Genome Data Base, a computerized data base maintained at The Jackson Laboratory, Bar Harbor, ME). The 3Ost loci mapped in regions of the composite map that lack mouse mutations with a phenotype that might be expected for an alteration in these loci (data not shown).

The proximal region of mouse chromosome 5 shares a region of homology with human chromosome 4p (Fig. 7). Our placement of 3Ost1 in this interval suggests that the human isolog 3OST1 will map to 4p, as well. The distal region of mouse chromosome 7 shares regions of homology with human chromosomes 11p, 16p, and 10q. Both Pkcb and Spn have been mapped to 16p in human chromosomes. The tight linkage in mouse between Pkcb and Spn, and 3Ost2 and 3Ost4 suggests that the human isologs 3OST2 and 3OST4 will also map to human 16p. Indeed, the identification of cloned mapping markers confirms that 3OST2 and 3OST4 localize to human 16p12 and 16p11.2, as described under "Experimental Procedures." Similarly, 3Ost3a and 3Ost3b map between Csfgm and Myhsf1 in mouse. These two latter genes have been assigned to 5q31 and 17pter-p11 in humans, respectively, which suggests the human 3OST3 genes will map to 5q or 17p. The identification of cloned markers resolves this ambiguity and shows that 3OST3A1 and 3OST3B1 both localize to 17p12-p11.2. The 3OST chromosomal regions lack human disorders with a phenotype that might be expected for an alteration in these loci (data not shown).

Tissue and Cell-type Specific Expression of Multiple Transcripts

Northern analyses with isoform-specific probes reveal tissue-specific expression for members of the multigene family (Fig. 8). Moreover, the more ubiquitously expressed members produce multiple transcripts that predominantly show coordinate regulation. 3-OST-4 exhibits the most selective pattern with only a single transcript detected in brain. The transcripts of 3-OST-2 predominantly occur in brain, but low expression is also observed in heart, placenta, lung, and skeletal muscle. Levels of the two 3-OST-1 transcripts are predominant in kidney and brain, intermediate in heart and lung, and low but detectable in the remaining analyzed organs (Fig. 8). The 3-OST-3 forms show the broadest expression pattern and the largest number of transcript forms. Although most tissues express both 3-OST-3A and 3-OST-3B, quantitative differences are evident. For example, the highest expression of 3-OST-3A occurs in heart and placenta, whereas 3-OST-3B is most abundant in liver and placenta. Furthermore, each tissue exhibits a distinct ratio of 3-OST-3A subtypes and 3-OST-3B subtypes. For 3-OST-2, 3-OST-3A, and 3-OST-3B the small transcripts of minor abundance are alternative splice variants that encode the unusual carboxyl-terminal fragments. The characterization of these and additional 3-OST-3 transcript classes shall be provided in a separate communication (as described above).


Fig. 8.   Tissue-specific expression of human 3-OST isoforms. Northern blot analysis was performed on 5 µg of poly(A)+ RNA isolated from various human organs (). Displayed are transcript sizes determined by co-electrophoresis of mRNA size standards. Membranes were hybridized with 32P-labeled isoform-specific probes, as described under "Experimental Procedures." In particular, 3-OST-3A and 3-OST-3B isoforms were detected with the 3'-untranslated region-specific probes. Note that this analysis must at least detect products of 3OST3A1 and 3OST3B1 genes, but potentially may also reveal transcripts from the structurally similar 3OST3A2 and 3OST3B2 genes, respectively. Note that with ST-3, the 3-OST-3 common sulfotransferase domain probe, the liver 2.6-kb 3-OST-3A transcript and the 6.2-kb 3-OST-3B transcript exhibited equal intensities, which provides a reference point for comparing 3-OST-3A and 3-OST-3B expression levels.

Interestingly, 3-OST-3 versus 3-OST-2 and 3-OST-4 transcripts show essentially reciprocal tissue expression. In contrast, the tissue-specific pattern of 3-OST-1 has overlap with all other types. However, Northern analysis of RNA samples from immortalized and primary endothelial cells that have previously demonstrated 3-OST-1 transcripts (12) failed to detect expression of 3-OST-2, -3, or -4 isoforms with specific probes (data not shown). Thus, 3-OST forms are also expressed in a cell type-specific fashion. Indeed, a Northern survey of several immortalized nonendothelial cell lines with 3-OST-3 probes reveals cells that express exclusively 3-OST-3A, exclusively 3-OST-3B, or varying proportions of both transcript types.8

Predicted Protein Structures

Extensive data base searching revealed the full-length 3-OST-2, -3A, and -3B enzymes and the partial length 3-OST-4 sequence to all be novel proteins. The 3-OST-2, -3A, and -3B cDNAs predict type II integral membrane proteins (37) of 367, 406, and 390 residues, respectively. Each is comprised of four domains beginning with a short (19-32 residues) amino-terminal cytoplasmic tail that exhibits a net positive charge (3-OST-2, -3A, and -3B contain 32, 12, and 19% basic residues but only 0, 4.2, and 3.1% acidic residues, respectively) and terminates with 2 or 3 basic residues (Figs. 2, 3, and 4). Interestingly, this segment of 3-OST-3B contains a polyproline run of 7 residues (residues 22-28).

The second domain is hydrophobic, has a high probability of forming α-helix, and is flanked by charged residues; thus, it is anticipated to function as a membrane spanning segment (Figs. 2, 3, and 4). Kyte-Doolittle hydropathy analysis reveals this section to be the only hydrophobic region of sufficient length to cross a membrane. The lengths of the hydrophobic regions of 3-OST-2 and 3-OST-3A (22 and 19 residues, respectively) are typical for transmembrane domains; however, 3-OST-3B has a 33-amino acid stretch of hydrophobic groups. Although the extent to which the 3-OST-3B hydrophobic region is buried in the membrane is presently unclear, sequence analysis with trained neural networks favors a transmembrane helix extending from Leu35 to Gly53 (21). Interestingly, the hydrophobic regions contain 3, 2, and 5 Cys residues (3-OST-2, -3A, and -3B, respectively), which is atypical of transmembrane domains.
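A Kyte-Doolittle hydropathy scan of the kind mentioned above can be sketched in Python as a sliding-window average of per-residue hydropathy indices. The window length (19) and threshold (1.6) used here are commonly cited settings for detecting membrane-spanning segments and are assumptions; the parameters actually used in this study are not stated. The function names and the test sequence are hypothetical.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=19):
    """Mean hydropathy of every full-length window along seq."""
    seq = seq.upper()
    if window < 1 or window > len(seq):
        return []
    scores = [KD[aa] for aa in seq]
    return [sum(scores[i:i + window]) / window
            for i in range(len(seq) - window + 1)]


def hydrophobic_windows(seq, window=19, threshold=1.6):
    """0-based start positions of windows whose mean exceeds threshold."""
    profile = hydropathy_profile(seq, window)
    return [i for i, mean in enumerate(profile) if mean > threshold]


if __name__ == "__main__":
    # Hypothetical sequence: basic tail, 20-residue Leu stretch, polar stem.
    toy = "MKRRK" + "L" * 20 + "DEKSG"
    print(hydrophobic_windows(toy))
```

Windows that exceed the threshold mark candidate membrane-spanning stretches; a protein with a single such stretch near the amino terminus would fit the type II topology described here.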

The third domain ranges from 67 to 104 residues, and is designated as the SPLAG domain due to an extreme enrichment in Ser, Pro, Leu, Ala, and Gly (comprising 69, 62, and 70% of third domain residues in 3-OST-2, -3A, and -3B, respectively). Consequently, this region is predicted to be predominantly devoid of secondary structure, with only 4.4, 10, and 13% of contained residues having potential to form α-helix or β-sheet, for 3-OST-2, -3A, and -3B, respectively. Thus, this segment is likely to act as a flexible stem which links the catalytic sulfotransferase domain to the membrane anchor. Only the stem region of 3-OST-2 contains cysteines (two residues present), with Cys55 and Cys73 potentially forming a disulfide bond that generates a peptide loop of 19 amino acids (Fig. 2). Within the SPLAG domain, 3-OST-2 contains a single potential N-glycosylation site but all enzymes harbor potential O-glycosylation sites (5, 2, and 6 sites for 3-OST-2, -3A, and -3B) with mucin-like clustering (Figs. 2-4). A similarly high enrichment of SPLAG residues occurs in the amino-terminal stretch that abuts the sulfotransferase domain of the intraluminal resident 3-OST-1, and also in the putative stem regions of the type II structured NST-1, NST-2, and heparan sulfate D-glucosaminyl 6-O-sulfotransferase (6-OST) but not in the stem of heparan sulfate uronosyl 2-O-sulfotransferase (2-OST) (SPLAG abundance 50, 59, 63, 52, and 21% in residues 21-52, 40-78, 43-83, 23-69, and 28-65, respectively, accession numbers given under Fig. 9). Despite the shared composition, the SPLAG domains of these enzymes do not show significant homology of the primary sequences.
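The SPLAG abundance values quoted above are simple compositional percentages over a stated residue span. A minimal Python sketch of that calculation follows; the function names are hypothetical, residue spans are taken as 1-based and inclusive, and the example sequences are invented rather than drawn from any 3-OST enzyme. The same helper can score other residue classes, such as the percentages of basic or acidic residues in a segment.

```python
def residue_percent(seq, residues):
    """Percentage of positions in seq occupied by any residue in `residues`."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    wanted = set(residues.upper())
    return 100.0 * sum(aa in wanted for aa in seq) / len(seq)


def splag_percent(seq, start=None, end=None):
    """SPLAG (Ser, Pro, Leu, Ala, Gly) abundance over a residue span.

    start and end are 1-based and inclusive; omitted bounds default
    to the ends of the sequence.
    """
    lo = 0 if start is None else start - 1
    hi = len(seq) if end is None else end
    return residue_percent(seq[lo:hi], "SPLAG")


if __name__ == "__main__":
    toy = "KKSPLAGKK"  # hypothetical sequence
    print(splag_percent(toy))        # whole sequence
    print(splag_percent(toy, 3, 7))  # residues 3-7 only
```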


Fig. 9.   Comparison of sulfotransferase domains. A, the program "Pileup" was used to align amino acid sequences of the HS1 type sulfotransferase domains from the human 3-OST and NST isoforms, described below. Indicated are sequence spans highly conserved among virtually all sulfotransferases that are predicted to serve in 5'-phosphate binding/lysyl catalysis (··K··), 3'-phosphate binding (··3' PO4··), and possibly 5'-sulfate interaction (··? 5' SO4). Also shown is the presumptive cystine-bridged region that is conserved among HS1 type sulfotransferase domains ({··C··C··}). Consensus residues (shaded) are indicated for each position where at least 5 candidates exhibit identical or similar amino acids. Numeration is given for each full-length enzyme and for a consensus sequence. B, a dendrogram comparing the relatedness of sulfotransferase domains from all known and potential heparan sulfate sulfotransferases was generated by calculating the average similarity of aligned amino acid sequences from the respective carboxyl-terminal regions (residue spans in brackets), as determined by the GCG program "Distances." Prefixes indicate organism of origin: ce, C. elegans; cg, Cricetulus griseus; cl, Cricetulus longicaudatus (Chinese hamster); dm, Drosophila melanogaster; h, Homo sapiens; m, M. musculus; r, Rattus norvegicus. The analysis includes the known carboxyl-terminal 250 residues of 3-OST-4, h3-OST-1 (GenBank™ accession number AF019386), m3-OST-1 (AF019385) (12), and the presumptive homolog ce3-OST; hNST-1 (U36600) (39), rNST-1 (M92042) (45), hNST-2 (U36601), mNST-2 (U02304) (44, 46), and presumptive homolog ceNST (U52002) (64); cl2-OST (D88811) (40) as well as presumptive homologs dm2-OST (SD, X60218) (30) and ce2-OST; h6-OST (AB006179) and cg6-OST (AB006180) (49).
BLAST and Genefinder analysis of genomic cosmids predicts ce3-OST to be an intraluminal resident protein of 291 residues, encoded by 4 exons (clone F52B10, GenBank™ accession number U41990; residues 26317-26090, 21886-21732, 21682-21395, and 21345-21140); whereas, ce2-OST is likely a type II integral membrane protein of 324 amino acids with coding regions encompassed within 8 exons (clone C34F6, GenBank™ accession number Z81479; residues 17495-17358, 17303-17199, 17150-17044, 16952-16817, 16505-16408, 16361-16198, 15714-15573, and 14947-14864), respectively (64).

The final region of ~260 residues extends to the carboxyl terminus and is the putative sulfotransferase domain. Although the 3-OST-2, -3A, and -3B enzymes all show a common regional organization, only the primary structures of the sulfotransferase domain show significant homology (Fig. 9A). Indeed, the 3-OST-3A and 3-OST-3B sulfotransferase domains are almost identical, except the 3-OST-3A form contains an additional carboxyl-terminal residue (Gly406). As described above, this identity results from the 3-OST-3A and 3-OST-3B cDNAs exclusively sharing a common sulfotransferase domain sequence. The entire sulfotransferase domain is extremely basic (about 20% His, Lys, Arg versus 10% Glu and Asp); however, this region does not exhibit previously recognized heparin binding motifs (38). Only two cysteine residues are present, which are closely spaced and could form a disulfide bond to generate peptide loops of 13 amino acids, respectively (Figs. 2-4). The 3-OST-2 and the common 3-OST-3 domains contain 3 and 2 potential sites for N-glycosylation but all show a single potential O-glycosylation site. Interestingly, all 3-OST enzymes show a conserved potential N-glycosylation signal just before the potential peptide loop (Fig. 9A, consensus residues 214-216).

    DISCUSSION

The 3-OST Multigene Family and Heparan Diversity-- Heparan sulfate proteoglycans bearing glycosaminoglycans with distinct fine structures have been implicated in a myriad of biologic roles; however, the means to independently regulate the production of such a broad array of functionally important structures has remained largely unclear. Indeed, such a mechanism is only exemplified by the rate-limiting action of 3-OST-1. To find new candidates for regulating heparan sulfate structure, we identified expressed sequence tag clones homologous to the sulfotransferase domain of 3-OST-1 and subsequently isolated human cDNAs encoding 3-OST-2, -3A, -3B, and an incomplete clone of 3-OST-4. We also obtained novel splice variants encoding carboxyl-terminal fragments, which shall be separately described. Southern analyses revealed a surprisingly extensive multigene family, with 7 human members (3OST1, 3OST2, 3OST3A1, 3OST3A2, 3OST3B1, 3OST3B2, and 3OST4). However, the functionality of 3OST3A2 and 3OST3B2 remains to be established. Localization of the mouse isologs (3Ost1, 3Ost2, 3Ost3a, 3Ost3b, and 3Ost4) and bioinformatic identification of cloned markers predict the chromosomal loci of the corresponding human genes. These analyses suggest that the human genes are not candidates for previously mapped genetic disorders.

Northern analyses show that the human 3-OST genes are differentially regulated in both tissue and cell type-specific fashions, testifying to distinct functional roles. Moreover, multiple transcript sizes occur for most isoforms. Multiplicity has also been observed for the transcripts of heparan biosynthetic enzymes NST-1, 2-OST, and uronosyl C5-epimerase (39-41). Additional mRNAs might engender enhanced regulatory control or distinct functional properties. On one hand, the two 3-OST-1 messages probably differ by alternative splicing within the 5'-untranslated region, which occurs extensively for the murine counterpart (12). Such differences in noncoding regions can provide for differential regulation of translational efficiency or message accumulation (42, 43). On the other hand, alternative splicing within the coding region produces minor transcript variants of 3-OST-2, -3A, and -3B, which encode carboxyl-terminal fragments that likely serve a nonenzymatic function. Presumably, the large number of 3-OST-3 transcripts implies participation in several biologic processes.

Distinct biologic roles for each isoform are also indicated by our elucidation that 3-OST-1, -2, and -3 forms each generate unique 3-O-sulfated structures (34). Given the paucity of 3-O-sulfated glucosaminyl residues within heparan sulfate (7, 23), the novel isoforms may mimic 3-OST-1 by functioning in a critical rate-limiting capacity (5, 10). The newly isolated enzymes should then serve as key regulatory components that enhance the functional diversity of heparan sulfate. We speculate that 3-OST-2 may play a role in the nervous system, whereas the 3-OST-3 isoforms might contribute to the permselectivity of the glomerular basement membrane (elaborated in Ref. 34). However, the extreme complexity of the multigene family suggests these enzymes may serve to modulate a rather diverse array of biologic functions.

Structural Features of the Divergent Amino-terminal Region-- Examination of the deduced structures of the novel enzymes reveals several common as well as distinctive features and provides a foundation for exploring the molecular basis of heparan sequence diversity. The 3-OST-2, -3A, and -3B enzymes are type II integral membrane proteins and so are structurally comparable to all previously cloned glycosaminoglycan biosynthetic enzymes except for 3-OST-1, which has an intraluminal resident style (12, 40, 41, 44-49). The architecture of type II enzymes is akin to that of the glycosyltransferases (46), which show two major functional regions. The large carboxyl-terminal region accounts for most of the intraluminal portion and forms a globular catalytic domain. The smaller amino-terminal region encompasses the cytoplasmic, transmembrane, and flexible stem domains; however, residues from each of these regions have been shown to direct localization to Golgi subcompartments (50). Thus, the entire amino-terminal region may be considered in terms of compartmentalization and protein-protein interactions.

The 3-OST family parallels this division via the conserved carboxyl-terminal sulfotransferase domain and the divergent amino-terminal regions. That these two regions may be functionally discrete is supported by examination of the presumptive Caenorhabditis elegans 3-OST. In this organism, we have identified only a single gene and the encoded enzyme shows features of a primordial 3-OST. Specifically, the sulfotransferase domain is most closely related to the type II enzymes (Fig. 9B); however, the amino-terminal domain shows an intraluminal resident style like 3-OST-1.9 If this hybrid structure represents the primordial enzyme, then the type II amino-terminal domain must have evolved long after the elaboration of a functional sulfotransferase domain. Functional distinctiveness is also favored by the determination that 3-OST-3A and 3-OST-3B generate identical 3-O-sulfated disaccharides (34). Thus, sulfation specificity corresponds to the nearly identical sulfotransferase domains and is not perturbed by the unique amino-terminal regions.

That the amino-terminal region serves a compartmentalization/protein interaction role is supported by an analysis of NST-1, which occurs in the trans-Golgi network. The amino-terminal 161 residues are sufficient for retention within the Golgi (51). Within this region of NST-1, NST-2, and 6-OST the flexible stem shows a SPLAG enrichment comparable to the 3-OST stem region (SPLAG domain). However, the absence of such enrichment in 2-OST suggests that extreme SPLAG enrichment is not exclusively necessary for conveying flexibility and the SPLAG domain may thereby participate in an additional process, such as compartmentalization. Such a role could also account for the intraluminal retention of 3-OST-1, which is simply composed of an amino-terminal SPLAG domain fused to a carboxyl-terminal sulfotransferase domain. Compartmentalization/protein interactions may additionally involve residues within the transmembrane region or the cytoplasmic tail. In the first case, the unusual placement of cysteine residues within the transmembrane segment of 3-OST-2, -3A, and -3B raises the possibility of a covalent interaction with a retention partner or with biosynthetic components. Such a role has previously been proposed for the conserved cysteine residue that occurs in the membrane spanning domain of the syndecan-1 core protein (52). In the second case, the cytoplasmic tail of 3-OST-3B contains a polyproline tract. Poly-L-proline can form a rigid left-handed-helix and such motifs are critical elements bound by protein interaction modules such as SH3 and WW domains (53-55). In summation, protein-protein interactions within the amino-terminal regions may control the formation of specific heparan sulfate sequences by constraining the enzyme's spatial organization or functional interactions. Consequently, the unique amino-terminal regions of 3-OST-3A and 3-OST-3B may engender distinctive biologic roles to the virtually identical sulfotransferase domains.

Structural Features of the Conserved Sulfotransferase Domain-- 3-OST family members are defined by the highly conserved sulfotransferase domain. The importance of this structure is highlighted by our finding that gene conversion maintains virtually identical sulfotransferase domains between 3OST3A1 and 3OST3B1 genes. Gene conversion occurs in the germ line as a transfer of genetic information from donor to acceptor loci without alteration of the donor material. This process can prevent mutational drift and proceeds quite efficiently between nonallelic loci on the same chromosome (reviewed in Ref. 56), which would be consistent with the proposed 3OST3 multigene cluster. It is especially striking that the limits of the converted DNA sequence correspond exactly to the limits of the sulfotransferase domain of 3-OST-3B.

We have previously employed simultaneous multiple sequence alignment to show that the sulfotransferase domain of 3-OST-1 shows homology to a broad range of sulfotransferases, including cytosolic and Golgi enzymes isolated from animals, plants, and bacteria (12). Critical features are revealed by extending this comparison to include virtually all known adenosine 3'-phosphate 5'-phosphosulfate (PAPS) requiring sulfotransferases found in GenBank™. Collectively, this group modifies a broad range of molecules, yet these enzymes show a 260-290-residue carboxyl-terminal region with at least 25-30% similarity to each 3-OST sulfotransferase domain. Such conservation reflects common structural and functional constraints imposed by the obligate cofactor PAPS (57). In particular, we have observed that the consensus sequence (L/I/V)3-4-X3-5-K-S-G-T-X1-2-(W/L) occurs near the amino terminus of the sulfotransferase domain of all enzymes (each consensus residue occurs in at least 50 of 66 tested sequences, minor conservative substitutions not presented). This consensus predominantly overlaps conserved region I (of cytosolic sulfotransferases) that appears to be a critical active site component, as indicated by affinity labeling with a PAPS analog and by mutational analysis (58-60). The central basic residue, typically lysine (92%), is considered essential for stabilization of a transition state intermediate, as the Lys → Ala mutant of flavonol 3-sulfotransferase dramatically reduces enzymatic activity with minimal effect on PAPS binding (59). These assertions are confirmed by x-ray crystallography of the estrogen sulfotransferase bound to adenosine 3'-phosphate 5'-phosphate (a PAPS analog), where the consensus region forms a β-strand/P-loop/α-helix motif. The P-loop corresponds to the underlined tetrapeptide and amide nitrogens from each residue may hydrogen bond with the 5'-phosphate. Moreover, Nζ from the central lysine neutralizes the negative charge of this phosphate (61). Thus, the above consensus ascribes a fundamental sulfotransferase structure that is critically required for both the binding of PAPS and the catalysis of sulfate transfer. This consensus region is almost invariant among the human 3-OST enzymes (Fig. 9A) and secondary structure analysis predicts a strand-loop-helix motif for each enzyme. Moreover, the conserved lysine occurs in all heparan sulfate sulfotransferases and likely serves an equivalent catalytic role. Indeed, alanine mutagenesis of the conserved lysyl has recently been shown to dramatically reduce sulfotransferase activity of 3-OST-110 and NST-1 (62).
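The 5'-phosphate binding consensus described above can be expressed as a pattern for scanning protein sequences. The Python sketch below is illustrative only and rests on an assumed reading of the notation: (L/I/V)3-4 as three or four aliphatic residues, X3-5 and X1-2 as runs of any residue, and a strict K-S-G-T core. The pattern name, function name, and test sequence are hypothetical.

```python
import re

# Assumed reading of (L/I/V)3-4-X3-5-K-S-G-T-X1-2-(W/L):
# 3-4 of L/I/V, 3-5 of any residue, K-S-G-T, 1-2 of any residue, then W or L.
P_LOOP_CONSENSUS = re.compile(r"[LIV]{3,4}[A-Z]{3,5}KSGT[A-Z]{1,2}[WL]")


def find_consensus(seq):
    """Return (1-based start, matched text) for each non-overlapping match."""
    return [(m.start() + 1, m.group())
            for m in P_LOOP_CONSENSUS.finditer(seq.upper())]


if __name__ == "__main__":
    toy = "MAAQLIVGAPRKSGTTAWDE"  # hypothetical sequence, not a real enzyme
    print(find_consensus(toy))
```

Because each consensus residue occurs in only a subset of the tested sequences and conservative substitutions are not represented, a strict pattern of this kind would miss some genuine sulfotransferases; a position-weighted profile would be the more tolerant choice.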

A second, less well conserved, consensus K-(aliphatic)5-R-N-X2-(D/E)-X3-S-X-Y forms a sheet-turn-helix structure in the estrogen sulfotransferase and side groups from underlined residues interact with oxygens of the 3'-phosphate (61). This region is predicted to form a sheet-loop-helix structure in the 3-OSTs which would also be consistent with phosphate binding. Recently, Kakuta et al. (57) have similarly noted the importance of the above two regions; however, our analysis additionally reveals a previously unidentified structure, G-X(W/Y)-X2-H-X3-(W/L)2. We have determined that this third sequence maps to a loop-helix structure at the active site and the underlined residues are in a vicinity to approximate the 5'-sulfate of PAPS. These interactions could facilitate sulfate binding or enzymatic transfer. Of course, such contacts could not have been crystallographically observed because estrogen sulfotransferase was co-crystallized with the sulfate-free analog adenosine 3'-phosphate 5'-phosphate (61). This potential sulfate interaction region is predicted to also form a loop-helix structure in the 3-OSTs.

Comparison of just the heparan sulfate sulfotransferases allows the designation of three distinct types of sulfotransferase domains (HS1, HS2, and HS3; Fig. 9B). Although four sulfotransferase families are clearly delineated, the N- and 3-O-groupings both possess a related HS1 structure (~50% similarity between families) (Fig. 9B). Presumably, unique features of individual sulfotransferase domains enable discrimination of distinct precursor structures and thereby provide a mechanism for generating and regulating heparan sulfate sequence diversity. In this regard, the HS1 form is distinguished by a carboxyl-terminal region of ~30 residues that contains the presumptive cystine-bridged peptide loop (Fig. 9A, consensus residues 211-240). Within this highly conserved region, cysteines are invariant but the intervening 8-11 amino acids are poorly conserved. Indeed, the peptide loop is structurally distinct for each 3-OST isoform. Thus, this variable loop might serve to discriminate between different heparan sulfate structures and thereby account for the distinct sequences generated by individual 3-OST isoforms (34).

In conclusion, the multiple functions of heparan sulfate proteoglycans necessitate a biosynthetic mechanism that tightly regulates the generation of a myriad of distinct heparan sulfate fine structures. The paradigm of 3-OST-1 shows that a biologic activity of heparan sulfate can be individually regulated by controlling the level of a sulfotransferase that contributes a rare modification to complete the formation of a critical heparan sulfate sequence. The utility of this mechanism may account for the large number of 3OST genes with distinct tissue and cell type-specific expression patterns. 3-OST isoforms with different sulfotransferase domains differentially place the rare 3-O-sulfate in different sequence contexts to presumably regulate discrete biologic activities. This capacity of the sulfotransferase domain to generate distinct sequences may in turn be modulated by distinct amino-terminal domains. The elucidation of the critical nonconserved and conserved residues which determine the sequence specificity for sulfation and enzyme interactive properties is fundamental groundwork toward understanding the regulated production of defined monosaccharide sequences.

    ACKNOWLEDGEMENTS

We thank Linda M. S. Fritze and Debra J. Gilbert for excellent technical assistance. We are grateful for the technical expertise of Dr. Richard D. Cook and members of the HHMI/MIT Biopolymers Lab as well as Pushba Srivastava of the Molecular Medicine Unit (Beth Israel Deaconess Medical Center) for assistance in automated DNA sequencing. We thank members of the Rosenberg laboratory for insightful comments.

    FOOTNOTES

* This work is supported in part by National Institutes of Health Grant 5-PO1-HL-41484, and the National Cancer Institute, Department of Health and Human Services, under contract with Advanced BioScience Laboratories, Inc. The costs of publication of this article were defrayed in part by the payment of page charges. The article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

The nucleotide sequence(s) reported in this paper has been submitted to the GenBank™/EMBL Data Bank with accession number(s) AF105374, AF105375, AF105376, AF105377, and AF105378.

To whom correspondence and reprint requests should be addressed. Present address: Angiogenesis Research Center, Beth Israel Deaconess Medical Center, SL-418, 330 Brookline Ave., Boston, MA 02215. Fax: 617-975-5201; E-mail: nshworak{at}caregroup.harvard.edu.

‖ Recipient of an American Heart Association, Massachusetts Affiliate, Postdoctoral Fellowship.

** Recipient of a National Institutes of Health Postdoctoral Fellowship.

    ABBREVIATIONS

The abbreviations used are: 3-OST, heparan sulfate D-glucosaminyl 3-O-sulfotransferase; NST, heparan sulfate N-deacetylase/N-sulfotransferase; 3Ost, mouse 3-OST genes; 3OST, human 3-OST genes; B, C57BL/6J; S, M. spretus; SPLAG, Ser, Pro, Leu, Ala, and Gly enriched; 2-OST, heparan sulfate uronosyl 2-O-sulfotransferase; 6-OST, heparan sulfate D-glucosaminyl 6-O-sulfotransferase; PAPS, adenosine 3'-phosphate 5'-phosphosulfate; bp, base pair(s); kb, kilobase pair(s).

2 Where -GlcN is (α1→4)D-glucosamine; -GlcA is (β1→4)D-glucuronic acid; -IdoA is (α1→4)L-iduronic acid; NS, 2S, 3S, and 6S, respectively, define sulfate substituents in amino, 2-O, 3-O, or 6-O positions; and Ac is an acetyl group.

3 J. Inazawa, M. Isobe, N. G. Copeland, T. Kaisho, T. Mori, M. Itoh, K. Ishihara, D. J. Gilbert, N. A. Jenkins, and T. Hirano, submitted for publication.

4 The CTF designation denotes alternative splice variants that encode carboxyl-terminal fragments of each respective enzyme. These unusual forms are predicted to be localized to the cytosol and to lack sulfotransferase activity. These splice variants occur as minor transcripts of 1.8 (3-OST-2CTF), 1.6 (3-OST-3ACTF), and 2.4 (3-OST-3BCTF) kb. They shall be described fully in a separate article (N. W. Shworak, J. Liu, and R. D. Rosenberg, manuscript in preparation).

5 Within all five 3-OST-3B clones the region from α to β contains 6 silent point mutations in wobble positions (C729→G, A762→C, C798→T, C843→T, C852→T, and C876→T), whereas both of the two 3-OST-3A clones lack these mutations (Fig. 1). If these differences reflect the sequences of two distinct alleles of the same gene, then the probability that both 3-OST-3A clones would be of either single allele is 1/2 and the probability that all 3-OST-3B clones would be of the opposite allele is (1/2)^5. The observed exclusive distribution would therefore occur at random with a frequency of 1/2 × (1/2)^5 = 1/64 ≈ 0.016. Thus, it is extremely unlikely that this exclusive distribution could have resulted from allelic variation of a single gene.
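
The arithmetic in footnote 5 can be checked with a short calculation. The sketch below assumes, as the footnote does, that under the single-gene hypothesis each clone derives independently and with equal probability from either of two alleles. The function name `exclusive_distribution_probability` and its parameters are hypothetical and are introduced only for this check.

```python
from fractions import Fraction


def exclusive_distribution_probability(n_a_clones, n_b_clones):
    """Chance that all A clones share one allele and all B clones the other.

    Assumes two alleles of a single gene, with every clone drawn
    independently and with probability 1/2 from either allele.
    """
    half = Fraction(1, 2)
    # Either allele may be the one shared by every A clone: 2 * (1/2)^n_a.
    p_a_same = 2 * half ** n_a_clones
    # Every B clone must then carry the opposite allele: (1/2)^n_b.
    p_b_opposite = half ** n_b_clones
    return p_a_same * p_b_opposite


# Two 3-OST-3A clones and five 3-OST-3B clones, as in footnote 5.
p = exclusive_distribution_probability(2, 5)
print(p, float(p))
```

Exact fractions are used so that the result can be compared directly with the 1/64 stated in the footnote before rounding.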

6 On BamHI digests, 3'B detects two equal intensity bands but only one co-hybridizes to ST-3. In addition, ST-3 reveals an unaccounted band that is not detected with 3'B or 3'A. Given that 3OST3B1 lacks a 3' BamHI site, the data indicate that 3OST3B2 contains a BamHI site between ST-3 and 3'B (N. W. Shworak, unpublished data).

7 We have found mouse expressed sequence tag clones derived from each gene (GenBank™ accession numbers W14854, W49404, and W71608 from 3Ost3a, AA254888 and AA288201 from 3Ost3b), and have detected the corresponding genes with SPLAG-A and SPLAG-B probes by heterologous hybridization to genomic DNA, as described under "Interspecific Mouse Back-cross Mapping."

8 Exclusive expression of 3-OST-3A occurs in HeLa S3 (cervical carcinoma) and G361 (melanoma) cells; exclusive expression of 3-OST-3B in HL-60 (promyelocytic leukemia), MOLT-4 (lymphoblastic leukemia), and Raji (Burkitt's lymphoma) cells; whereas, both transcript types are found in K-562 (chronic myelogenous leukemia), SW480 (colorectal adenocarcinoma), and A549 (lung carcinoma) cells (N. W. Shworak, unpublished data).

9 C. elegans 3-OST was identified from data banks as described in the legend to Fig. 9. The amino-terminal portion (residues 1-22; MKYRLLLILHLIDLISC↓GVIPN) shows striking similarities to 3-OST-1. In particular, the short hydrophobic stretch with internal charged residues (single underline) and a potential signal peptidase cleavage site (↓) (63) suggest C. elegans 3-OST is an intraluminal resident just like 3-OST-1 (12). Furthermore, residues immediately preceding the sulfotransferase domain are nearly identical between C. elegans 3-OST (double underline) and human 3-OST-1 (residues 44-48, GVAPN).

10 J. Liu, unpublished data.

    REFERENCES

  1. Bernfield, M., Kokenyesi, R., Kato, M., Hinkes, M. T., Spring, J., Gallo, R. L., and Lose, E. J. (1992) Annu. Rev. Cell Biol. 8, 365-393
  2. Carey, D. J. (1997) Biochem. J. 327, 1-16
  3. Rosenberg, R. D., Shworak, N. W., Liu, J., Schwartz, J. J., and Zhang, L. (1997) J. Clin. Invest. 99, 2062-2070
  4. Sanderson, R. D., Turnbull, J. E., Gallagher, J. T., and Lander, A. D. (1994) J. Biol. Chem. 269, 13100-13106
  5. Shworak, N. W., Shirakawa, M., Colliec-Jouault, S., Liu, J., Mulligan, R. C., Birinyi, L. K., and Rosenberg, R. D. (1994) J. Biol. Chem. 269, 24941-24952
  6. Shworak, N. W., and Rosenberg, R. D. (1995) in The Endothelial Cell in Health and Disease (Vane, J. R., Born, G. V. R., and Welzel, D., eds), pp. 119-146, Schattauer, Stuttgart
  7. Colliec-Jouault, S., Shworak, N. W., Liu, J., de Agostini, A. I., and Rosenberg, R. D. (1994) J. Biol. Chem. 269, 24953-24958
  8. Lindahl, U., Lidholt, K., Spillmann, D., and Kjellén, L. (1994) Thromb. Res. 75, 1-32
  9. Atha, D. H., Lormeau, J. C., Petitou, M., Rosenberg, R. D., and Choay, J. (1987) Biochemistry 26, 6454-6461
  10. Shworak, N. W., Fritze, L. M. S., Liu, J., Butler, L. D., and Rosenberg, R. D. (1996) J. Biol. Chem. 271, 27063-27071
  11. Liu, J., Shworak, N. W., Fritze, L. M. S., Edelberg, J. M., and Rosenberg, R. D. (1996) J. Biol. Chem. 271, 27072-27082
  12. Shworak, N. W., Liu, J., Fritze, L. M. S., Schwartz, J. J., Zhang, L., Logeart, D., and Rosenberg, R. D. (1997) J. Biol. Chem. 272, 28008-28019
  13. Lennon, G., Auffray, C., Polymeropoulos, M., and Soares, M. B. (1996) Genomics 33, 151-152
  14. Houlgatte, R., Mariage-Samson, R., Duprat, S., Tessier, A., Bentolila, S., Lamy, B., and Auffray, C. (1995) Genome Res. 5, 272-304
  15. Auffray, C., Behar, G., Bois, F., Bouchier, C., da Silva, C., Devignes, M. D., Duprat, S., Houlgatte, R., Jumeau, M. N., Lamy, B., Lorenzo, F., Mitchell, H., Mariage-Samson, R., Pietu, G., Pouliot, Y., Sebastiani-Kabaktchis, C., and Tessier, A. (1995) C. R. Acad. Sci. Paris Ser. III 318, 263-272
  16. Adams, M. D., Kerlavage, A. R., Fleischmann, R. D., Fuldner, R. A., Bult, C. J., Lee, N., Kirkness, E. F., Weinstock, K. G., Gocayne, J. D., White, O., Sutton, G., Blake, J. A., Brandon, R. C., Chiu, M.-W., Clayton, R. A., Cline, R. T., Cotton, M. D., Earle-Hughes, J., Fine, L. D., FitzGerald, L. M., FitzHugh, W. M., Fritchman, J. L., Geoghagen, N. S. M., Glodek, A., Gnehm, C. L., Hanna, M. C., Hedblom, E., Hinkle, P. S., Jr., Kelley, J. M., Klimek, K. M., Kelley, J. C., Liu, L.-I., Marmaros, S. M., Merrick, J. M., Moreno-Palanques, R. F., McDonald, L. A., Nguyen, D. T., Pellegrino, S. M., Phillips, C. A., Ryder, S. E., Scott, J. L., Saudek, D. M., Shirley, R., Small, K. V., Spriggs, T. A., Utterback, T. R., Weidman, J. F., Li, Y., Bednarik, D. P., Cao, L., Cepeda, M. A., Coleman, T. A., Collins, E.-J., Dimke, D., Feng, P., Ferrie, A., Fischer, C., Hastings, G. A., He, W.-W., Hu, J.-S., Greene, J. M., Gruber, J., Hudson, P., Kim, A., Kozak, D. L., Kunsch, C., Ji, H., Li, H., Meissner, P. S., Olsen, H., Raymond, L., Fannon, M. R., Rosen, C. A., Haseltine, W. A., Fields, C., Fraser, C. M., and Venter, J. C. (1995) Nature 377, 3-174
  17. Khan, A. S., Wilcox, A. S., Polymeropoulos, M. H., Hopkins, J. A., Stevens, T. J., Robinson, M., Orpana, A. K., and Sikela, J. M. (1992) Nat. Genet. 2, 180-185
  18. Berry, R., Stevens, T. J., Walter, N. A., Wilcox, A. S., Rubano, T., Hopkins, J. A., Weber, J., Goold, R., Soares, M. B., and Sikela, J. M. (1995) Nat. Genet. 10, 415-423
  19. Altschul, S. F., Madden, T. L., Schäffer, A. A., Zhang, J., Zhang, Z., Miller, W., and Lipman, D. J. (1997) Nucleic Acids Res. 25, 3389-3402
  20. Rost, B., and Sander, C. (1994) Proteins 19, 55-72
  21. Rost, B., Casadio, R., Fariselli, P., and Sander, C. (1995) Protein Sci. 4, 521-533
  22. Hansen, J. E., Lund, O., Rapacki, K., and Brunak, S. (1997) Nucleic Acids Res. 25, 278-282
  23. Shworak, N. W., Shirakawa, M., Mulligan, R. C., and Rosenberg, R. D. (1994) J. Biol. Chem. 269, 21204-21214
  24. de Agostini, A. L., Lau, H. K., Leone, C., Youssoufian, H., and Rosenberg, R. D. (1990) Proc. Natl. Acad. Sci. U. S. A. 87, 9784-9788
  25. Wahl, G. M., Berger, S. L., and Kimmel, A. R. (1987) in Methods in Enzymology (Berger, S. L., and Kimmel, A. R., eds), Vol. 152, pp. 399-407, Academic Press Inc., San Diego
  26. Casey, J., and Davidson, N. (1977) Nucleic Acids Res. 4, 1539-1552
  27. Copeland, N. G., and Jenkins, N. A. (1991) Trends Genet. 7, 113-118
  28. Jenkins, N. A., Copeland, N. G., Taylor, B. A., and Lee, B. K. (1982) J. Virol. 43, 26-36
  29. Avraham, K. B., Givol, D., Avivi, A., Yayon, A., Copeland, N. G., and Jenkins, N. A. (1994) Genomics 21, 656-658
  30. Powers, P. A., and Ganetzky, B. (1991) Genetics 129, 133-144
  31. Pathak, B. G., Shaughnessy, J. D., Jr., Menton, P., Greeb, J., Shull, G. E., Jenkins, N. A., and Copeland, N. G. (1996) Genomics 33, 124-127
  32. Buchberg, A. M., Brownell, E., Nagata, S., Jenkins, N. A., and Copeland, N. G. (1989) Genetics 122, 153-161
  33. McKenzie, A. N. J., Li, X., Largaespada, D. A., Sato, A., Kaneda, A., Zurawski, S. M., Doyle, E. L., Milatovich, A., Francke, U., Copeland, N. G., Jenkins, N. A., and Zurawski, G. (1993) J. Immunol. 150, 5436-5444
  34. Liu, J., Shworak, N. W., Sinay, P., Schwartz, J. J., Zhang, L., Fritze, L. M. S., and Rosenberg, R. D. (1999) J. Biol. Chem. 274, 5185-5192
  35. Kozak, M. (1989) J. Cell Biol. 108, 229-241
  36. Michelson, A. M., and Orkin, S. H. (1983) J. Biol. Chem. 258, 15245-15254
  37. Wickner, W. T., and Lodish, H. F. (1985) Science 230, 400-407
  38. Cardin, A. C., and Weintraub, H. R. J. (1989) Arteriosclerosis 9, 21-32
  39. Dixon, J., Loftus, S. K., Gladwin, A. J., Scambler, P. J., Wasmuth, J. J., and Dixon, M. J. (1995) Genomics 26, 239-244
  40. Kobayashi, M., Habuchi, H., Yoneda, M., Habuchi, O., and Kimata, K. (1997) J. Biol. Chem. 272, 13980-13985
  41. Li, J., Hagner-McWhirter, Å., Kjellén, L., Palgi, J., Jalkanen, M., and Lindahl, U. (1997) J. Biol. Chem. 272, 28158-28163
  42. Chopra, R., Kendall, G., Gale, R. E., Thomas, N. S., and Linch, D. C. (1996) Exp. Hematol. 24, 755-762
  43. Chapman, R. E., and Walter, P. (1997) Curr. Biol. 7, 850-859
  44. Orellana, A., Hirschberg, C. B., Wei, Z., Swiedler, S. J., and Ishihara, M. (1994) J. Biol. Chem. 269, 2270-2276
  45. Hashimoto, Y., Orellana, A., Gil, G., and Hirschberg, C. B. (1992) J. Biol. Chem. 267, 15744-15750
  46. Eriksson, I., Sandbäck, D., Ek, B., Lindahl, U., and Kjellén, L. (1994) J. Biol. Chem. 269, 10438-10443
  47. Fukuta, M., Uchimura, K., Nakashima, K., Kato, M., Kimata, K., Shinomura, T., and Habuchi, O. (1995) J. Biol. Chem. 270, 18575-18580
  48. Fukuta, M., Inazawa, J., Torii, T., Tsuzuki, K., Shimada, E., and Habuchi, O. (1997) J. Biol. Chem. 272, 32321-32328
  49. Habuchi, H., Kobayashi, M., and Kimata, K. (1998) J. Biol. Chem. 273, 9208-9213
  50. Machamer, C. E. (1993) Curr. Opin. Cell Biol. 5, 606-612
  51. Humphries, D. E., Sullivan, B. R., Aleixo, M. D., and Snow, J. L. (1997) Biochem. J. 325, 351-357
  52. Kojima, T., Shworak, N. W., and Rosenberg, R. D. (1992) J. Biol. Chem. 267, 4870-4877
  53. Feng, S., Chen, J. K., Yu, H., Simon, J. A., and Schreiber, S. L. (1994) Science 266, 1241-1247
  54. Chen, H. I., Einbond, A., Kwak, S.-J., Linn, H., Koepf, E., Peterson, S., Kelly, J. W., and Sudol, M. (1997) J. Biol. Chem. 272, 17070-17077
  55. Rau, C., Zheng, N., Hazelwood, C., and Rau, C. (1995) Physiol. Chem. Phys. Med. NMR 27, 55-61
  56. Schimenti, J. C. (1994) Soc. Gen. Physiol. Ser. 49, 85-91
  57. Kakuta, Y., Pedersen, L. G., Pedersen, L. C., and Negishi, M. (1998) Trends Biochem. Sci. 23, 129-130
  58. Zheng, Y., Bergold, A., and Duffel, M. W. (1994) J. Biol. Chem. 269, 30313-30319
  59. Marsolais, F., and Varin, L. (1995) J. Biol. Chem. 270, 30458-30463
  60. Weinshilboum, R. M., Otterness, D. M., Aksoy, I. A., Wood, T. C., Her, C., and Raftogianis, R. B. (1997) FASEB J. 11, 3-14
  61. Kakuta, Y., Pedersen, L. G., Carter, C. W., Negishi, M., and Pedersen, L. C. (1997) Nat. Struct. Biol. 4, 904-908
  62. Sueyoshi, T., Kakuta, Y., Pedersen, L. C., Wall, F. E., Pedersen, L. G., and Negishi, M. (1998) FEBS Lett. 433, 211-214
  63. von Heijne, G. (1986) Nucleic Acids Res. 14, 4683-4690
  64. Wilson, R., Ainscough, R., Anderson, K., Baynes, C., Berks, M., Bonfield, J., Burton, J., Connell, M., Copsey, T., Cooper, J., Coulson, A., Craxton, M., Dear, S., Du, Z., Durbin, R., Favello, A., Fulton, L., Gardner, A., Green, P., Hawkins, T., Hillier, L., Jier, M., Johnston, L., Jones, M., Kershaw, J., Kirsten, J., Laister, N., Latreille, P., Lightning, J., Lloyd, C., McMurray, A., Mortimore, B., O'Callaghan, M., Parsons, J., Percy, C., Rifken, L., Roopra, A., Saunders, D., Shownkeen, R., Smaldon, N., Smith, A., Sonnhammer, E., Staden, R., Sulston, J., Thierry-Mieg, J., Thomas, K., Vaudin, M., Vaughan, K., Waterston, R., Watson, A., Weinstock, L., Wilkinson-Sproat, J., and Wohldman, P. (1994) Nature 368, 32-38


Copyright © 1999 by The American Society for Biochemistry and Molecular Biology, Inc.