Genomic Organization of the 3' Region of the Human Mucin Gene MUC5B*

(Received for publication, January 28, 1997, and in revised form, April 28, 1997)

Jean-Luc Desseyn‡, Jean-Pierre Aubert‡§, Isabelle Van Seuningen‡, Nicole Porchet‡§, and Anne Laine‡

From the ‡Unité 377 INSERM, Place de Verdun, 59045 Lille Cedex, and §Laboratoire de Biochimie et de Biologie Moléculaire de l'Hôpital C. Huriez, Centre Hospitalier Régional et Universitaire, 59037 Lille Cedex, France


ABSTRACT

MUC5B, which maps to chromosome 11p15.5 in a cluster with MUC6, MUC2, and MUC5AC, is a human mucin gene whose genomic organization is being elucidated. We have recently published the sequence and the peptide organization of its huge central exon, 10,713 base pairs (bp) in length. We present here the genomic organization of its 3' region, which encompasses 10,690 bp. The genomic sequence has been completely determined. The 3' region of MUC5B is composed of 18 exons ranging in size from 32 to 781 bp, in contrast to the very large central exon. The sizes of the 18 introns range from 114 to 1118 bp. Repetitive sequences were identified in four introns. The 18 exons encode an 808-amino acid peptide. This carboxyl-terminal region exhibits extensive sequence similarity to MUC2, MUC5AC, and von Willebrand factor, particularly in the number and the positions of the cysteine residues, suggesting that this domain may be derived from a common ancestral gene. The presence in these components of a cystine knot, also found in growth factors such as transforming growth factor-β, is of particular interest. Moreover, one part of this peptide is identical to the 196-amino acid sequence deduced from the cDNA clone pSM2-1, which codes for a part of the high molecular weight mucin MG1 isolated from human sublingual gland. Considering the expression pattern of MUC5B and the origin of MG1, we can thus conclude that MUC5B encodes MG1.


INTRODUCTION

Mucus is the layer that covers, protects, and lubricates the luminal surfaces of epithelial respiratory, gastrointestinal, and reproductive tracts. These basic properties are due to the viscous and viscoelastic properties of mucins, the major glycoprotein components of mucus. Mucins constitute a family of high molecular mass glycoproteins synthesized by the goblet cells of the epithelia and in some cases by submucosal glands (for more complete reviews, see Refs. 1-3).

Alterations of the biosynthesis of mucins affecting the protein core and/or the carbohydrate content linked to the peptide have been observed in numerous pathological situations such as various adenomas and carcinomas, inflammatory diseases such as cystic fibrosis, asthma, chronic bronchitis, or inflammatory bowel diseases (4-7). Moreover, the hypersecretion of mucins and the presence of alternating hydrophobic and hydrophilic domains in mucins have been shown to play a central role in the pathogenesis of cholesterol gallstones (8, 9).

All apomucins contain tandemly repeated sequences rich in threonine and/or serine. Due to the high carbohydrate content, the peptide moiety of mucins has been difficult to characterize. cDNA cloning has enabled researchers to approach the study of the mucins over the past decade. Today, the membrane-associated mucin MUC1 and the secreted MUC7 are the only mucins for which the full-length cDNA and the genomic organization have been reported (10-13). Both were revealed to be, in fact, small mucins. A complete cDNA of the large secreted mucin MUC2 (14-17) has been described. Partial cDNAs have been identified for the other human mucin genes that code for secreted mucins: MUC3 (18), MUC4 (19), MUC5AC (20-24), MUC5B (25), and MUC6 (26).

Four mucin genes are mapped to 11p15.5: MUC5AC, MUC5B, MUC2, and MUC6 (26-28). Recently, we have determined that the order of the four clustered 11p15.5 human mucin genes is tel-MUC6/MUC2/MUC5AC/MUC5B-cen (29). We have also established that MUC2, MUC5AC, and MUC5B have a consensus cysteine-rich domain found twice in MUC2 (16), at least four times in MUC5AC (21, 22, 24), and seven times in MUC5B (30).

MUC5B is expressed mainly in bronchus glands and also in submaxillary glands, endocervix, gall bladder, and pancreas (31-35). The structural organization of the peptide deduced from the nucleotide sequence of the central region of MUC5B has been published recently (30). The single large exon of 10,713 bp,¹ containing the entire tandem repeat domain, is, to our knowledge, the largest described for a vertebrate gene. It codes for a 3570-amino acid peptide. Nineteen subdomains have been individualized. Most of the MUC5B subdomains show similarity to each other, creating four larger composite super-repeat units of 528 amino acids. Each super-repeat is made up of repeats of an irregular 29-amino acid motif, one cysteine-rich subdomain (10 cysteine residues, 108 aa), and one unique sequence of 111 amino acid residues also rich in serine and threonine. The complete organization of the region downstream of the central region of the human MUC5B gene, i.e. its complete 3' region, is reported in this paper; we present here the complete genomic nucleotide sequence, the exon-intron organization, and the full cDNA sequence coding for the carboxyl-terminal domain of the human MUC5B apomucin. This domain spans 808 amino acid residues and can be divided into six subdomains. The last five cysteine-rich subdomains exhibit extensive sequence similarity to MUC2, MUC5AC, and vWF (17, 22, 23, 36), particularly in the number and the positions of the cysteine residues, suggesting that this domain may be derived from a common ancestral gene. Moreover, with the exception of one substitution, which does not change the coded amino acid, one part of the cDNA sequence we determined is identical to the nucleotide sequence of pSM2-1. This cDNA codes for 196 amino acids in the carboxyl-terminal region of the high molecular weight mucin MG1 isolated from human sublingual gland (37). Considering the expression pattern of MUC5B (31) and the origin of MG1, we can thus conclude that MUC5B encodes MG1.


EXPERIMENTAL PROCEDURES

Screening of cDNA and Genomic Libraries

A lambda gt11 cDNA library constructed from human tracheal mucosa was screened with rabbit antibodies raised to deglycosylated Pronase glycopeptides from bronchial mucins (38). Among the various positive clones obtained, the one designated TH71 and containing a poly(A) tail was of particular interest in the present study.

A human genomic lambda EMBL4 phage library was screened using hybridization with the JER57 probe (25). One positive clone, CEL5, was isolated and studied.

A human placental genomic DNA library in pWE15 cosmid provided by Stratagene was screened using the JER57 probe. Two positive clones, BEN1 and BEN2, were obtained (30). BEN2 was the useful clone in the present study.

Oligonucleotide Primers

Oligonucleotide primers used in PCR, RACE-PCR, RT-PCR, and sequencing experiments were synthesized by Eurogentec (Liège, Belgium). Their sequences and locations are indicated in Table I.

Table I. Primers used for cDNA synthesis and DNA and cDNA sequencing


Primer designation  Primer sequence (5' to 3')  Position  Orientation(a)

NAU61 ACTCAATGCTCAGGGTTTATTTGC 10582-10605 AS
NAU67 GGGTTTATTTGCAAAACTG 10575-10593 AS
NAU71 AGTGCTGATTGCACACTGCGT 838-859 AS
NAU102 CCTGTCGCAGCTTCCTGGCAG 10446-10466 AS
NAU106 CAGTGAGCATAGGGGAAGCCT 3387-3407 S
NAU127 AGGCTTCCCCTATGCTCACTG 3387-3407 AS
NAU128 CGTGTCCACTGTGTCCTCCTCAGTC 1-25 S
NAU140 GATGGCGGAGGGCTGCTTCTG 5139-5159 S
NAU141 CAGACCGTGTGCACGCAGCAC 1001-1021 AS
NAU142 CCAGGGTAGGACTCCTGAGTG 10246-10266 AS
NAU151 TGAGCAGCGGTTTCAGCAAGA 3168-3188 S
NAU152 CAAGGTTGTGGCACTCAGCAA 3837-3857 AS
NAU196 CGAGGGTTCAGTGTCGGTG 6013-6031 S
NAU200 CAGTGTCCTTACCGGGAGA 2221-2239 AS
NAU203 ATTTAGGAAACCCATCGGGT 5689-5708 AS
NAU207 CGCGGGGTGCCACACACAGGCC 10142-10163 AS
NAU208 GGGTGTAGGTGTGCAGGATGG 9927-9947 AS
NAU219 GCAGGGAAGGGCGCCTGGGAA 7394-7414 AS
NAU226 AGCGGAAGGTGGGACAGCAGT 6620-6640 AS
NAU227 ACTGCTGTCCCACCTTCCGCT 6620-6640 S
NAU232 CTTCCCAGGCGCCCTTCCCTGC 7393-7414 S
NAU233 CTGCGAGACCGAGGTCAACATC 9113-9134 S
NAU234 GATGTTGACCTCGGTCTCGCAG 9113-9134 AS
NAU249 CTCCTCACAGGAGTAGCAGC 8814-8832 AS
NAU277 CAGTGACTGGCGAGGTGCAACTG 3973-3995 S
NAU278 GTATGGGGCCGCATGCGTTGTACACT 4624-4649 AS
NAU280 TGGACAGATGCCCAGGGTTGA 5901-5921 S
NAU281 TGCCATTGTACGAACACAGCT 6776-6796 AS
NAU282 CTGCAGGCCCCATTGGGTCAT 7297-7317 S
NAU293 ATGAGCCGTGGATGGGGTCCC 1195-1215 S
NAU297 TCATGGTCCTGGGCGGCTCCT 5277-5297 AS

(a) Strand orientation: sense (S), antisense (AS).
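
Several primers in Table I are listed as sense/antisense pairs at the same coordinates, and the nested antisense primers NAU61 and NAU67 overlap. As a quick consistency check, the following minimal Python sketch tests those relationships directly from the listed sequences. The helper `revcomp` and the variable names are illustrative and are not part of the original methods.

```python
# Sequences copied from Table I; helper and variable names are illustrative.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

PRIMERS = {
    "NAU106": "CAGTGAGCATAGGGGAAGCCT",
    "NAU127": "AGGCTTCCCCTATGCTCACTG",
    "NAU226": "AGCGGAAGGTGGGACAGCAGT",
    "NAU227": "ACTGCTGTCCCACCTTCCGCT",
    "NAU233": "CTGCGAGACCGAGGTCAACATC",
    "NAU234": "GATGTTGACCTCGGTCTCGCAG",
    "NAU61": "ACTCAATGCTCAGGGTTTATTTGC",
    "NAU67": "GGGTTTATTTGCAAAACTG",
}

# (sense, antisense) primers that share identical coordinates in Table I.
PAIRS = [("NAU106", "NAU127"), ("NAU227", "NAU226"), ("NAU233", "NAU234")]
pairs_ok = {(s, a): revcomp(PRIMERS[s]) == PRIMERS[a] for s, a in PAIRS}

# NAU61 (10582-10605) and NAU67 (10575-10593) are both antisense and share
# positions 10582-10593: the last 12 nt of NAU61 are the first 12 nt of NAU67.
nested_ok = PRIMERS["NAU67"].startswith(PRIMERS["NAU61"][-12:])

# On the sense strand, NAU61 covers the polyadenylation signal AATAAA.
covers_polyA_signal = "AATAAA" in revcomp(PRIMERS["NAU61"])

print(pairs_ok, nested_ok, covers_polyA_signal)
```

All three checks hold for the sequences as printed, which is consistent with the positions and orientations given in the table.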

3'-End Amplification of cDNAs

The 5'-AmpliFINDER RACE kit (CLONTECH) was used to synthesize first-strand cDNA from human trachea poly(A)+ RNA (1 µg) obtained from CLONTECH using NAU61 as a primer (Table I), followed by ligation of the 5'-ANCHOR adapter. The PCR was then performed using the nested primer NAU67 (Table I and Fig. 1) and the 5'-ANCHOR primer. Nested PCRs involving a second or third round amplification were carried out with 1 µl of the reaction mixture obtained from each previous round of PCR as template.


Fig. 1. Restriction map of the 3' region of MUC5B and sequencing strategy. The fragments indicated were subcloned into pBluescript KS(+) vector. Some primers and their directions are indicated (not to scale) by horizontal arrows and their NAU numbers (their sequences are given in Table I).

RT-PCR Amplification

Total RNA of human gall bladder was extracted as described previously (39). Single-stranded cDNA was synthesized using the 1st STRAND Synthesis kit (CLONTECH), random hexamers, and human trachea poly(A)+ mRNA (0.5 µg) (CLONTECH) or total gall bladder RNA (1 µg). PCR amplification reaction mixtures (50 µl) contained 0.3 mM dNTPs, 2.5 units of Taq DNA polymerase (Boehringer Mannheim), 15 pmol of the appropriate primers, the buffer system purchased with Taq DNA polymerase, and an aliquot of cDNA. The PCR was performed using a Perkin-Elmer Thermal Cycler 480. PCR parameters were 94 °C for 2 min, followed by 30 cycles at 94 °C for 30 s, 60 °C for 1 min, and 72 °C for 2 min, followed by a final extension at 72 °C for 15 min. The amplified products were electrophoresed on a 1% Seaplaque gel (FMC, Rockland, ME) and stained with ethidium bromide. The band was cut out, purified using Preps DNA purification resin (Promega), and subcloned into the T/A cloning vector, pMOSBlue T-vector (Amersham). Thereafter, cDNA clones were subcloned into pBluescript KS(+) vector (Stratagene) using the restriction enzymes (Boehringer Mannheim) PstI, SacI, and/or SmaI. Subclones were sequenced as described below using either universal primers or a series of oligonucleotides specific for both strands of the inserts (Table I).
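
For readers who wish to reproduce the cycling program, the stated parameters can be converted into a total programmed time and a primer concentration. The sketch below is only an arithmetic illustration: it ignores ramping between temperatures and assumes that 15 pmol refers to each primer, which the text does not state explicitly. Variable names are ours.

```python
# Values taken from the RT-PCR protocol above; names are illustrative.
INITIAL_DENATURATION_S = 2 * 60      # 94 °C for 2 min
CYCLES = 30
CYCLE_STEPS_S = (30, 60, 120)        # 94 °C, 60 °C, 72 °C within each cycle
FINAL_EXTENSION_S = 15 * 60          # 72 °C for 15 min

# Programmed hold time only; ramp times between temperatures are not included.
total_s = INITIAL_DENATURATION_S + CYCLES * sum(CYCLE_STEPS_S) + FINAL_EXTENSION_S
total_min = total_s / 60

# Assumption: 15 pmol of each primer in a 50-µl reaction.
# pmol per µl is numerically equal to µmol per litre (µM).
PRIMER_PMOL = 15
REACTION_VOLUME_UL = 50
primer_uM = PRIMER_PMOL / REACTION_VOLUME_UL

print(f"programmed time: {total_min:.0f} min; primer: {primer_uM:.1f} µM")
```

Under these assumptions the program holds for about two hours and each primer is present at 0.3 µM.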

Isolation and Sequencing of MUC5B Genomic Clones and Sequence Analyses

Fragments of the genomic clones CEL5 and BEN2 corresponding to the region downstream of the central exon were subcloned into pBluescript KS(+) vector as described previously (30). The double-stranded plasmid inserts were sequenced manually by the dideoxynucleotide chain termination method (40) with [α-35S]dATP (Amersham) and Sequenase 2.0 (U. S. Biochemical Corp.) according to the manufacturer's protocol. Universal primers or a series of specific oligonucleotides were used. Sequencing reaction mixtures were electrophoresed on 6% polyacrylamide gel (Sequagel-6™, National Diagnostics). The clones were sequenced on both strands several times. Direct sequencing of cosmid DNA was performed as described previously (30). Computer analyses were performed using PC/GENE Software. The whole genomic sequence reported in this paper has been submitted to the EMBL Data Bank with accession number [GenBank]. The sequence of TH71 has been submitted to the EMBL Data Bank with accession number [GenBank].

Study of the Intron G

To determine the exact number of repeats in intron G, we first cut the subcloned genomic BglII-BglII fragment with SacI and RsaI, which flank the region containing the direct 59-bp repeats. Complete digestion with SmaI was obtained using 10 units/µg DNA for 3 h. Partial digestions were performed using 1 unit/µg DNA and 0.25 unit/µg DNA for 1 h. After electrophoretic separation on a 1.5% agarose gel, blot analysis was conducted using the antisense oligonucleotide NAU199 (5'-AGAGCCGAGGGGTCTGGG-3'), which had been previously radiolabeled using T4 polynucleotide kinase (Boehringer Mannheim) and [γ-32P]ATP (Amersham).


RESULTS AND DISCUSSION

Characterization of the Genomic Fragments of the 3' Region of MUC5B

The partial restriction maps of the genomic clones CEL5 and BEN2 were determined. Their overlapping parts present the same restriction map. The partial restriction map of the 3' region of MUC5B is shown in Fig. 1 together with the overlapping fragments, which were separated and subcloned into pBluescript KS(+) vector. The fragments BamHI-SacII and PstI-SacII (in the left part of Fig. 1) contain the 3' end of the central exon. All these clones were entirely sequenced after restriction digestion and subcloning. Primer walking using specific oligonucleotides (Table I) was also performed.

3' Region of MUC5B

Several cDNA-positive clones were obtained by screening the lambda gt11 cDNA library using antibodies as described previously (38). The clone designated TH71 is 380 bp in length. Its sequence (Fig. 2), submitted to the EMBL data bank with accession number [GenBank], revealed a poly(A) tail with 73 A, 16 bp downstream from a polyadenylation signal (AAUAAA). By sequencing the PstI-PstI subclone (noted with an asterisk in Fig. 1) obtained from the fragment NotI-BglII of the BEN2 clone, an identical 67-bp sequence was observed (Fig. 2), up to the A where the poly(A) addition occurs, indicating that the clone BEN2 contains the 3' end of the MUC5B gene. Using the two synthesized oligonucleotides NAU61 and NAU67 chosen in this sequence, a 5'-RACE-PCR experiment was performed. After cloning of the fragment obtained, the insert of 88 bp designated RACE67 was sequenced. This sequence is identical to the 88-bp sequence determined in the PstI-PstI clone (Fig. 2). In contrast, the first 34 nucleotides differ from the sequence of TH71. The TH71 clone, which has been found using the antibodies directed against the repeat part of the MUC5B apomucin (38), begins with a 132-bp sequence we found in the central exon. Between this sequence and the 3' end identical to the RACE67, TH71 seems to have been rearranged; moreover, the following results show that an important part of the cDNA has been lost. We will discuss these data below.


Fig. 2. Alignments of the sequences, genomic or cDNA, containing the polyadenylation signal of MUC5B. The cDNA clone called TH71 was aligned with the sequence of the genomic PstI-PstI fragment (with an asterisk in the right part of Fig. 1), and the clone obtained by performing RACE-PCR called RACE67.

The NotI-BglII fragment from BEN2 contains two other clustered canonical polyadenylation signals, AATAAA. The first was located about 2 kilobase pairs downstream from the first polyadenylation signal and the second 298 base pairs downstream from this latter AATAAA. The significance of these two additional polyadenylation signals is not known. It will be interesting to determine if several forms of MUC5B mRNA can be transcribed by selection of alternative polyadenylation signals.

The dinucleotides TG and GT were found with oligo(T) stretches in the region downstream from the first AATAAA motif within the PstI-PstI subclone. This region, referred to as "GT cluster," is important for 3' processing of polyadenylated mRNAs (41). Moreover, the pentanucleotide CATTG was found between the AATAAA sequence and the poly(A) site addition (Fig. 3). This CAYTG recognition element has been described to be related to cleavage site selection by Berget (42). The author suggested that pre-polyadenylated RNA hybridized with the AAUAAA recognition element as related to primary site selection, and with CAYUG recognition element within the U4 small nuclear ribonucleoproteins as related to cleavage site selection. Hence, MUC5B combines some common features of the 3' mRNA processing. From this nucleotide sequence, the new oligonucleotide NAU102 was synthesized to perform RT-PCR.



Fig. 3. Sequence of the 3' region of the MUC5B gene. The sequence shows the entire region beginning at the sequence of the oligonucleotide NAU128 used for the RT-PCR. The sequences of exons are indicated in uppercase letters, and the sequences of introns are indicated in lowercase letters. Encoded amino acids are shown in single-letter code and numbered 1-845 in the right margin. Bold letters on the right indicate the names of the introns. GC boxes are shaded. The polyadenylation signal is bold boxed. GT stretches and the pentanucleotide CATTG that may be involved in processing or polyadenylation are underlined. The position of the signal of poly(A) addition is marked by an arrow. The splice donor and acceptor sites are in bold.

RT-PCR

Two specific overlapping cDNAs were synthesized by RT-PCR experiments. The locations of the oligonucleotides used in these experiments are indicated in Fig. 1. The oligonucleotide primer NAU151 was designed on the basis of the sequence determined for the BglII-BglII cosmid fragment (Fig. 1). This fragment hybridized with human tracheal RNA on Northern blot and probably contains coding sequences. An amplification product was obtained when the RT-PCR was performed with the two primers NAU151 and NAU102 using human tracheal first-strand cDNA as template. It was designated RT151-102 and is 2209 nucleotides in length. Another RT-PCR was then performed with the following oligonucleotide primers: NAU152, designed from the sequence of RT151-102, and NAU128, chosen in the 3' end sequence of the MUC5B central exon (30). The resultant 1166-bp amplification product, called RT128-152, and RT151-102 were cloned into pMOS-Blue T-vector. They were subsequently subcloned into pBluescript KS(+) vector after cutting with the restriction enzymes PstI, SacI, and/or SmaI. The subclones were entirely sequenced on both strands several times using T3 and T7 primers and specific oligonucleotides (see Table I). The two amplification products RT128-152 and RT151-102 have overlapping sequences of 416 nucleotides.
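
The length of a spliced RT-PCR product can be predicted from the primer coordinates (Table I) and the exon and intron sizes (Table II). The following Python sketch illustrates the calculation for RT128-152. It assumes that coordinates are numbered as in Fig. 3, with position 1 at the start of NAU128 and the first 113 bp belonging to the central exon; the function names are ours.

```python
# Sizes (bp) from Table II, in genomic order. The first "exon" entry is the
# 113-bp 3' end of the central exon that opens the sequenced region.
EXONS = [113, 182, 172, 260, 187, 226, 176, 70, 101, 32,
         181, 105, 38, 120, 87, 123, 43, 103, 781]
INTRONS = [679, 283, 1118, 338, 159, 114, 594, 443, 463,
           258, 389, 126, 574, 696, 120, 264, 232, 656]

def exon_intervals():
    """Return 1-based inclusive (start, end) genomic coordinates of each exon."""
    intervals, pos = [], 1
    for i, size in enumerate(EXONS):
        intervals.append((pos, pos + size - 1))
        pos += size
        if i < len(INTRONS):
            pos += INTRONS[i]
    return intervals

def spliced_length(start, end):
    """Count exonic bases between two genomic positions (inclusive)."""
    total = 0
    for ex_start, ex_end in exon_intervals():
        lo, hi = max(start, ex_start), min(end, ex_end)
        if lo <= hi:
            total += hi - lo + 1
    return total

# NAU128 starts at position 1 and NAU152 ends at position 3857 (Table I).
rt128_152 = spliced_length(1, 3857)
print(rt128_152)  # 1166
```

The predicted 1166 bp agrees with the size reported for RT128-152. Other primer pairs can be tallied the same way, although small offsets may arise depending on how the primer coordinates are counted.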

Sequencing Data and Genomic Organization

The 3' region of the human MUC5B gene shown in Fig. 3 encompasses 10,690 bp, of which the first 113 nucleotides correspond to the 3' end of the central exon we recently published (30). The full-length sequence has been submitted to the EMBL data bank with accession number [GenBank].

The 3' region of the MUC5B gene is composed of 18 exons ranging in size from 32 to 781 bp (Table II), in good agreement with the mean length of exons (43) and in contrast to the extraordinarily large central exon of MUC5B (30). The last exon is the largest one. It codes for the 72-amino acid COOH terminus of the core protein and comprises the 3'-untranslated region, 564 bp in length, of the MUC5B gene. The sizes of the 18 introns range from 114 bp to 1118 bp. Each intron begins with a GT and ends with an AG (Table II), obeying strictly the GT/AG rule of splice-junction sequences proposed by Mount (44).

Table II. Characteristics of the exon-intron junctions of the 3' region of the MUC5B gene


Protein domain  Exon no.  Exon size (bp)  5' splice donor  Intron name  Intron size (bp)  Intron class  3' splice acceptor
Central 0 10,713 CGCCCGgtgagtgcatgtgga A 679 1 tgttgttcttcacagGGGAAG
MUC11p15 1 182 CGGCAGgtggccccgcctgcc B 283 0 tgttccctgccccagGTGAAT
A3uD4 2 172 GCGAGTgtgagtgcgtcggtg C 1118 1 atctctgtcccgcagGCATCT
D4-like 3 260 GGCCTGgtgagtccaggctgc D 338 0 cctggattcccccagATCCTG
D4-like 4 187 AGTGCGgtgagtgggcggcgg E 159 1 acgccccactcccagGCACCT
D4-like 5 226 GAGCCAgtgagtcctccctcg F 114 2 cgctttgccccacagGGTCTT
D4-like 6 176 TGTGCGgtgagtgggggcggc G 594 1 tctctgtgcccacagACCTCA
D4-like 7 70 CTCTAGgtaagtacagggatg H 443 2 cttgtcatcctgcagGAACCA
D4-like 8 101 CCTGCCgtaagctccgccacc I 463 1 ctgttttctttccagCCTGCG
D4-like 9 32 AAATTTgtgagtggctccacc J 258 0 cttgatccattccagCCCGGG
D4+B-like 10 181 TGTGCGgtaagacgctgcaga K 389 1 tccatcctcccgcagTGTGCA
B+C-like 11 105 GCTGCAgtgagcggggctggg L 126 1 tccctttccttccagGACCTC
C-like 12 38 TACGGGgtaagggcacagcag M 574 0 tcccaccccttgcagGTTGGT
C-like 13 120 CCCCAGgtgagacccaaggca N 696 0 ctcatgtccccacagGGCTTT
C-like 14 87 GTCCAGgtgtaacagcaggca O 120 0 ccttgcatttcagagCTGAAT
C-like 15 123 TGCAGGgtgtgtgctggaggc P 264 0 tttggcccccaccagGGGAGC
C-like 16 43 AGGAGGgtaagtggaagccac Q 232 1 ctttcctcctcccagACTCCT
C-like 17 103 GTCCAAgtgagtgggctcctg R 656 2 accctgcgtccacagGTACTC
CK+3'-UTR 18 781

Consensus (44): 5' splice donor (C/A)AGgt(a/g)agt; 3' splice acceptor (c/t)11n(c/t)ag, where (c/t)11 denotes a run of 11 pyrimidines
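
The sizes listed in Table II can be tallied against the totals quoted in the text. The short Python sketch below does this; the 113-bp central exon tail and the 564-bp 3'-untranslated region are taken from the text, and the variable names are ours.

```python
# Exon and intron sizes (bp) copied from Table II.
EXON_SIZES = {1: 182, 2: 172, 3: 260, 4: 187, 5: 226, 6: 176, 7: 70, 8: 101,
              9: 32, 10: 181, 11: 105, 12: 38, 13: 120, 14: 87, 15: 123,
              16: 43, 17: 103, 18: 781}
INTRON_SIZES = {"A": 679, "B": 283, "C": 1118, "D": 338, "E": 159, "F": 114,
                "G": 594, "H": 443, "I": 463, "J": 258, "K": 389, "L": 126,
                "M": 574, "N": 696, "O": 120, "P": 264, "Q": 232, "R": 656}

CENTRAL_EXON_TAIL = 113   # 3' end of the central exon included in the region
UTR3 = 564                # 3'-untranslated region within exon 18

exon_total = sum(EXON_SIZES.values())
intron_total = sum(INTRON_SIZES.values())
coding_total = exon_total - UTR3
span_to_exon18_end = CENTRAL_EXON_TAIL + exon_total + intron_total

print(exon_total, intron_total, coding_total, span_to_exon18_end)
# 2987 7506 2423 10606
```

The coding total of 2423 nucleotides matches the open reading frame reported for the 3' region, and the span to the end of exon 18 (10,606 bp) leaves 84 bp of the 10,690-bp sequence, which we infer to be 3'-flanking sequence.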

Sequencing Data and Amino Acid Analyses

The 2423-nucleotide open reading frame (Fig. 3) encodes an 808-amino acid peptide rich in cysteine (10.1%) and proline (9.5%). This region is relatively poor in threonine and serine (8.3 and 7.3%, respectively). It is thus different from a mucin-like domain. The comparison of a part of the deduced peptide sequence of RT151-102 (aa 634-829 in Fig. 3) with the deduced amino acid sequence of the cDNA clone pSM2-1 (37) shows 100% identity. In nucleotide sequence only one codon differs, since the proline in position 769 in our sequence (Fig. 3) is coded by CCC instead of CCG in the sequence of Troxler et al. (37). Consequently, this suggests that pSM2-1 is a part of the MUC5B gene. The pSM2-1 clone was isolated from a human sublingual gland cDNA library, screened with a polyclonal antiserum against deglycosylated MG1, the high molecular weight mucin from human sublingual gland. MG1 is a candidate, among other roles, for participation in enamel pellicle formation (45). MG1 is made up of multiple disulfide-linked subunits and contains numerous hydrophobic binding sites in naked regions with negatively charged amino acid residues (46). These characteristics are in very good agreement with our data, since such regions do exist in the 3570-amino acid peptide encoded by the central exon of MUC5B (30). Seven nonadjacent domains, termed Cys subdomains, have been individualized among the 19 subdomains encoded by the central exon. These Cys subdomains, found in several other apomucins, are richer in Cys (9.3%), Asp (4.9%), and Glu (7.7%) than the 12 other subdomains. The Cys subdomains are poor in Ser and Thr (Ser+Thr: 9.6%) versus tandem repeat domains, termed R domains (Ser+Thr: 52.5%). Moreover, Loomis et al. (46) suggested that aromatic residues of MG1 are buried within the hydrophobic domains. In fact, the Tyr and Trp amino acid residues are strikingly clustered in the Cys subdomains for the MUC5B apomucin. Furthermore, the MG1 glycoprotein and MUC5B mRNAs are both expressed in salivary glands, among other mucosae for MUC5B (31, 37). We can thus conclude that MUC5B encodes the MG1 apomucin.

The deduced amino acid sequence of the carboxyl-terminal region of MUC5B contains 15 consensus sequences for attachment of N-linked oligosaccharides (italic in Fig. 4). Studies were performed using PC/GENE software (47). The secondary structure of the carboxyl-terminal region of MUC5B was predicted to contain 62% β-turn conformation and 13% helix structure located between aa 219 and 250, 407 and 421, and 776 and 801. The rest of the structure consists of extended and coil structures. The rigid conformation could be essential for the oligomerization process. In fact, 88% of the cysteine residues are located in or near a β-turn, as are 11 of the 15 potential N-glycosylation sites. Moreover, a serum albumin family signature with the consensus sequence YX6CCX7C has also been found between residues 658 and 682. As mucins are well known to bind various hydrophobic substances such as cholesterol, fatty acids, or bilirubin, this small region could be important, for example, in the formation of gallstones, in which mucins have been described to be involved (8, 9, 48, 49).
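
Potential N-glycosylation sites are conventionally identified as the sequon Asn-X-Ser/Thr, where X is any residue except proline. The Python sketch below shows how such sites can be located in a peptide. It is an illustration only: whether PC/GENE applied exactly this rule is an assumption, and the toy peptide is invented for the example and is not MUC5B sequence.

```python
import re

# Sequon N-X-(S/T) with X != P. The lookahead lets overlapping sites
# (e.g. two adjacent asparagines) be reported separately.
SEQUON = re.compile(r"(?=N[^P][ST])")

def find_sequons(peptide: str) -> list[int]:
    """Return 1-based positions of asparagines in N-X-(S/T) sequons."""
    return [m.start() + 1 for m in SEQUON.finditer(peptide)]

# Hypothetical peptide, invented for this example.
toy_peptide = "MANGTCNPSLNCSYVLNNTS"
sites = find_sequons(toy_peptide)
print(sites)  # [3, 11, 17, 18]
```

In the toy peptide, Asn-7 is skipped because it is followed by proline, while the adjacent Asn-17 and Asn-18 are both reported, a situation comparable to neighboring candidate sites in a real sequence.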


Fig. 4. Comparison of MUC5B carboxyl-terminal protein with other proteins: vWF (36), MUC2 (16), and L31 (23). Dashes indicate gaps introduced in the sequence for alignment purposes. Cysteine residues are shaded. Potential N-glycosylation sites are indicated by bold italic letters. Black down-pointing arrowheads indicate the positions of the introns of human vWF and are named according to Mancuso et al. (36). Black up-pointing arrowheads indicate the positions of MUC5B introns and are marked with the intron letter according to Table II. Identical amino acids in vWF, MUC2, L31 (MUC5AC-3' end), and MUC5B are bold boxed. Thin boxes indicate identical amino acids in at least three proteins.

As far as TH71 is concerned, we can now estimate that more than 2000 bp have been lost in this cDNA. We were unable to reproduce this cDNA using RT-PCR. It may be concluded that a problem occurred when this clone was produced; the clone has nevertheless been of great value in determining the location of the polyadenylation signal in the genomic DNA.

Deduced Amino Acid Sequence of MUC5B Carboxyl-terminal Region: Comparison with Other Proteins

Some partial alignments with the sequence of vWF have been made by other authors, for example for MUC2 (16) and for MUC5AC-related cDNA clones, like NP3a (22) and L31 (23). Fig. 4 shows that an alignment on longer sequences can be accomplished. The conservation of nearly all of the cysteine residues and of several other amino acids of vWF and of the three 11p15.5 human mucins MUC2, MUC5AC, and MUC5B is readily apparent, suggesting a very similar tertiary structure. The comparison of the deduced carboxyl-terminal MUC5B peptide with vWF especially allows us to dissect this region into six domains: one domain called MUC11p15-type, which follows the central exon, one 56-amino acid domain with similarities to what we called the A3uD4 domain (located between the A3 and the D4 domains in vWF), one D4-like domain, one B-like domain, one C-like domain, and one CK domain of 86 amino acid residues (see Fig. 5A).


Fig. 5. Organization of the 3' region of MUC5B. A, schematic diagram of the carboxyl-terminal MUC5B core protein and similar proteins. B, organization of exons and introns in the 3' region of MUC5B. Exons are indicated by open boxes and numbered consecutively with 0 for the central exon (light gray box). Black box indicates the 3'-untranslated region. Introns as well as 3'-flanking sequence are indicated by lines. Each intron is named with a letter according to Table II.

The domain called MUC11p15-type (Fig. 5A) from aa 38 to 84 in MUC5B follows the central domain described previously (30). It shows similarities to MUC2 and MUC5AC, particularly with regard to the cysteine residues. This domain is somewhat different from the A3 domain of vWF.

The second domain, called A3uD4, also present in vWF, is 69 aa in length in MUC5B and spans aa 85-154. This domain is also found in MUC2 and L31, i.e. MUC5AC.

The vWF-D4 domain was found in MUC5AC, MUC2, and MUC5B (aa 155-533). D4 was found in zonadhesin, a sperm membrane protein that binds in a specific manner to the egg extracellular matrix from pig (50). Moreover, the D4 domain shows similarity to a part of vitellogenin found in nematode Caenorhabditis elegans, chicken, and frog (51). Among the well conserved peptide sequences, of which the positions are indicated for MUC5B, NC(S/T)YVL (aa 180-185), TXGXCGXC (aa 300-307), YAXLC (aa 417-421), CXDWR (aa 427-431), EGCFCP (aa 473-478) found in MUC2, MUC5AC, vWF and MUC5B, the TXGXCGXC octapeptide contains the vicinal cysteine residue motif CGXC, which also exists in vitellogenin and zonadhesin. Mayadas et al. (52) showed that these motifs are similar to the amino acid sequences at the active site of disulfide isomerases that catalyze thiol protein disulfide interchange. These vicinal cysteines may have the capacity to catalyze disulfide interchain formation, but Voorberg et al. (53) indicated that the dimerization resides in the last 151 residues for the vWF. Recently Perez-Vilar et al. (54) validated this hypothesis for PSM, showing that this apomucin can very likely form dimers between its carboxyl-terminal domains. Hence, our sequencing data suggest that the MUC5B apomucin is also able to form dimers between its carboxyl-terminal domains. The presence of disulfide-linked subunits was already predicted by Loomis et al. (46) in MG1. Moreover, Kawagishi et al. reported that MG1 contains at least two subunits (55), one of which is the salivary link component that weakly cross-reacts with antiserum to the human small intestinal link component, which contains N-linked carbohydrate (56).
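
The conserved motifs above are written in a shorthand in which X stands for any residue and (S/T) for a choice between two residues. The Python sketch below translates this shorthand into regular expressions and checks that each motif length agrees with the MUC5B positions quoted in the text. The helper names are ours, and the eight-residue test peptide is invented for the example.

```python
import re

# Motifs and MUC5B positions as quoted in the text.
MOTIFS = {
    "NC(S/T)YVL": (180, 185),
    "TXGXCGXC": (300, 307),
    "YAXLC": (417, 421),
    "CXDWR": (427, 431),
    "EGCFCP": (473, 478),
}

def motif_to_regex(motif: str) -> str:
    """Translate the motif shorthand into a regular expression."""
    pattern = re.sub(r"\(([A-Z])/([A-Z])\)", r"[\1\2]", motif)
    return pattern.replace("X", ".")

def motif_length(motif: str) -> int:
    """Number of residues spanned (a bracketed choice counts as one residue)."""
    return len(re.sub(r"\([A-Z]/[A-Z]\)", "X", motif))

lengths_ok = {m: motif_length(m) == end - start + 1
              for m, (start, end) in MOTIFS.items()}

# The octapeptide contains the vicinal cysteine motif CGXC.
vicinal = re.compile(motif_to_regex("CGXC"))
toy_octapeptide = "TAGSCGKC"  # hypothetical instance of TXGXCGXC
print(lengths_ok)
print(vicinal.search(toy_octapeptide).group())  # CGKC
```

Each motif length matches its quoted residue range, and any sequence satisfying TXGXCGXC necessarily contains CGXC in its last four residues.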

Following the D4-like domain, one B-like domain of 40 aa residues was found in MUC5B (aa 534-574), MUC2, and MUC5AC instead of the three B domains defined for vWF (36). This B-like domain has not yet been found in other protein sequences.

Instead of the two C domains (C1 and C2) present in vWF, only one C-like domain was found in the three 11p15.5 mucins MUC5B, MUC5AC, and MUC2 (aa 575-740 in MUC5B). One C-like domain was previously reported in the frog integumentary mucin FIM-B.1 (57). This C-like domain has been described to be more related to the C1 than to the C2 domain. In contrast to vWF, in which the homologous C1 and C2 domains arose by duplication, this duplication did not occur in FIM-B.1 (57). In fact, our genomic study shows that the C-like domain of MUC5B is more related to the C2 domain. Extremely conserved intron positions can be shown for introns 45 and L, and for introns 46 and M. Thus C1 and C2 in vWF, and C-like domains in 11p15.5 mucin genes, probably have a common ancestor domain, which has duplicated into C1 and C2 in vWF.

The last domain found in MUC5B is the CK domain (for cystine knot) from aa 741 to 826. The CK domain was also found in the 3' end of the secreted proteins MUC2, MUC5AC, FIM-B.1, and rat-Muc2 (58, 59). The CK domain exists in other secreted proteins (60-62). Eleven cysteine residues and some other amino acid residues within this CK domain are nearly invariant. Molecular modeling of the Norrie disease protein (61) predicts that this domain has a tertiary structure similar to that of transforming growth factor β (TGF-β). In TGF-β, seven cysteine residues, corresponding to cysteines 741, 764, 768, 787, 788, 818, and 820 in MUC5B, are nearly invariant (61). Crystallography studies of TGF-β2 have shown that six of these cysteine residues are closely grouped to make a rigid structure called the cystine knot (for reviews see 60, 63). Moreover, the determination of the crystal structure of dimeric nerve growth factor and platelet-derived growth factor revealed a structure similar to the one of TGF-β2. Tertiary structure similarities probably account for a strong resistance, as suggested by these authors, to heat, denaturants, and extremes of pH. The remaining cysteine residue in each monomer, corresponding in MUC5B to Cys-787, forms an additional disulfide bond that was found to link two TGF-β2 monomers into a dimer. Hence, the human mucins MUC5B, MUC5AC, and MUC2, the animal mucins PSM, BSM, rat-Muc2, and FIM-B.1 and Norrie disease protein may be members, with their 11 cysteine residues, of a new CK subfamily.

Of the 15 consensus sequences for attachment of N-linked oligosaccharides (italic in Fig. 4), 10 sites are close to those observed in the 3' ends of MUC2 and/or MUC5AC (L31) and/or vWF. Four of them (positions aa 179, 298, 299, and 569 in MUC5B) have the same positions in the deduced peptides of the three human mucin genes mapped on chromosome 11p15.5. One site has the same position in the three mucins and in vWF (aa 2223 in vWF and aa 463 in MUC5B). In addition to the typical and expected O-glycosylation that occurs in MUC5B, it is very tempting to speculate that this apomucin, synthesized in the endoplasmic reticulum, is rapidly N-glycosylated. The polypeptide might fold to form intramolecular interactions and then dimers through intermolecular disulfide bridges within the carboxyl-terminal region. Although previous studies on bovine vWF suggested that N-glycosylation is not necessary for dimerization (64), Wagner et al. (65) more recently reported that N-linked carbohydrate addition onto human vWF is important for dimerization. In contrast, Perez-Vilar et al. (54) demonstrated that PSM dimerization is not dependent on the N-linked oligosaccharides within its carboxyl-terminal domain. Further studies on MUC5B apomucin, using recombinant protein synthesis to obtain antibodies and culture of mucus-secreting cells such as HT29-MTX in the presence of tunicamycin, will be required to clarify the role of N-glycosylation.

Sequence Analyses of the Introns

The schematic organization of the carboxyl-terminal MUC5B gene is given in Fig. 5B. No alternative splicing was found using total RNA from gall bladder or poly(A)+ mRNA from trachea and the following pairs of primers: NAU128/NAU152, NAU151/NAU203, NAU140/NAU219, NAU227/NAU208, and NAU233/NAU67 (Table I).

Introns A, C, E, G, I, K, L, and Q are class 1, where each intron interrupts the coding sequence between the first and second bases of the codon (66). Introns F, H, and R are class 2 (the intron lies between the second and third bases of the codon), and introns B, D, J, M, N, O, and P are class 0 (the intron lies between codons). Introns B, C, E, G, J, K, L, M, and N have the same positions as introns 33, 34, 36, 37, 38, 40/41, 45, 46, and 43/47, respectively, in vWF (Fig. 4). It must be emphasized that introns C and 34, E and 36, G and 37, K and 40 or 41, and L and 45 are class 1, while introns J and 38, M and 46, and N and 43 or 47 are class 0. We can thus observe that the ORFs between the symmetrical introns C and E, E and G, G and K, K and L, J and M, and M and N are flanked at both ends by introns of the same phase class. Consequently, these ORFs are good candidates for exon shuffling, especially when both flanking introns are class 1 (67). It would be interesting to determine whether MUC2, MUC5AC, and MUC5B share a common gene organization at their 3' ends. If so, it could be proposed that exon shuffling mechanisms played an important role in the formation of genes coding for proteins with D/B/C/CK domains, while a single ancestral gene evolved by successive duplications to give rise to the 11p15.5 human mucin gene family. Clearly, much work remains to be done, and new data have to be collected on the three other 11p15.5 mucin genes MUC2, MUC5AC, and MUC6 to confirm our hypothesis; in particular, their exon-intron organization has to be elucidated.
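The phase classification used above can be checked mechanically. The following Python sketch records the class of each of the 18 introns exactly as listed in the text and verifies that every pair of introns cited as symmetrical shares the same class. The names PHASE_BY_INTRON, intron_phase, and is_symmetrical are illustrative choices of ours, and the helper that derives a phase from the number of upstream coding bases is a generic definition, not a calculation performed on the MUC5B sequence.

```python
# Intron classes as listed in the text (class 0, 1, or 2).
PHASE_BY_INTRON = {}
for phase, names in {1: "ACEGIKLQ", 2: "FHR", 0: "BDJMNOP"}.items():
    for name in names:
        PHASE_BY_INTRON[name] = phase

# Intron pairs described in the text as flanking candidate shuffled regions.
SYMMETRICAL_PAIRS = [("C", "E"), ("E", "G"), ("G", "K"),
                     ("K", "L"), ("J", "M"), ("M", "N")]


def intron_phase(coding_bases_upstream):
    """Generic definition: 0 = between codons, 1 = after the first base of a
    codon, 2 = after the second base of a codon."""
    return coding_bases_upstream % 3


def is_symmetrical(intron_a, intron_b):
    """True when both introns interrupt the reading frame in the same phase."""
    return PHASE_BY_INTRON[intron_a] == PHASE_BY_INTRON[intron_b]


for a, b in SYMMETRICAL_PAIRS:
    print(a, b, PHASE_BY_INTRON[a], is_symmetrical(a, b))
```

When two introns have the same phase, the coding sequence between them has a length that is a multiple of three, so inserting or deleting that segment leaves the downstream reading frame intact. This is why such regions are considered candidates for exon shuffling.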

In some introns, unique tandemly repeated sequences that are more or less perfect are found: 23 copies of an imperfect 20-bp repeat in intron A, 11 copies of an imperfect 10-bp repeat in intron C, 9 copies of a perfect 59-bp repeat in intron G, and 12 copies of an imperfect 20-bp repeat in intron P. A search of the GenBank data base indicated that the consensus sequences of these four distinct repeats are not identical to any registered sequence. It is striking that intron G is 75% (G+C)-rich and is almost entirely built up of copies of a perfect 59-bp repeat, CCTGTGCGGTGAGTGGGGGCGGCCCCGGGCCCCCCAGACCCCTCGGCCTCTCTGAGTGT. Each repeat contains one GC box binding site and one SmaI recognition site. The first copy of this repeat begins in the 3' end of exon 6. To determine the exact number of repeats in this intron, we first cut the subcloned genomic BglII-BglII fragment with SacI and RsaI, which flank the region containing these perfect 59-bp repeats. We then performed one complete and two partial restriction digestions with SmaI (for details see "Experimental Procedures"). NAU199 is an oligonucleotide that recognizes a part of the 59-bp repeat. The results shown in Fig. 6 led us to conclude that there are nine 59-bp repeats. In a previous study using single-stranded oligonucleotides, we found that a nuclear factor called NF1-MUC5B (68), extracted from the colonic mucus-secreting subclone HT29-MTX, binds this GC site. This factor, with a Mr of 42,000, has been demonstrated not to be Sp1. Biochemical studies are currently in progress in our laboratory to characterize this nuclear factor.
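The properties attributed to the 59-bp repeat, and the rationale for counting copies by SmaI digestion, can be illustrated with a short Python sketch. It takes the repeat sequence as printed above and assumes only well-established facts not stated in the text: SmaI recognizes CCCGGG and cuts between the central C and G, and the core of a GC box is GGGCGG. The model is deliberately idealized. It considers nine perfect tandem copies only, ignores the flanking SacI and RsaI sequences, and does not model which fragments the NAU199 probe would detect. The function names are ours.

```python
REPEAT = "CCTGTGCGGTGAGTGGGGGCGGCCCCGGGCCCCCCAGACCCCTCGGCCTCTCTGAGTGT"
SMAI_SITE = "CCCGGG"    # SmaI cuts CCC^GGG, leaving blunt ends
GC_BOX_CORE = "GGGCGG"  # core of the GC box
N_COPIES = 9            # number of repeats concluded from Fig. 6


def gc_fraction(seq):
    """Fraction of G and C bases in a sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def site_positions(seq, site):
    """0-based start positions of every (possibly overlapping) occurrence of site."""
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


array = REPEAT * N_COPIES
cuts = [p + 3 for p in site_positions(array, SMAI_SITE)]  # cut after CCC

# Complete digestion: every site is cut, so internal fragments are one repeat long.
complete_digest = [b - a for a, b in zip(cuts, cuts[1:])]

# Partial digestion: any two cut sites may delimit a fragment, giving a ladder.
partial_ladder = sorted({b - a for i, a in enumerate(cuts) for b in cuts[i + 1:]})

print(len(REPEAT), round(gc_fraction(REPEAT), 2))
print(complete_digest)
print(partial_ladder)
```

In this idealized array, complete digestion releases only unit-length internal fragments, whereas partial digestion produces a ladder whose rungs differ by one repeat length. Counting such rungs in a real digest is what allows the number of copies to be read from a blot such as the one in Fig. 6.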


Fig. 6. Characterization of the 59-bp repeat in the intron G. A, partial restriction map of the BglII-BglII fragment from cosmid genomic clone BEN2. B, electrophoretic separation in agarose gel. The BglII-BglII fragment containing the intron G was cut with SacI and RsaI that flank the intron G and thereafter partially digested using SmaI. Blot analysis was conducted using the radiolabeled oligonucleotide NAU199. Lane 1, BglII-BglII fragment totally digested with SmaI; lane 2, SacI-RsaI fragment partially digested for 1 h with SmaI (1 unit/µg DNA); lane 3, SacI-RsaI fragment partially digested for 1 h with SmaI (0.25 unit/µg DNA); lane 4, SacI-RsaI fragment.

In summary, we have cloned and sequenced the whole genomic 3' region of MUC5B and defined its exon-intron organization. We have shown that this gene codes for the high molecular weight salivary mucin MG1. MUC5B is expressed at high levels mainly in the mucous cells of the acini of salivary and respiratory submucosal glands and in epithelial cells of gall bladder, endocervix, and pancreas. This study provides the first genomic organization of the 3' region of a large secreted gel-forming mucin gene. Our recent work showed that the central domain of MUC5B is encoded by a single large exon (10,713 bp), the largest one described to date in vertebrates. The deduced protein contains 19 subdomains. Some of them show similarity to each other, creating repeat units called super-repeats of 528 amino acid residues, which are the largest ever found in mucin genes. Each super-repeat comprises a 108-amino acid cysteine-rich subdomain. This last subdomain, found seven times in MUC5B, has thus been found several times in at least three of the four human mucin genes mapped to 11p15.5 (30). It seems that the 11p15.5 human mucin genes are characterized by (i) a large exon encoding the repetitive domain, as demonstrated for MUC5B and as suggested by Toribara et al. for MUC2 (15), (ii) the presence of Cys subdomains with 10% Cys residues (30), and (iii) a unique sequence just downstream from the repetitive domain, typical of the 11p15.5 mucin genes, and a cysteine-rich region that is divided into several subdomains similar to the vWF-D4, vWF-B, vWF-C, and CK domains (Fig. 4). It will be interesting to determine whether the three other mucin genes MUC2, MUC5AC, and MUC6 have the same 3' end genomic organization as MUC5B. Moreover, it is clear from our previously published data (21, 29, 30) and from our present results that MUC5AC and MUC5B are two distinct genes; therefore, it would be preferable for all authors to specify precisely which gene is concerned when they write MUC5.


FOOTNOTES

*   This work was supported by La Ligue contre le Cancer, L'Association de Recherche contre le Cancer, and L'Association François Aupetit. The costs of publication of this article were defrayed in part by the payment of page charges. The article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

The nucleotide sequences reported in this paper have been submitted to the GenBank/EMBL Data Bank with accession numbers Y10080 and Y09788.


   To whom correspondence should be addressed: INSERM U-377, 59045 Lille Cedex, France. Tel.: 33-3-20-29-88-59; Fax: 33-3-20-53-85-62; E-mail: laine@lille.inserm.fr.
1   The abbreviations used are: bp, base pair(s); aa, amino acid(s); BSM, bovine submaxillary gland mucin-like; CK, cystine knot; FIM, frog integumentary mucin; PCR, polymerase chain reaction; PSM, porcine submaxillary mucin; TGF, transforming growth factor; RACE, rapid amplification of cDNA ends; RT, reverse transcription; vWF, von Willebrand factor; ORF, open reading frame.

ACKNOWLEDGEMENTS

We are grateful for the technical assistance of Michel Crépin, Evelyne Destailleur, Viviane Mortelec, and Danièle Petitprez.


Note Added in Proof

While this manuscript was in review, Nielsen, P. A., Bennett, E. P., Wandall, H. H., Therkildsen, M. H., Hannibal, J., and Clausen, H. ((1997) Glycobiology 7, 413-419) identified MG1 as tracheobronchial mucin MUC5B. On the other hand, Keates, A. C., Nunes, D. P., Afdhal, N. H., Troxler, R. F., and Offner, G. D. ((1997) Biochem. J. 324, 295-303) published a partial genomic organization of the 3' end of MUC5B with some differences from our data.


REFERENCES

  1. Gendler, S. J., and Spicer, A. P. (1995) Annu. Rev. Physiol. 57, 607-634
  2. Bansil, R., Stanley, E., and LaMont, J. T. (1995) Annu. Rev. Physiol. 57, 635-657
  3. Forstner, G. (1995) Annu. Rev. Physiol. 57, 585-605
  4. Verma, M. (1994) Cancer Biochem. Biophys. 14, 151-162
  5. Ho, S. B., Roberton, A. M., Shekels, L. L., Lyftogt, C. T., Niehans, G. A., and Toribara, N. W. (1995) Gastroenterology 109, 735-747
  6. Buisine, M. P., Janin, A., Maunoury, V., Audié, J. P., Delescaut, M. P., Copin, M. C., Colombel, J. F., Degand, P., Aubert, J. P., and Porchet, N. (1996) Gastroenterology 110, 84-91
  7. Kaliner, M., Shelhamer, J. H., Borson, B., Nadel, J., Patow, C., and Marow, Z. (1986) Am. Rev. Respir. Dis. 134, 612-621
  8. Lee, S. P., LaMont, J. T., and Carey, M. C. (1981) J. Clin. Invest. 67, 1712-1723
  9. Klinkskpoor, J. H., Tytgat, G. N. J., and Groen, A. K. (1993) Eur. J. Gastroenterol. Hepatol. 5, 226-234
  10. Lan, M. S., Batra, S. K., Qi, W. N., Metzgar, R. S., and Hollingsworth, M. A. (1990) J. Biol. Chem. 265, 15294-15299
  11. Lancaster, C. A., Peat, N., Duhig, T., Wilson, D., Taylor-Papadimitriou, J., and Gendler, S. J. (1990) Biochem. Biophys. Res. Commun. 173, 1019-1029
  12. Bobek, L. A., Tsai, H., Biesbrock, A. R., and Levine, M. J. (1993) J. Biol. Chem. 268, 20563-20569
  13. Bobek, L. A., Liu, J., Sait, S. N. J., Shows, T. B., Bobek, Y. A., and Levine, M. J. (1996) Genomics 31, 277-282
  14. Gum, J. R., Byrd, J. C., Hicks, J. W., Toribara, N. W., Lamport, D. T. A., and Kim, Y. S. (1989) J. Biol. Chem. 264, 6480-6487
  15. Toribara, N. W., Gum, J. R., Culhane, P. J., Lagace, R. E., Hicks, J. W., Petersen, G. M., and Kim, Y. S. (1991) J. Clin. Invest. 88, 1005-1013
  16. Gum, J. R., Jr., Hicks, J. W., Toribara, N. W., Rothe, E. M., Lagace, R. E., and Kim, Y. S. (1992) J. Biol. Chem. 267, 21375-21383
  17. Gum, J. R., Jr., Hicks, J. W., Toribara, N. W., Siddiki, B., and Kim, Y. S. (1994) J. Biol. Chem. 269, 2440-2446
  18. Gum, J. R., Hicks, J. W., Swallow, D. M., Lagace, R. E., Byrd, J. C., Lamport, D. T. A., Siddiki, B., and Kim, Y. S. (1990) Biochem. Biophys. Res. Commun. 171, 407-415
  19. Porchet, N., Nguyen, V. C., Dufossé, J., Audié, J. P., Guyonnet Dupérat, V., Gross, M. S., Denis, C., Degand, P., Bernheim, A., and Aubert, J. P. (1991) Biochem. Biophys. Res. Commun. 175, 414-422
  20. Aubert, J. P., Porchet, N., Crépin, M., Duterque-Coquillaud, M., Vergnes, G., Mazzuca, M., Debuire, B., Petitprez, D., and Degand, P. (1991) Am. J. Respir. Cell. Mol. Biol. 5, 178-185
  21. Guyonnet Dupérat, V., Audié, J. P., Debailleul, V., Laine, A., Buisine, M. P., Galiegue-Zouitina, S., Pigny, P., Degand, P., Aubert, J. P., and Porchet, N. (1995) Biochem. J. 305, 211-219
  22. Meerzaman, D., Charles, P., Daskal, E., Polymeropoulos, M. H., Martin, B. M., and Rose, M. C. (1994) J. Biol. Chem. 269, 12932-12939
  23. Lesuffleur, T., Roches, F., Hill, A. S., Lacasa, M., Fox, M., Swallow, D. M., Zweibaum, A., and Real, F. X. (1995) J. Biol. Chem. 270, 13665-13673
  24. Klomp, L. W. J., Van Rens, L., and Strous, G. J. (1995) Biochem. J. 308, 831-838
  25. Dufossé, J., Porchet, N., Audié, J. P., Guyonnet Dupérat, V., Laine, A., Van Seuningen, I., Marrakchi, S., Degand, P., and Aubert, J. P. (1993) Biochem. J. 293, 329-337
  26. Toribara, N. W., Roberton, A. M., Ho, S. B., Kuo, W. L., Gum, E., Hicks, J. W., Gum, J. R., Jr., Byrd, J. C., Siddiki, B., and Kim, Y. S. (1993) J. Biol. Chem. 268, 5879-5885
  27. Nguyen, V. C., Aubert, J. P., Gross, M. S., Porchet, N., Degand, P., and Frézal, J. (1990) Hum. Genet. 86, 167-172
  28. Griffiths, B., Matthews, D. J., West, L., Attwood, J., Povey, S. M., Swallow, D. M., Gum, J. R., and Kim, Y. S. (1990) Ann. Hum. Genet. 54, 277-285
  29. Pigny, P., Guyonnet Dupérat, V., Hill, A. S., Pratt, W. S., Galiegue-Zouitina, S., Collyn D'Hooge, M., Laine, A., Van Seuningen, I., Gum, J. R., Kim, Y. S., Swallow, D. M., Aubert, J. P., and Porchet, N. (1996) Genomics 38, 340-352
  30. Desseyn, J. L., Guyonnet Dupérat, V., Porchet, N., Aubert, J. P., and Laine, A. (1997) J. Biol. Chem. 272, 3168-3178
  31. Audié, J. P., Janin, A., Porchet, N., Copin, M. C., Gosselin, B., and Aubert, J. P. (1993) J. Histochem. Cytochem. 41, 1479-1485
  32. Audié, J. P., Tetaert, D., Pigny, P., Buisine, M. P., Janin, A., Aubert, J. P., Porchet, N., and Boersma, A. (1995) Hum. Reprod. 10, 98-102
  33. Balagué, C., Gambus, G., Carrato, C., Porchet, N., Aubert, J. P., Kim, Y. S., and Real, F. X. (1994) Gastroenterology 106, 1056-1061
  34. Campion, J. P., Porchet, N., Aubert, J. P., L'Helgoualc'h, A., and Clément, B. (1995) Hepatol. 21, 223-231
  35. Balagué, C., Audié, J. P., Porchet, N., and Real, F. X. (1995) Gastroenterology 109, 953-964
  36. Mancuso, D. J., Tuley, E. A., Westfield, L. A., Worrall, N. K., Shelton-Inloes, B. B., Sorace, J. M., Alevy, Y. G., and Sadler, J. E. (1989) J. Biol. Chem. 264, 19514-19527
  37. Troxler, R. F., Offner, G. D., Zhang, F., Iontcheva, I., and Oppenheim, G. O. (1995) Biochem. Biophys. Res. Commun. 217, 1112-1119
  38. Crépin, M., Porchet, N., Aubert, J. P., and Degand, P. (1991) Biorheology 27, 471-484
  39. Glisin, V., Orkvenjakov, R., and Byus, C. (1974) Biochemistry 13, 2633-2637
  40. Sanger, F., Nicklen, S., and Coulson, A. R. (1977) Proc. Natl. Acad. Sci. U. S. A. 74, 5463-5467
  41. Birnstiel, M. L., Busslinger, M., and Strub, K. (1985) Cell 41, 349-359
  42. Berget, S. M. (1984) Nature 309, 179-181
  43. Hawkins, J. D. (1988) Nucleic Acids Res. 16, 9893-9905
  44. Mount, S. M. (1982) Nucleic Acids Res. 10, 459-472
  45. Tabak, L. A., Levine, M. J., Mandel, I. D., and Ellison, S. A. (1982) J. Oral Pathol. 11, 1-17
  46. Loomis, R. E., Prakobphol, A., Levine, M. J., Reddy, M. S., and Jones, P. C. (1987) Arch. Biochem. Biophys. 258, 452-464
  47. Chou, P. Y., and Fasman, G. D. (1974) Biochemistry 13, 211-249
  48. Smith, B. F., and LaMont, J. T. (1984) J. Biol. Chem. 259, 12170-12177
  49. Smith, B. F., and LaMont, J. T. (1985) J. Clin. Invest. 76, 439-445
  50. Hardy, D. M., and Garbers, D. L. (1995) J. Biol. Chem. 270, 26025-26028
  51. Baker, M. E. (1988) Biochem. J. 256, 1059-1063
  52. Mayadas, T. N., and Wagner, D. D. (1992) Proc. Natl. Acad. Sci. U. S. A. 89, 3531-3535
  53. Voorberg, J., Fontijn, R., Calafat, J., Janssen, H., van Mourik, J. A., and Pannekoek, H. (1991) J. Cell Biol. 113, 195-205
  54. Perez-Vilar, J., Eckhardt, A. E., and Hill, R. L. (1996) J. Biol. Chem. 271, 9845-9850
  55. Kawagishi, S., Fahim, R. E. F., Wong, K. H., and Bennick, A. (1990) Arch. Oral Biol. 35, 265-272
  56. Roberton, A. M., Mantle, M., Fahim, R. E. F., Specian, R. D., Bennick, A., Kawagishi, S., Sherman, P., and Forstner, J. F. (1989) Biochem. J. 261, 637-647
  57. Probst, J. C., Gertzen, E. M., and Hoffmann, W. (1990) Biochemistry 29, 6240-6244
  58. Hoffmann, W., and Joba, W. (1995) Biochem. Soc. Trans. 23, 200-205
  59. Van Klinken, B. J. W., Dekker, J., Büller, H. A., and Einerhand, A. W. C. (1995) Am. J. Physiol. 269, G613-G627
  60. Sun, P. D., and Davies, D. R. (1995) Annu. Rev. Biophys. Biomol. Struct. 24, 269-291
  61. Meitinger, T., Meindl, A., Bork, P., Rost, B., Sander, C., Haasemann, M., and Murken, J. (1993) Nat. Genet. 5, 376-380
  62. Hoffmann, W., and Hauser, F. (1993) Comp. Biochem. Physiol. 105B, 465-472
  63. Kingsley, D. M. (1994) Genes Dev. 8, 133-146
  64. Lynch, D. C., Williams, R., Zimmerman, T. S., Kirby, E. P., and Livingston, D. M. (1983) Proc. Natl. Acad. Sci. U. S. A. 80, 2738-2742
  65. Wagner, D. D., Mayadas, T., and Marder, V. J. (1986) J. Cell Biol. 102, 1320-1324
  66. Patthy, L. (1987) FEBS Lett. 214, 1-7
  67. Bork, P. (1992) Curr. Opin. Struct. Biol. 4, 383-392
  68. Pigny, P., Van Seuningen, I., Desseyn, J. L., Nollet, S., Porchet, N., Laine, A., and Aubert, J. P. (1996) Biochem. Biophys. Res. Commun. 220, 186-191

©1997 by The American Society for Biochemistry and Molecular Biology, Inc.