©1995 by The American Society for Biochemistry and Molecular Biology, Inc.
Characterization of a Mucin cDNA Clone Isolated from HT-29 Mucus-secreting Cells
THE 3′ END OF MUC5AC? (*)

Thécla Lesuffleur (1) (2)(§), Ferran Roche (2), Alexander S. Hill (3)(¶), Michel Lacasa (1) (4), Margaret Fox (3), Dallas M. Swallow (3), Alain Zweibaum (1), Francisco X. Real (2)

From the (1) INSERM U178, Bâtiment INSERM, 16 avenue Paul Vaillant-Couturier, 94807 Villejuif Cedex, France, (2) Institut Municipal d'Investigació Mèdica, Universitat Autonoma de Barcelona, carrer Doctor Aiguader, 80, E-08003 Barcelona, Spain, (3) Medical Research Council Human Biochemical Genetics Unit, The Galton Laboratory, University College London, 4 Stephenson Way, London NW1 2HE, United Kingdom, and (4) Université Pierre et Marie Curie, 4 place Jussieu, 75005 Paris, France


ABSTRACT

HT-29 cells resistant to 10⁻⁶ M methotrexate (HT29-MTX) secrete mucins with gastric immunoreactivity (Lesuffleur, T., Barbat, A., Dussaulx, E., and Zweibaum, A. (1990) Cancer Res. 50, 6334-6343). A 3310-base pair mucin cDNA clone (L31) was isolated from an HT29-MTX expression library using a polyclonal serum specific for normal gastric mucosa. It shows a high level of identity (98.6%) to clone NP3a isolated from a nasal polyp cDNA library (Meerzaman, D., Charles, P., Daskal, E., Polymeropoulos, M. H., Martin, B. M., and Rose, M. C. (1994) J. Biol. Chem. 269, 12932-12939). However, as a result of changes in reading frame, the 1042-amino acid deduced peptide contains four regions of low similarity to the NP3a peptide. The amino acid sequence shows 36.3% similarity to part of the carboxyl-terminal sequence of MUC2, including the so-called D4 domain, and 21.3% to the pro von Willebrand factor. A short amino acid sequence is similar to cysteine-rich sequences repeated in tracheobronchial, gastric, and colonic mucin cDNAs. The gene corresponding to L31 is located in the mucin gene cluster on chromosome 11p15.5. The patterns of mRNA expression were indistinguishable from those revealed with the JER58 probe (MUC5AC). Southern blot analysis indicates that the L31 and JER58 sequences are within 20 kilobase pairs of each other. Together, these results suggest that clone L31 is the 3′ end of MUC5AC.


INTRODUCTION

Mucins are highly O-glycosylated proteins that polymerize through the formation of disulfide bonds to form a gel that is secreted onto the surface of glandular epithelia. In the last 10 years, it has become clear that mucin polypeptide backbones are encoded by several genes (see Refs. 1 and 2 for reviews). The complete sequences of MUC1 (3, 4, 5), MUC2 (6, 7, 8), and MUC7 cDNAs (9) have been reported, but only partial sequences are available for MUC3 (10), MUC4 (11), MUC5AC (12, 13), MUC5B (14), and MUC6 (15). The initial identification of these genes was made possible by screening expression libraries with polyclonal antisera prepared against deglycosylated mucins from a variety of sources. A notable feature of these clones was that they contained tandem repeats of sequences coding for Thr- and Ser-rich regions, which appear to correspond to the highly glycosylated domains of mucins (1). In the case of MUC2, these regions are flanked by "unique" sequences, which encode Cys-rich domains (7, 8) that are homologous to the pro von Willebrand factor D domains (8, 16). Recently, there have been a number of other reports of partial mucin cDNAs encoding Cys-rich regions (13, 17).

Although several mucin genes appear to be expressed in the epithelium of each tissue examined (1, 18, 19, 20), there are marked differences in the pattern of expression and also in the relative expression of the genes within different cell types in a single tissue (1, 18, 20, 21, 22). Furthermore, changes in expression are observed in epithelial tumors and in non-neoplastic disease states (1). As a tool for understanding these changes and in order to obtain information about the regulation of these genes, we have been developing cell culture models (23, 24). We have isolated and characterized two subpopulations of the colon cancer cell line HT-29 by means of selection using 5-fluorouracil (FU) (24) and methotrexate (MTX) (23). HT29-FU cells express colonic-type mucins as judged by antibody analysis, while HT29-MTX cells express gastric-type mucins, a situation analogous to that of fetal colon (23, 24). HT29-MTX cells form a homogeneous population of mucus-secreting cells and thus potentially provide a good model for the study of mucin biosynthesis (25). We have shown that the MUC genes (MUC1-MUC6) are expressed to different extents in these cells (25, 26), but it is not known whether the major mucin expressed is the product of one of them.

In view of the gastric immunoreactivity of the mucins produced by these cells, we have used an antiserum raised against a gastric mucosal fraction containing native mucins to screen a cDNA library prepared from HT29-MTX cells. We have characterized a cDNA, and here we present evidence that it may be the 3′ end of the MUC5AC gene. We discuss the relationship of this clone to that recently described by Meerzaman et al. (17), which was isolated from a nasal polyp cDNA library.


MATERIALS AND METHODS

Cell Culture

HT29-MTX and HT29-FU cell populations resulted from adaptation of the HT-29 cell line to 10⁻⁶ M methotrexate (23) and 10⁻⁵ M 5-fluorouracil (24), respectively, and were used after 3 or 4 weekly passages in the absence of drug. These cells were maintained under the same culture conditions as described previously (23, 24).

Normal Tissues

A panel of 30 normal adult stomachs and colons from irreversibly brain-damaged organ donors was obtained with the help of France Transplant Association, according to protocols approved by the National Ethical Committee. Samples of mucosa were snap-frozen in liquid nitrogen and stored in liquid nitrogen until used.

Preparation of a Polyclonal Antiserum Raised against Normal Gastric Mucosa

The antiserum used was prepared 20 years ago by the same protocol used for the characterization of the so-called WZ polymorphic colon antigens (27, 28). Normal stomach and colon from the same blood group O donor were used. The stomach mucosa was scraped, homogenized in distilled water (Ultra-Turrax, Janke and Kunkel, Staufen, Germany), and centrifuged (1 h, 48,000 × g). The supernatant was heated for 1 h in a boiling water bath and further centrifuged (1 h, 48,000 × g). The resultant supernatant was dialyzed against distilled water for 2 days at 4 °C and lyophilized (crude gastric extract). Scraped colonic mucosa was homogenized in acetone, dried at room temperature, and then passed through a sieve (200-mesh) in order to obtain a fine dry powder. One rabbit (L56) was immunized with the crude gastric extract in Freund's complete adjuvant according to the protocol previously reported (27). A rabbit of the A phenotype (29) was chosen in order to avoid the presence of anti-blood group A antibodies. The immune serum was further absorbed with the colonic mucosal dry acetone powder to remove non-gastric-specific antibodies and stored in aliquots at -20 °C. The absorbed immune serum is referred to as L56/C.

Immunological Assays

Indirect immunofluorescence was performed on frozen cryostat sections of tissues and HT29-MTX cells as described earlier (23).

For ELISA, microtiter plates (Dynatech, Chantilly, VA) were coated with native or deglycosylated mucins isolated from the culture medium of HT29-MTX cells (20 µg/ml), KLH, and peptide-KLH conjugates (20 µg/ml in PBS). Native and deglycosylated MTX mucins were prepared as described elsewhere (21) . The peptides tested correspond to MUC1 (VTSAPDTRPAPGSTAPPAHG), MUC2 (PTTTPISTTTVTPTPTPTGTQT), MUC3 (HSTPSFTSSITTTETTS), MUC4 (TSSASTGHATPLPVTD), MUC5AC (TTSTTSAPTTS), and MUC6 (SFQTTTTYPTPSHPQTTLP and TSLHSHTSSTHHPEVPT) tandem repeat sequences. MUC5B was not included because it does not contain a conserved tandem repeat. The peptide synthesis and the coupling to KLH were performed as described previously (21) . Reactive sites were blocked with 1% gelatin for 1 h at 37 °C. Serial dilutions of the L56/C serum in PBS containing 1% BSA were incubated for 1 h at 37 °C in the antigen-coated wells. After washing with PBS-Tween, alkaline phosphatase-conjugated swine anti-rabbit IgG (Dako, Glostrup, Denmark) (0.5 µg/ml) was added for 1 h at 37 °C. Alkaline phosphatase activity was detected using 4-methylumbelliferyl-phosphate (Boehringer, Mannheim, Germany) at 1 mg/ml diluted in triethanolamine buffer, pH 9.5, and fluorescence was measured using a CytoFluor 2300 fluorescence measurement system (Millipore, Bedford, MA).

Preparation and Screening of a HT29-MTX cDNA Library

Poly(A) RNA was extracted from postconfluent HT29-MTX cells (day 21 after seeding) as described previously (23). A ZAP-cDNA library, kindly provided by E. Navarro and T. Adell (Institut Municipal d'Investigació Mèdica, Barcelona, Spain), was constructed using 5 µg of mRNA as template and amplified once according to the instructions of the ZAP-cDNA® synthesis kit (Stratagene, La Jolla, CA). The host strain was Escherichia coli XL1 blue. The library was plated in soft agar at a density of 30,000 plaque-forming units/150-mm plate. Plates were incubated at 37 °C until plaques began to appear, then overlaid with isopropyl-β-D-thiogalactopyranoside-saturated nitrocellulose membranes and incubated for an additional 3 h. The filters were removed and blocked with 3% BSA. For the immunoscreening, L56/C antibodies were used at a 1:50 dilution and preabsorbed for 1 h with 1% BSA and XL1 blue cells, lysed or killed by UV light (each XL1 blue preparation corresponding to the pellet of a 50-ml overnight culture for 50 ml of diluted serum). After primary incubation with absorbed L56/C serum for 3 h at room temperature and washing, filters were incubated with alkaline phosphatase-conjugated swine anti-rabbit IgG diluted 1:1000 (Dako). After washing, positive plaques were identified by detection of alkaline phosphatase activity with nitro blue tetrazolium/5-bromo-4-chloro-3-indolyl phosphate in alkaline phosphatase buffer (Promega, Madison, WI) and purified by successive rounds of screening. Using helper phage R408, the pBluescript® plasmids containing positive cDNAs were released from the ZAP phage and transferred into XL1 blue for storage and sequencing.

cDNA Sequencing

Clone L31 was sequenced in its entirety on both strands using the dideoxynucleotide chain termination method, while the other clones were partially sequenced. Sequencing was performed using double-stranded plasmid and a T7 DNA polymerase kit, according to the recommended procedure (Pharmacia, Saint Quentin en Yvelines, France). The primers used were the T3 sequence and a 20-mer oligonucleotide (5′-CCCAAAAGGGTCAGTGCTGC-3′) in pBluescript for the 5′ and 3′ end sequences, respectively, and 11 pairs of oligonucleotides (both sense and antisense) corresponding to positions 237-255, 464-484, 821-840, 1155-1173, 1413-1431, 1674-1693, 1906-1925, 2184-2203, 2459-2478, 2698-2717, and 2954-2973 in the L31 sequence. The oligonucleotides were synthesized on an Applied Biosystems Inc. PCR-mate or purchased from Bioprobe (Montreuil sous Bois, France). The sequencing difficulties due to the high (G+C) content of the insert were solved by the use of 7-deaza-GTP (Pharmacia). Enzymatic pauses were suppressed by using a lower quantity of DNA template (1 µg/reaction). Analysis of nucleotide and amino acid sequence data was performed using Bisance (30) and the Wisconsin GCG package on the HGMP-RC computer (31).

Chromosomal Localization

Fluorescent in situ hybridization to metaphase chromosomes was conducted as described previously (32, 33), using 200 ng of plasmid DNA from clone L31 in pBluescript biotinylated by nick translation.

Northern Blot Analysis

Total RNA was extracted from frozen samples of normal human tissues as described previously (34). RNA (15 µg) was denatured in loading buffer containing formamide and formaldehyde, fractionated by 1% agarose gel electrophoresis, and transferred onto nitrocellulose (Schleicher & Schuell, Dassel, Germany).

Poly(A) RNA isolation from HT29-MTX and HT29-FU cells, electrophoretic separation, and Northern blotting were as reported previously (25).

All membranes were prehybridized and hybridized according to previously used conditions (25). MUC2 was detected using SMUC41 (6), MUC5AC with JER58 (12), and MUC5B with JER57 (14). Actin was detected with cDNA probe pA2 (35).

Southern Blot Analysis

Southern blot analysis of human genomic DNAs using Hybond N filters (Amersham) was carried out following standard procedures, as recommended by the manufacturer. The final stringent wash was done with 0.1 × SSC at 65 °C. The relative sizes of the fragments were determined by comparison with Raoul markers (Appligene, Durham, United Kingdom), which were probed with ³²P-labeled pUC18 for their detection.


RESULTS

Reactivity of Antiserum L56/C

The antiserum L56/C was tested by indirect immunofluorescence on cryostat sections of the panel of normal gastric and colonic mucosa samples. The antiserum showed strong reactivity with mucus droplets of mucus-secreting cells in all normal gastric samples tested (Fig. 1a). In contrast, no reactivity was observed with colonic goblet cells (Fig. 1b). Mucus droplets in HT29-MTX cells showed strong reactivity (Fig. 1c).


Figure 1: Immunofluorescence detection of gastric mucins with the polyclonal antiserum L56/C. Indirect immunofluorescence staining, with L56/C antibodies, of ethanol-fixed cryostat sections of normal gastric mucosa (a), normal colon mucosa (b), and postconfluent HT29-MTX cells (day 21 after seeding) (c). Note that gastric and MTX mucus secretions are immunoreactive, whereas no mucus positivity is observed in colon. Bar = 100 µm.



Since these antibodies were not prepared against purified gastric mucins and the epitopes that they recognize had not been identified, reactivity of the antiserum against native and deglycosylated mucins isolated from HT29-MTX cells, and against the tandem repeat peptide-KLH conjugates (see "Materials and Methods"), was analyzed by ELISA. As shown in Fig. 2, the antibodies recognize both native and deglycosylated mucins purified from HT29-MTX cells. In contrast, they do not react with the synthetic peptides corresponding to the mucin tandem repeats. These results suggested that the antiserum L56/C contains antibodies that are reactive either with epitopes outside the tandem repeat domains of the MUC1-MUC6 apomucins or with a new apomucin.


Figure 2: Reactivity of the polyclonal antiserum L56/C with MTX mucins and peptide-KLH conjugates in ELISA assays. Native or deglycosylated MTX mucins, KLH, or peptide-KLH conjugates (MUC1, MUC2, MUC3, MUC4, MUC5AC, MUC6), each at 20 µg/ml, were coated onto plates and incubated with serial dilutions of antiserum. The basal reactivity level of L56/C with KLH was first subtracted from the MUC-KLH conjugate data. The assays were conducted in duplicate and on two occasions. The levels of reactivity of L56/C with the different MUC peptides were very similar, and only the MUC5AC result is represented in order to simplify the figure. No reactivity was detected with the synthetic peptides, whereas strong reactivity with native and deglycosylated mucins is noted.



Isolation and Sequencing of Mucin cDNAs

The antiserum L56/C was therefore used to screen approximately 300,000 recombinants of the HT29-MTX cDNA library, and 37 positive clones were isolated. Of these, 18 clones were related, as determined by hybridization analysis (data not shown). The size of the inserts varied from 300 to 3,310 bp. The sequence of the largest clone, named L31, was determined in its entirety on both strands and is given in Fig. 3. Other clones used in this study (L18, L10, L21, L34, and L17) were partially sequenced, manually or automatically, and correspond to the L31 sequence from positions 337, 790, 1216, 1279, and 1507 to the 3′ end, respectively. The L31 cDNA sequence is 3,310 bp long with a 70% (G+C) content and ends with a 22-bp poly(A) tail. Comparison with other mucin cDNA sequences showed that the L31 sequence is very similar (98.6% identity) to clone NP3a (nucleotide 273 to the 3′ end) isolated from a nasal polyp cDNA library by Meerzaman et al. (17). The comparison revealed 14 insertions, 3 deletions, 7 inversions, and 9 substitutions (presented in Fig. 3) all along the L31 sequence. Each deletion involved a single nucleotide, while insertions involved one to three nucleotides. These differences were confirmed on both strands by at least three sequencing reactions with dGTP or deaza-GTP. Some of the differences detected were assessed by digestion with restriction enzymes (Fig. 4). In each case the fragment number and sizes were as predicted from the L31 sequence.
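The two summary statistics quoted above, (G+C) content and percent identity, can be illustrated with a minimal Python sketch. The sequences are invented toy strings, not L31 or NP3a, and the identity definition used here (matching columns divided by all alignment columns, with gaps counted as mismatches) is an assumption; the programs used for the actual comparison may define identity differently.

```python
def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over two pre-aligned, equal-length strings.

    '-' marks a gap. Assumption: a column counts as a match only when
    both residues are identical and neither is a gap.
    """
    if not aligned_a or len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must be non-empty and equal in length")
    matches = sum(a == b and a != "-" for a, b in zip(aligned_a, aligned_b))
    return 100.0 * matches / len(aligned_a)


# Hypothetical toy data: one single-nucleotide gap in six columns.
toy_a = "ACGT-A"
toy_b = "ACGTTA"
print(round(gc_percent("GGCCAT"), 1))
print(round(percent_identity(toy_a, toy_b), 1))
```

With this definition, a handful of single-nucleotide insertions and deletions spread over more than 3,000 columns lowers the nucleotide identity only slightly, which is consistent with a high overall identity despite several frame-shifting differences.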


Figure 3: cDNA sequence of clone L31. Sequence differences from the NP3a clone (GenBank accession number U06711, 1993) (17) are shown: nucleotide substitutions or inversions (underlined twice), nucleotide insertions, and nucleotide deletions (▾). The stop codon (TGA) and the potential polyadenylation signal at the 3′ end of the clone are underlined once.




Figure 4: Analysis of L31 restriction fragments. A, restriction maps showing sites for five enzymes, AccIII, BamHI, BglI, MslI, and NaeI, within the EcoRI-KpnI L31 insert (3342 bp), were predicted from sequence analysis. Sites that are absent from the NP3a sequence and sites that are present in NP3a but not in L31 are marked with distinct symbols. The vertical bar corresponds to sites present in both sequences. B, the L31 insert (lane 2) was digested with BamHI (lane 3), BglI (lane 4), NaeI (lane 5), MslI (lane 7), and AccIII (lane 8). Fragments were separated on a 1.5% agarose gel and visualized by ethidium bromide staining. Their sizes were determined by comparison with BstEII (Biolabs, Beverly, MA) (lane 1) and the 100-base pair ladder (Pharmacia) (lane 6) molecular weight markers. Note that all the fragments have the expected size; partial digestion with NaeI is due to site preference (Biolabs).
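Predicting fragment sizes from a sequence, as done for the restriction maps, can be sketched in a few lines of Python. This is a minimal illustration for a linear molecule, not the software used in this work; the test sequence is invented, and the function name and parameters are hypothetical. The recognition sites shown (BamHI, G^GATCC; NaeI, GCC^GGC) are the standard ones for these enzymes.

```python
def digest(seq: str, site: str, cut_offset: int) -> list[int]:
    """Fragment sizes after complete digestion of a linear DNA molecule.

    site:       recognition sequence on the top strand, e.g. "GGATCC"
    cut_offset: number of bases of the site left on the upstream fragment
                (1 for BamHI G^GATCC, 3 for NaeI GCC^GGC)
    """
    seq = seq.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)  # step by 1 so overlapping sites are found
    bounds = [0] + cuts + [len(seq)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


# Hypothetical 26-nucleotide sequence carrying two BamHI sites.
toy = "AAAAGGATCCTTTTTTTTGGATCCCC"
print(digest(toy, "GGATCC", 1))   # BamHI
print(digest(toy, "GCCGGC", 3))   # NaeI: no site, so one uncut fragment
```

A sequence difference that creates or destroys a site changes the number of fragments, which is why a digest can distinguish between two nearly identical sequences.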



The open reading frame of L31 cDNA encodes a 1042-amino acid polypeptide rich in cysteine (9.2%) (Fig. 5). Comparative analysis of L31 and NP3a deduced amino acid sequences shows 83.5% identity (Fig. 5). Large identical regions (>98%) alternate with four smaller regions with a low level of identity (<12.6%). The regions that diverge result from nucleotide insertions and deletions that introduce several shifts in the reading frame and are shown in Fig. 5. There is one additional amino acid in each of the first two domains, and a 74-amino acid extension of the carboxyl-terminal region; the first in-frame stop codon (TGA) is located at position 3129-3131 because of the nucleotide insertions and deletions. The untranslated sequence of clone L31 is therefore shorter. Immunoblot analysis of bacterial lysates revealed that the electrophoretic mobilities of the fusion proteins expressed in XL1 blue by clones L31, L18, L10, L21, or L17 were in good agreement with that expected if the first stop codon is at position 3129-3131 (data not shown).
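The way a few single-nucleotide insertions and deletions can produce short divergent peptide regions between otherwise identical stretches can be illustrated with a toy example. The sketch below uses the standard genetic code and an invented 27-nucleotide sequence (not taken from L31 or NP3a): one inserted base shifts the reading frame, and a later single-base deletion restores it, so only the residues between the two changes differ.

```python
# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO))


def translate(seq: str) -> str:
    """Translate complete codons in frame 1; '*' marks a stop codon."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# Hypothetical reference sequence (9 codons).
reference = "ATGGCCTGCACCGGTTCCAAAGAACTG"

# Variant: insert one G after nucleotide 6, delete nucleotide 18 of the reference.
variant = reference[:6] + "G" + reference[6:17] + reference[18:]

print(translate(reference))
print(translate(variant))
```

In this toy case the two sequences differ by only two single-base changes, yet four of the nine residues change, while the flanking residues are untouched.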


Figure 5: Comparison of the deduced amino acid sequences of clones L31 and NP3a. The sequences are aligned with gaps inserted to give maximum identity. Four regions (underlined once) show low similarity with NP3a. Amino acids that are not identical in the conserved regions of L31 and NP3a are underlined twice. Two repetitive sequences, SSSVA and TTVGPTTVGS, are conserved in the two apomucins. Cysteine residues are marked with asterisks. The majority of potential N-glycosylation sites (▾) are present in both sequences. Stop codons are also marked.



The changes in reading frame led to notable differences in the deduced amino acid composition of non-conserved regions. In particular, there is an increased number of Cys residues all along the L31 sequence, 15 out of 16 being located in the non-conserved regions (Fig. 5). A number of other nucleotide changes were detected and in 6/10 cases led to amino acid substitutions in Arg codons. The deduced amino acid sequence of L31 contains 12 potential N-glycosylation sites, 10 of which are present in clone NP3a.
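Potential N-glycosylation sites are conventionally predicted from the sequon Asn-X-Ser/Thr, where X is any residue except Pro; whether exactly this rule was applied to L31 is an assumption here. A minimal Python sketch of such a scan, together with a cysteine-content calculation, on an invented peptide:

```python
import re


def n_glyc_sites(protein: str) -> list[int]:
    """1-based positions of Asn residues in N-X-S/T sequons (X is not Pro)."""
    # The lookahead keeps matches from consuming residues, so overlapping
    # sequons such as N-N-T-S are both reported.
    return [m.start() + 1 for m in re.finditer(r"N(?=[^P][ST])", protein.upper())]


def cys_percent(protein: str) -> float:
    """Percentage of cysteine residues in a peptide."""
    if not protein:
        raise ValueError("empty sequence")
    return 100.0 * protein.upper().count("C") / len(protein)


# Hypothetical peptide: N-G-T and N-K-S are sequons, N-P-S is not.
toy_peptide = "MNGTANPSNKSC"
print(n_glyc_sites(toy_peptide))
print(round(cys_percent(toy_peptide), 1))
```

For scale, a cysteine content of 9.2% in a 1042-residue peptide corresponds to roughly 96 cysteine residues.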

The TTVGPTTVGS tandem repeats described by Meerzaman et al. (17) and 4 degenerate SSSVA tandem repeats are present in both sequences (Fig. 5).

Sequence Similarities between L31 Peptide and Other Proteins

The deduced amino acid sequence of clone L31 also shows 36.3% identity to part of the carboxyl-terminal region of MUC2 apomucin. This region, which extends from amino acid 4277 to 5173, includes the Cys-rich D4 domain of MUC2. A lower level of identity (21.3%) is found with the pro vWF, from amino acid 1857 to 2654. Comparison of this domain in L31 and the four related D domains in MUC2 and in the pro vWF (8) reveals that a subregion at the start of each domain shows a particularly high degree of similarity across all the sequences. Fig. 6a shows the comparison and the derivation of a consensus sequence. This sequence is not present in NP3a. The L31 sequence has no significant similarity with the published sequences of MUC1, MUC3, MUC4, MUC5AC, MUC5B, MUC6, or MUC7 proteins. However, the first 35 amino acids of clone L31 show a very high degree of similarity to short amino acid sequences in the Cys-rich domains described recently in MUC5AC (13) and are also present three times in an HGM clone coding for a partial gastric mucin. The peptide encoded by clone NP3a also contained this short sequence. A similar sequence was also found twice in the amino-terminal part of MUC2 (Fig. 6b). From this sequence comparison, a consensus sequence was determined (Fig. 6b).


Figure 6: Comparison of L31 peptide sequences with other protein sequences. a, alignment of the NH2-terminal part of the D domains D1, D2, D3, and D4 of MUC2, and the pro vWF. Note that D1, D2, and D3 occur in the amino-terminal half of MUC2 and the pro vWF, and D4 is in the carboxyl-terminal half. b, alignment of a Cys-rich sequence present in L31 with similar sequences present in HGM (gastric mucin), JER47, JER62, and JUL32 (MUC5AC), and the amino-terminal region of MUC2. Two consensus sequences are deduced from these comparisons and correspond to a part of the consensus sequences previously described by Gum et al. (8) and Guyonnet et al. (13), respectively. Residues identical in more than 7 sequences or in 5 and 6 sequences are indicated with capital letters or lowercase letters, respectively.
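The two-level consensus notation used in the figure (capital letters for the most conserved positions, lowercase for less conserved ones) can be sketched as follows. This is a minimal Python illustration under stated assumptions: the alignment is an invented four-sequence example, the thresholds are passed as hypothetical parameters rather than taken from the figure, and positions below both thresholds are written as a dot.

```python
from collections import Counter


def consensus(aligned: list[str], upper_min: int, lower_min: int) -> str:
    """Column-wise consensus of pre-aligned sequences of equal length.

    A residue shared by at least `upper_min` sequences is written in
    upper case, by at least `lower_min` in lower case; otherwise '.'.
    Gap characters ('-') are ignored when counting.
    """
    if not aligned or len({len(s) for s in aligned}) != 1:
        raise ValueError("sequences must be non-empty and equal in length")
    out = []
    for column in zip(*aligned):
        counts = Counter(res.upper() for res in column if res != "-")
        if not counts:
            out.append(".")
            continue
        residue, n = counts.most_common(1)[0]
        if n >= upper_min:
            out.append(residue)
        elif n >= lower_min:
            out.append(residue.lower())
        else:
            out.append(".")
    return "".join(out)


# Hypothetical alignment of four short peptides.
toy_alignment = ["CAGT", "CAGS", "CTGA", "CAKV"]
print(consensus(toy_alignment, upper_min=4, lower_min=3))
```

Keeping the thresholds as arguments makes it easy to apply the same rule to alignments with different numbers of sequences.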



Chromosomal Localization

In situ hybridization using clone L31 was conducted in two different sets of experiments, and a total of at least 50 dividing cells were examined. Despite the fact that the signal was weak and not detectable on some chromatids, the hybridization was localized specifically and unambiguously to the most distal short arm band of chromosome 11, 11p15.5 (Fig. 7), where a cluster of mucin genes (MUC2, MUC5AC, MUC5B, and MUC6) is located (15, 36, 37).


Figure 7: Fluorescent in situ hybridization with clone L31 probe. This partial metaphase shows one signal per chromosome 11.



Expression Studies of L31 using Northern Blotting

Since the expression of mucin mRNAs differs markedly in HT29-MTX and HT29-FU cells and varies according to the stage of the culture, we used mRNA from these cells to compare the expression of L31 mRNA with that of MUC2, MUC5AC, and MUC5B mRNAs (Fig. 8A). Large transcripts (12 kb) were detected in both HT29-MTX and HT29-FU cells. A high level of L31 mRNA was found in postconfluent as compared with preconfluent HT29-MTX cells. No such increase in expression occurred in HT29-FU cells where the levels remained low in post-confluent cells. This differential and growth-related expression of L31 mRNA is similar to that of MUC5AC but greatly differs from that observed with MUC2 and MUC5B probes (Fig. 8) or with MUC6, which is poorly expressed in MTX cells (26) .


Figure 8: Northern blot analysis of HT-29 mucus-secreting cells. Poly(A) RNAs from HT29-MTX (MTX) and HT29-FU (FU) cells in relation to cell growth (7, 14, and 21 days after seeding) were hybridized with L31, MUC5AC, MUC5B, MUC2, and actin cDNAs. Note that the pattern of expression of the MUC5AC transcripts is very similar to that of L31 mRNA.



Large and polydisperse transcripts of L31 clone (from 12 kb to 1 kb) were observed in normal stomach (antrum and fundus) (Fig. 9), but not in stomach cancers (data not shown). No hybridization signal was detectable in samples from normal colon, duodenum, or gallbladder (Fig. 9) or from uterus, ovary, or thyroid (data not shown). The same pattern of tissue expression was obtained with JER58 probe (MUC5AC) (data not shown). MUC6, a mucin cDNA isolated from a gastric expression library (15) , was found to be strongly expressed in the stomach and in gallbladder, unlike L31 (data not shown).


Figure 9: Northern blot analysis of tissues. Total RNA was extracted from normal antrum (lanes 1 and 2), fundus (lanes 3 and 4), colon (lane 5), duodenum (lane 6), and gallbladder (lane 7). The same result was obtained with the L31 or MUC5AC cDNAs. The arrows correspond to the positions of the 28 S and 18 S RNAs.



Southern Blotting Analysis

Southern blot analysis was conducted on human genomic DNA from at least 4 individuals using a variety of restriction enzymes. In all cases the membranes were probed with L31 and the MUC5AC probe JER58. Particular attention was paid to enzymes that do not cut within the L31 sequence. L31 detected a single ScaI band of 9.5 kb, which was quite distinct from the single band of greater than 18 kb detected with JER58 (Fig. 10). In contrast, with EcoRI, HindIII, and XbaI the single large band detected in each case was of the same mobility with both probes and corresponded to fragments of 20-30 kb for EcoRI (Fig. 10) and approximately 20 kb for XbaI and HindIII (data not shown).


Figure 10: Southern blot analysis. Four human genomic DNAs were digested with EcoRI and four with ScaI. The same two membranes were probed with L31 and JER58 (MUC5AC) cDNAs. The same single EcoRI fragment is detected with both probes, whereas two distinct ScaI fragments hybridize with clone L31 and MUC5AC.




DISCUSSION

In previous studies (see Ref. 1 for review), the use of antisera raised against deglycosylated mucins to screen expression libraries has led to the preferential isolation of clones containing repetitive sequences, possibly due to signal amplification resulting from the peptide repeats or to their greater immunogenicity. The isolation of mucin sequences corresponding to the "unique" domains has required the secondary screening of libraries with cDNAs containing repetitive sequences, as for MUC5AC clones (13), or the use of PCR-amplified products encoding peptides identified through conventional biochemical techniques, as for nasal polyp mucin (17), or the use of RACE (rapid amplification of cDNA ends)-PCR and anchor-PCR, as for the MUC2 NH2 and COOH termini (7, 8). In this work, we have isolated "unique" mucin sequences using an antiserum prepared against a gastric mucosal fraction containing native mucins.

The nucleotide sequence of the largest clone, L31, showed a very high level of identity (98.6%) to clone NP3a, isolated recently from a nasal polyp library (17). Surprisingly, the deduced amino acid sequences of the peptides encoded by clones L31 and NP3a showed four short regions of very low similarity, due to changes in reading frame. Lower similarity was also observed between the L31 peptide and the carboxyl-terminal region of MUC2 (36.3%) and was maintained all along the L31 protein, including the four non-conserved regions mentioned above. A striking conservation of the number and position of the Cys residues was also found. These observations provide strong evidence that clone L31 forms the 3′ end of a mucin gene. The relationship to the NP3a sequence is much harder to evaluate, but it should be noted that cross-hybridization of the targets would be expected for all the hybridization procedures used here because of the 98.6% identity of the two sequences.

Meerzaman and colleagues (17) proposed that NP3a corresponds to the 3′ end of the MUC5 gene because, like one of the partial MUC5 cDNA clones, JER47, it contained the sequence of peptides TR-3A and TR-3B, which they had previously determined directly from tracheobronchial mucin protein fragments (38), and also because NP3a maps to chromosome 11. However, it has more recently been shown that there are two distinct MUC5 genes, temporarily designated MUC5AC and MUC5B (13). From the partial cDNA sequences, it is clear that cysteine-rich domains that contain the TR-3A and the TR-3B peptides occur several times within the MUC5AC gene and that these regions also show some similarity to regions of MUC2 and MUC5B, although the similarity with MUC5B is only at the nucleotide level (13). All three genes map to chromosome 11p15 (15, 36, 37), and we show here that clone L31 maps to the same region. The pattern of expression of the mRNA transcripts hybridizing with L31 in the HT-29 subpopulations and in normal and tumor tissues corresponds to that of MUC5AC and is distinct from that of MUC5B, MUC2, and MUC6. Southern blot analysis indicates that the JER58 (MUC5AC) and L31 sequences are located on the same genomic fragment (20 kb) produced by digestion with three different restriction enzymes. The combined results suggest strongly that clone L31 represents the 3′ end of the MUC5AC gene.

It is possible that some of the differences in sequence between L31 and NP3a are due to genetic polymorphism, but this seems unlikely to be the case for those differences that lead to changes in reading frame. The possibility that the two clones correspond to the 3′ ends of two adjacent highly similar MUC5AC genes also seems unlikely because this would mean that the 3′ ends of both genes must be located on the single 9.5-kb ScaI fragment while the tandem repeat regions were located on the same 20-kb fragment. The possibility that these differences result from rearrangement or splicing is thus currently being considered. The idea of alternative 3′ end splicing is also suggested by the report of another candidate 3′ end clone, JER51 (13). In this clone, the stop codon interrupts the TTSTTSAP repeat domain and yields a carboxyl-terminal peptide with a very different overall structure. Guyonnet Duperat et al. have proposed that different carboxyl termini might exist in the case of MUC5AC (13). However, JER51 could also be located internally in the transcript. More work is necessary to understand the relationship between the NP3a, L31, and JER51 clones.

Two short sequences of the deduced L31 peptide seem to represent motifs that are conserved in other mucins and mucin-related proteins. The first of these, located between positions 1 and 35 in L31, is found three times in MUC5AC partial cDNAs, twice in MUC2, and three times in the HGM clone (a potential MUC5AC cDNA that overlaps with JUL32 and the 5′ end of NP3a clones). In each case this sequence is located at junctions with TSP-rich domains (from 28 to 77 in L31) (7, 13). The second sequence, from amino acid 216 to 246, is not present in NP3a and shows a high level of similarity to part of the four D domains of MUC2 and the pro vWF. We have not found these two regions in any of the other human mucin cDNAs. This evidence of conservation suggests functional importance and also the possibility of intermolecular cross-linking. More sequence and structural information is required before any evolutionary models can be constructed.

It is important to emphasize that the L31 clone was isolated from a cancer cell library and that alterations of mucin expression have been described in tumor tissues (see Ref. 1 for review). A notable feature of HT29-MTX cells is their type of differentiation. Although this cell line is derived from a human colon tumor, it synthesizes and secretes mucins with gastric immunoreactivity (23, 25). HT29-MTX cells also express the brush border-associated proteins dipeptidylpeptidase IV, villin, and carcinoembryonic antigen (23). These features are similar to those observed for the colon of early gestational fetuses, which also expresses the same brush border-associated proteins (39) and mucins with gastric immunoreactivity (40). Few studies have been reported concerning the developmental expression of mucin genes. However, as in MTX cells, it seems that several mucin genes are expressed in the colon at 12 weeks of gestation (41). MUC5AC transcripts have been detected in 12-week-old fetal colon but were no longer detectable later. The decrease of MUC5AC expression in colon is consistent with the disappearance of mucins with gastric immunoreactivity during development (40). Another feature of MTX cells is that mucin mRNAs are detectable by Northern blot analysis as a single major large transcript (25), whereas large polydisperse transcripts are detected in tissue samples (1, 6, 11, 12). Although it has not been confirmed that the multiple transcripts result from alternative splicing, the presence of high levels of a single transcript in HT29-MTX cells should simplify studies on the mechanisms of regulation at the transcriptional level and facilitate the isolation of the complete cDNA corresponding to a single transcript. The 5′ ends of mucin genes have so far been difficult to obtain, in part due to the large size of mucin transcripts. The antiserum used here may facilitate the isolation of 5′ clones from a random primed cDNA library.


FOOTNOTES

*
This work was supported in part by the Association pour la Recherche sur le Cancer, Grants SAL90-0853 from Comisión Interministerial de Ciencia y Tecnología and 94/1228 from Fondo de Investigación Sanitaria (Madrid, Spain), with additional support from EUROGEM and NATO Grant 0789188. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

The nucleotide sequence(s) reported in this paper has been submitted to the GenBank/EMBL Data Bank with accession number(s) Z48314.

§
Recipient of a fellowship from the Association pour la Recherche sur le Cancer. To whom correspondence should be addressed. Tel.: 33-1-45-59-50-41; Fax: 33-1-46-77-02-33.

¶
Supported by a Human Genome Mapping Project studentship from the Medical Research Council.

The abbreviations used are: FU, 5-fluorouracil; MTX, methotrexate; pro vWF, pro von Willebrand factor; PCR, polymerase chain reaction; PBS, phosphate-buffered saline; ELISA, enzyme-linked immunosorbent assay; KLH, keyhole limpet hemocyanin; BSA, bovine serum albumin; bp, base pair(s); kb, kilobase pair(s).

J.-P. Aubert and N. Porchet, personal communication.

T. Lesuffleur, unpublished results.

Klomp, L. W. J., Van Rens, L., and Strous, G. J. (1995) Biochem. J., in press.


ACKNOWLEDGEMENTS

We thank Teresa Adell and Estanis Navarro for cDNA library construction; David Andreu, Carmen Bolós, and Marta Garrido for the preparation of synthetic peptide conjugates and deglycosylated MTX mucins; and Amy Brar-Rai and Wendy Pratt for the synthesis of some of the oligonucleotides and for technical assistance. We are also grateful to James Gum and Young Kim for the gift of the MUC2 probe, and to Nicole Porchet and Jean-Pierre Aubert for the MUC5AC and MUC5B probes.


REFERENCES
  1. Lesuffleur, T., Zweibaum, A., and Real, F. X. (1994) Crit. Rev. Oncol./Hematol. 17, 153-180
  2. Rose, M. C. (1992) Am. J. Physiol. 263, L413-L429
  3. Gendler, S. J., Lancaster, C. A., Taylor-Papadimitriou, J., Duhig, T., Peat, N., Burchell, J., Pemberton, L., Lalani, E.-N., and Wilson, D. (1990) J. Biol. Chem. 265, 15286-15293
  4. Lan, M. S., Batra, S. K., Qi, W., Metzgar, R. S., and Hollingsworth, M. A. (1990) J. Biol. Chem. 265, 15294-15299
  5. Ligtenberg, M. J. L., Vos, H. L., Gennissen, A. M. C., and Hilkens, J. (1990) J. Biol. Chem. 265, 5573-5578
  6. Gum, J. R., Byrd, J. C., Hicks, J. W., Toribara, N. W., Lamport, D. T. A., and Kim, Y. S. (1989) J. Biol. Chem. 264, 6480-6487
  7. Gum, J. R., Hicks, J. W., Toribara, N. W., Rothe, E. M., Lagace, R. E., and Kim, Y. S. (1992) J. Biol. Chem. 267, 21375-21383
  8. Gum, J. R. J., Hicks, J. W., Toribara, N. W., Siddiki, B., and Kim, Y. S. (1994) J. Biol. Chem. 269, 2440-2446
  9. Bobek, L. A., Tsai, H., Biesbrock, A. R., and Levine, M. J. (1993) J. Biol. Chem. 268, 20563-20569
  10. Gum, J. R., Hicks, J. W., Swallow, D. M., Lagace, R. L., Byrd, J. C., Lamport, D. T. A., Siddiki, B., and Kim, Y. S. (1990) Biochem. Biophys. Res. Commun. 171, 407-415
  11. Porchet, N., Van Cong, N., Dufosse, J., Audié, J. P., Guyonnet-Duperat, V., Gross, M. S., Denis, C., Degand, P., Bernheim, A., and Aubert, J. P. (1991) Biochem. Biophys. Res. Commun. 175, 414-422
  12. Aubert, J. P., Porchet, N., Crepin, M., Duterque-Coquillaud, M., Vergnes, G., Mazzuca, M., Debuire, B., Petitprez, D., and Degand, P. (1991) Am. J. Respir. Cell. Mol. Biol. 5, 178-185
  13. Guyonnet Duperat, V., Audié, J. P., Debailleul, V., Laine, A., Buisine, M. P., Zouitina-Galiegue, S., Pigny, P., Degand, P., Aubert, J. P., and Porchet, N. (1995) Biochem. J. 304, 211-219
  14. Dufosse, J., Porchet, N., Audié, J. P., Guyonnet-Duperat, V., Laine, A., Van-Seuningen, I., Marrakchi, S., Degand, P., and Aubert, J. P. (1993) Biochem. J. 293, 329-337
  15. Toribara, N. W., Roberton, A. M., Ho, S. B., Kuo, W. L., Gum, E., Hicks, J. W., Gum, J. R., Byrd, J. C., Siddiki, B., and Kim, Y. S. (1993) J. Biol. Chem. 268, 5879-5885
  16. Shelton-Inloes, B. B., Broze, G. J., Miletich, J. P., and Sadler, J. E. (1987) Biochem. Biophys. Res. Commun. 144, 657-665
  17. Meerzaman, D., Charles, P., Daskal, E., Polymeropoulos, M. H., Martin, B. M., and Rose, M. C. (1994) J. Biol. Chem. 269, 12932-12939
  18. Ho, S. B., Niehans, G. B., Lyftogt, C., Yan, P. S., Cherwitz, D. L., Gum, E. T., Dahiya, R., and Kim, Y. S. (1993) Cancer Res. 53, 641-651
  19. Audié, J. P., Janin, A., Porchet, N., Copin, M. C., Gosselin, B., and Aubert, J. P. (1993) J. Histochem. Cytochem. 41, 1479-1485
  20. Carrato, C., Balagué, C., De Bolós, C., Gonzalez, E., Gambs, G., Planas, J., Perini, J.-M., Andreu, D., and Real, F. X. (1994) Gastroenterology 107, 160-172
  21. Gambs, G., de Bolós, C., Andreu, D., Franc, C., Egea, G., and Real, F. X. (1993) Gastroenterology 104, 93-102
  22. Balagué, C., Gambs, G., Carrato, C., Porchet, N., Aubert, J.-P., Kim, Y. S., and Real, F. X. (1994) Gastroenterology 106, 1054-1061
  23. Lesuffleur, T., Barbat, A., Dussaulx, E., and Zweibaum, A. (1990) Cancer Res. 50, 6334-6343
  24. Lesuffleur, T., Kornovski, A., Luccioni, C., Muleris, M., Barbat, A., Beaumatin, J., Dussaulx, E., Dutrillaux, B., and Zweibaum, A. (1991) Int. J. Cancer 49, 721-730
  25. Lesuffleur, T., Porchet, N., Aubert, J. P., Swallow, D., Gum, J. R., Kim, Y. S., Real, F. X., and Zweibaum, A. (1993) J. Cell Sci. 106, 771-783
  26. Kitamura, H., Gum, J. R., Lee, B.-H., Siddiki, B., Toribara, N. W., Yonezawa, S., Ho, S. B., Lesuffleur, T., Zweibaum, A., and Kim, Y. S. (1994) Gastroenterology 106 (suppl.), A403 (abstr.)
  27. Zweibaum, A., Oriol, R., Dausset, J., Marcelli-Barge, A., Ropartz, C., and Lanset, S. (1975) Tissue Antigens 6, 121-128
  28. Oriol, R., Rousset, M., Zweibaum, A., Dalix, A.-M., Chevalier, G., Dussaulx, E., and Strecker, G. (1977) Immunology 32, 131-137
  29. Zweibaum, A., and Steudler, V. (1969) Nature 223, 84-86
  30. Dessen, P., Fondrat, C., Valencien, C., and Mugnier, C. (1990) Cabios 6, 355-356
  31. Risavy, F. R., Bishop, M. J., Gibbs, G. P., and Williams, G. W. (1992) Comput. Appl. Biosci. 8, 149-154
  32. Pinkel, D., Straume, T., and Gray, J. W. (1986) Proc. Natl. Acad. Sci. U. S. A. 83, 2934-2938
  33. Harvey, C. B., Fox, M. F., Jeggo, P. A., Mantei, N., Povey, S., and Swallow, D. M. (1993) Ann. Hum. Genet. 57, 179-185
  34. Chomczynski, P., and Sacchi, N. (1987) Anal. Biochem. 162, 156-159
  35. Cleveland, D. W., Lopata, M. A., McDonald, R. J., Cowan, N. J., Rutter, W. J., and Kirschner, M. W. (1980) Cell 20, 95-105
  36. Griffiths, B., Mathews, D. J., West, L., Attwood, J., Povey, S., Swallow, D. M., Gum, J. R., and Kim, Y. S. (1990) Ann. Hum. Genet. 54, 277-285
  37. Nguyen, V., Aubert, J. P., Gross, M. S., Porchet, N., Degand, P., and Frézal, J. (1990) Hum. Genet. 86, 167-172
  38. Rose, M. C., Kaufman, B., and Martin, B. M. (1989) J. Biol. Chem. 264, 8193-8199
  39. Zweibaum, A., Hauri, H. P., Sterchi, E., Chantret, I., Haffen, K., Bamat, J., and Sordat, B. (1984) Int. J. Cancer 34, 591-598
  40. Bara, J., Gautier, R., Daher, N., Zaghouani, H., and Decaens, C. (1986) Cancer Res. 46, 3983-3989
  41. Chambers, J. A., Hollingsworth, M. A., Trezise, A. E. O., and Harris, A. (1994) J. Cell Sci. 107, 413-424
