Evolution and Classification of Cystine Knot-Containing Hormones and Related Extracellular Signaling Molecules

Ursula A. Vitt, Sheau Y. Hsu and Aaron J. W. Hsueh

Division of Reproductive Biology, Department of Gynecology and Obstetrics, Stanford University School of Medicine, Stanford, California 94305-5317


    ABSTRACT
 
The cystine knot three-dimensional structure is found in many extracellular molecules and is conserved among divergent species. Proteins with a cystine knot structure are difficult to identify with commonly used pairwise alignments because the sequence homology among them is low. Taking advantage of complete genome sequences in diverse organisms, we used a complementary approach of pattern searches and pairwise alignments to screen the predicted protein sequences of five model species (human, fly, worm, slime mold, and yeast) and retrieved proteins with low sequence homology but containing a typical cystine knot signature. Sequence comparison between proteins known to have a cystine knot three-dimensional structure (transforming growth factor-β, glycoprotein hormone, and platelet-derived growth factor subfamily members) identified new crucial amino acid residues (two hydrophilic amino acid residues flanking cysteine 5 of the cystine knot). In addition to the well known members of the cystine knot superfamily, novel subfamilies of proteins (mucins, Norrie disease protein, von Willebrand factor, bone morphogenetic protein antagonists, and slit-like proteins) were identified as putative cystine knot-containing proteins. Phylogenetic analysis revealed the ancient evolution of these proteins and the relationship between hormones [e.g. transforming growth factor-β (TGFβ)] and extracellular matrix proteins (e.g. mucins). Cystine knot proteins are absent in the unicellular yeast genome but present in nematode, fly, and higher species, indicating that the cystine knot structure evolved in extracellular signaling molecules of multicellular organisms. All data retrieved by this study can be viewed at http://hormone.stanford.edu/.


    STRUCTURAL FEATURES OF KNOWN 10-MEMBERED CYSTINE KNOT PROTEINS
 
Cysteine residues in amino acid chains are essential for disulfide bonding and loop formation to establish functional motifs in the tertiary structure of different proteins. A consensus cysteine framework important for a homologous folding motif was originally identified in mammalian endothelin and insect-derived neurotoxins (1, 2). This typical framework consists of four cysteine residues with a cysteine spacing of Cys-(X)3-Cys and Cys-X-Cys, important for a ring structure formed by eight amino acids. This ring structure is conserved in many different groups of proteins including the cystine knot superfamily (3). The cystine knot family members contain two additional cysteines that form a third disulfide bond that penetrates the ring structure, thus forming a cystine knot with 10 amino acids of which six are cysteine residues (4). The intrusion of the additional disulfide bond through the cystine ring confines the amino acid residue between the second and third cysteine to a glycine, as any other amino acid at this position would imply severe steric hindrance for the formation of the knot (5, 6). Thus, the consensus sequence for the 10-membered cystine knot structure is:

Cys1-(X)n-Cys2-X-Gly-X-Cys3-(X)n-Cys4-(X)n-Cys5-X-Cys6

Cysteines 2, 3, 5, and 6 form a ring by disulfide bonding between cysteines 2 and 5 as well as 3 and 6. The third disulfide bond, formed by cysteines 1 and 4, penetrates the ring, thus forming a knot (Fig. 1).
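
The pairing and the residue counts given above can be checked with a few lines of Python. This is only an illustrative sketch: the names `RING_PAIRS`, `PENETRATING_PAIR`, `ring_size`, and `knot_size` are hypothetical and do not come from the study.

```python
# Disulfide pairing of the six knot-forming cysteines (numbered 1-6).
RING_PAIRS = [(2, 5), (3, 6)]   # these two bonds close the ring
PENETRATING_PAIR = (1, 4)       # this bond passes through the ring

# Backbone segments that make up the ring, end cysteines included.
SEGMENT_2_TO_3 = "CXGXC"        # Cys2-X-Gly-X-Cys3
SEGMENT_5_TO_6 = "CXC"          # Cys5-X-Cys6


def ring_size() -> int:
    """Number of amino acids in the ring closed by the 2-5 and 3-6 bonds."""
    return len(SEGMENT_2_TO_3) + len(SEGMENT_5_TO_6)


def knot_size() -> int:
    """Ring residues plus the two cysteines of the penetrating bond."""
    return ring_size() + len(PENETRATING_PAIR)
```

Under these definitions the ring contains eight residues and the knot ten, six of which are cysteines, in agreement with the description in the text.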



 
Figure 1. Schematic Drawing of the 10-Membered Cystine Knot Structure

Arrows indicate the direction (N to C terminal) of the amino acid chain. SS indicates disulfide bonds. The six cysteines involved in knot formation are numbered consecutively and their spacing is given in the lower panel. Cysteines 2 and 3 form disulfide bonds with cysteines 5 and 6, respectively, thus forming a ring. The ring is penetrated by the third disulfide bond formed between cysteines 1 and 4. The amino acid chains between cysteines 1 and 2 and between 4 and 5 typically form finger-like projections, whereas the segment between cysteines 3 and 4 forms an α-helical structure and is designated as a heel. In some of the known cystine knot proteins an additional cysteine is located in front of cysteine 4, which was found to be essential for covalent dimer formation.

 
The three-dimensional structure common to the members of the cystine knot superfamily is responsible for features shared by this family. To date, all known 10-membered cystine knot proteins have been found to be extracellular proteins, interacting with specific receptors and/or other extracellular proteins. The cystine knot is formed intracellularly and prevents the formation of a globular protein structure commonly found for other extracellular peptides. The cystine knot folding of these proteins leads to the formation of three distinct domains. Two of them, between cysteines 1 and 2, as well as 4 and 5, consist of antiparallel β-strands that form finger-like projections, whereas the third domain between cysteines 3 and 4 usually contains an α-helical structure (7, 8). Due to the arrangement of these domains, the three-dimensional structure of these proteins has been referred to as a "hand" containing two fingers and a middle heel. Therefore, the cystine knot forces the protein to adopt a three-dimensional arrangement that, in part, exposes hydrophobic residues to the aqueous surrounding. These hydrophobic residues lead to the formation of homo- or heterodimers that have been described for selected members of all subfamilies (6, 7, 9). Some members of the cystine knot superfamily have, in front of cysteine 4, an additional cysteine residue that strengthens dimerization by forming a covalent disulfide bridge between the two subunits of the dimer (10, 11).

The largest 10-membered cystine knot subfamily is the TGFβ family that consists of transforming growth factors, bone morphogenetic proteins (BMP), growth differentiation factors, inhibins, and Müllerian inhibiting substance (12, 13, 14, 15, 16). Most of these hormones play essential roles during early embryonic development and, in adult life, have diverse physiological roles in cell cycle regulation, modification of the extracellular matrix, and the regulation of other growth factors (16, 17, 18, 19, 20). All members of the TGFβ subfamily have a stop codon at the second position after the sixth cysteine of the cystine knot. Other subfamilies, such as platelet-derived growth factors (PDGFs) and glycoprotein hormones (GPHs), display an identical cystine knot arrangement, but have a longer sequence after the sixth cysteine (21, 22). Several other proteins with low homology to the characterized cystine knot proteins have been suggested to display a cystine knot structure. These include BMP antagonists (23), the Norrie disease gene product (NDP) (24), as well as the von Willebrand factor (vWf) (25). In addition to the well known 10-membered cystine knot structure, the neurotrophic growth factor family has been described to adopt a similar cystine knot arrangement even though nine and not three residues are present between the second and the third cysteine, leading to the formation of a 16-membered cystine knot (6).


    BIOINFORMATIC APPROACH TO TRACE THE EVOLUTION OF 10-MEMBERED CYSTINE KNOT PROTEINS IN MODEL ORGANISMS
 
Several members of the TGFβ subfamily have been found to have homologs in nematode and fly (26, 27, 28). However, no systematic analysis of the evolution of all TGFβ-like, 10-membered cystine knot proteins has been made. With the completion of the sequencing of the human genome, together with that of the nematode, Caenorhabditis elegans, and the fly, Drosophila melanogaster, the protein sequences from organisms belonging to three different animal phyla (arthropods, nematodes, and chordates) can now be compared, which provides an evolutionary perspective on this group of proteins. Studies of sequence homology among these species could reveal the position of crucial amino acid residues conserved in all orthologs. Furthermore, the identification of paralogs and orthologs that share a common ancestor assists in the unraveling of ligand-receptor relationships and gene functions (29).

The proteins of the 10-membered cystine knot superfamily differ in size and contain additional motifs, making their identification difficult by traditional search methods such as pairwise alignments based on BLAST tools (30). In this minireview, we present the results of a pattern-search approach that exhaustively screened the sequenced human, fly, and nematode genomes to find all potential cystine knot-containing proteins in the predicted protein sequences. Furthermore, the complete genome of the yeast, Saccharomyces cerevisiae, and available sequences of the slime mold, Dictyostelium discoideum, were screened. We attempted to disclose the 10-membered cystine knot signature in different proteins and revealed evolutionary relationships of cystine knot-containing proteins from different species. Several crucial amino acid residues in the cystine knot signature previously not considered necessary for cystine knot formation were also identified. The present approach allows the classification of subfamilies of the cystine knot superfamily and the discovery of novel homologs in worm and fly for some of the known subfamilies. All data obtained in this study are searchable at the Cystine Knot Protein Database [http://hormone.stanford.edu/cystine-knot].


    PROTEINS WITH A TYPICAL CYSTINE KNOT SIGNATURE
 
More than 100,000 open reading frames from five model organisms (yeast, slime mold, nematode, fly, and human) were used to identify proteins with a typical cystine knot signature (Fig. 2, step 1). Each of these gene sets was searched for a typical cystine knot signature using a regular expression search. The regular expression used to identify the cystine knot signature was ‘C(X... . )CXGXC(X... . )C(X... . )CXC’. This placement of the six conserved knot-forming cysteines corresponds to the 10-membered cystine knot structure of the TGFβ, GPH, and PDGF subfamilies as described by Sun and Davies (6). Because the structurally similar cystine knot of the nerve growth factor subfamily has a 16-membered knot structure, it is not closely related to the other subfamilies and was excluded from the present study. The TGFβ subgroup is the largest subfamily, consisting of a wide variety of members that all have a stop codon at the second position after the sixth cysteine. Therefore, this subfamily was extracted from the collected sets of data using a second regular expression (‘CXCXstop’). Redundant sequences were excluded manually during analysis.
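
The two-step pattern search described above might look roughly as follows in Python. This is a sketch under stated assumptions rather than the code used in the study: the permitted gap lengths are not recoverable from the text, so `MIN_GAP` and `MAX_GAP` are placeholder values, and all function names are hypothetical. A lazy regular expression reports the first arrangement that fits, which is not necessarily the biologically relevant one.

```python
import re

# ASSUMPTION: the allowed gap lengths are not given in the text; these bounds
# are illustrative placeholders, not the values used in the study.
MIN_GAP, MAX_GAP = 1, 100


def knot_pattern(min_gap: int = MIN_GAP, max_gap: int = MAX_GAP) -> re.Pattern:
    """Build C-(X)n-C-X-G-X-C-(X)n-C-(X)n-C-X-C as a regular expression."""
    gap = rf"[A-Z]{{{min_gap},{max_gap}}}?"  # lazy run of arbitrary residues
    return re.compile(rf"C{gap}C[A-Z]G[A-Z]C{gap}C{gap}C[A-Z]C")


KNOT = knot_pattern()
# 'CXCXstop': Cys-X-Cys followed by exactly one residue, then the sequence end.
TGFB_TAIL = re.compile(r"C[A-Z]C[A-Z]$")


def has_knot_signature(seq: str) -> bool:
    """True if the protein sequence contains the six-cysteine signature."""
    return KNOT.search(seq) is not None


def is_tgfb_like(seq: str) -> bool:
    """True if the signature is present and the sequence ends with CXCX."""
    return has_knot_signature(seq) and TGFB_TAIL.search(seq) is not None
```

For example, the toy sequence `MKCAAAACAGACAAAACAAAACACA` satisfies both functions, whereas the same sequence extended by three residues still carries the signature but no longer ends in the TGFβ-like tail.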



 
Figure 2. Flow Diagram of the Methodological Steps Used

Step 1: The complete sets of predicted protein sequences from Saccharomyces cerevisiae, Caenorhabditis elegans, and Drosophila melanogaster were downloaded from GenBank (http://www.ncbi.nlm.nih.gov/), the Sanger Center (http://www.sanger.ac.uk/Projects/Celegans/), and Berkeley Flybase (http://www.fruitfly.org/), and contained 4,821 yeast, 22,942 worm, and 14,080 fly protein sequences, respectively. The human protein sequences of 82,344 genes (available in August, 2000) were downloaded from GenBank. Furthermore, all available sequences (1,672) of the slime mold, Dictyostelium discoideum, were downloaded from GenBank. Step 2: Numbers of retrieved proteins with a cystine knot signature after the pattern search approach. Step 3: Analysis of the characteristics of the cystine knot-containing proteins and their conservation among paralogs and orthologs. The occurrence of a SP was tested by using a signal peptide server (http://www.cbs.dtu.dk/services/SignalP/#submission). The potential for secretion of both SP-positive and SP-negative sequences was verified using PubMed citations whenever possible. This approach revealed approximately 3% false negative predictions of SP in genes that were known to encode secreted proteins, and no false positives. The sequences were analyzed for transmembrane regions using publicly available servers (http://www.cbs.dtu.dk/services/TMHMM-1.0/ and http://sosui.proteome.bio.tuat.ac.jp/cgi-bin/sosui.cgi?/sosuisubmit.html) and verified using PubMed citations whenever possible. Analyses of protein size and cystine knot size as well as screening for EGF-like motifs (PROSITE: http://expasy.cbr.nrc.ca/, PS00022 and PS01186) were done using Python subroutines. In addition, the retrieved proteins were grouped according to the number of amino acid residues between cysteines 2 and 6 and analyzed for the conservation of cysteine residues not involved in knot formation. 
Furthermore, the amino acid residues surrounding the cystine knot-forming cysteines were analyzed for conservation among the known cystine knot proteins (TGFβ, GPH, and PDGF family members) from diverse species and compared with the corresponding residues in the novel potential cystine knot proteins. To group the genes into subfamilies, the human genes containing the cystine knot signature were aligned using the BLAST server. Genes with more than 40% positives over at least 60% of the sequence length were considered potential paralogs. The function and structure of these genes were further verified using PubMed records and the Protein Data Bank, whenever available.
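
The paralog criterion stated in the legend (more than 40% positives over at least 60% of the sequence length) can be written as a simple filter. The sketch below is illustrative only. It assumes that "sequence length" refers to the query sequence and that the percentage of positives is computed over the aligned region; neither detail is specified in the text, and the function name is hypothetical.

```python
def is_potential_paralog(positives: int, aligned_length: int, query_length: int) -> bool:
    """Apply the >40% positives over >=60% of the sequence length rule.

    ASSUMPTIONS: positives are counted within the aligned region, and
    coverage is measured relative to the query sequence length.
    """
    if aligned_length <= 0 or query_length <= 0:
        return False
    percent_positives = 100.0 * positives / aligned_length
    percent_coverage = 100.0 * aligned_length / query_length
    return percent_positives > 40.0 and percent_coverage >= 60.0
```

With these assumptions, an alignment of 100 residues with 50 positives against a 150-residue query passes, whereas one with exactly 40 positives does not, because the threshold on positives is strict.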

 
The predicted protein sequences with cystine knot motifs were further analyzed based on the protein size, the size of the cystine knot, the presence of a signal peptide for secretion (SP) and/or transmembrane helices. Of the 134 proteins with a typical cystine knot signature found in humans, 84 had a signal peptide for secretion. Among them, 36% had a stop codon at the second position after the sixth cysteine and were considered to be members of the TGFβ subfamily. In the fly and the worm, 38% and 68% of the selected proteins had a signal peptide for secretion, but only 30% and 8% of them belonged to the TGFβ subfamily, respectively. The remaining set of human proteins with a predicted signal peptide contained, as expected, the GPH (n = 5) and PDGF subfamily members (n = 8) (Fig. 2 and Table 1).


 
Table 1. List of All Known and Putative 10-Membered Cystine Knot Proteins

 
In addition to the well known cystine knot proteins mentioned above, several human mucins, vWf, NDP, and several BMP antagonists (Gremlin, DAN, and Cerberus) displayed the typical cystine knot signature. Furthermore, the present search identified 31 novel potential cystine knot proteins that had either one or two epidermal growth factor (EGF)-like signatures. Other extracellular proteins retrieved included tenascins, attractins, zonadhesins, thyroglobulins, insulin-like growth factor binding proteins (IGFBPs), and many more. The subset of proteins without a signal peptide for secretion contained the parkin gene, as well as several enzymes and ion-carrying proteins, such as arylsulfatase A, topoisomerase III-α, and the iron-responsive element-binding protein 2.

Because the algorithms used for protein sequence prediction from genomic sequences have been shown to predict only up to 70% of the correct protein sequences (31), one cannot exclude the possibility that additional cystine knot-containing members remain undetected. By direct search of the genomic database we have indeed identified potential fly PDGF and GPH homologs that are not present in the set of predicted proteins.


    FROM CYSTINE KNOT SIGNATURE TO STRUCTURE: PROTEINS WITH A POTENTIAL THREE-DIMENSIONAL CYSTINE KNOT STRUCTURE
 
The cystine knot signature is crucial for the formation of the three-dimensional protein structure and could influence the function of other motifs (such as ligand binding or signaling) in the same protein. Therefore, it is likely to be highly conserved in paralogous proteins in the same species and in orthologs of diverse species. Because the presence of the cystine knot signature does not necessarily lead to the formation of a three-dimensional cystine knot structure, three criteria were used to confirm the cystine knot structure and exclude false positives: 1) The candidate protein belongs to a subfamily with similar function, and the cystine knot arrangement is conserved in regions of high sequence similarity; 2) the human, fly, or nematode orthologs have conserved cystine knot signatures; and 3) a putative transmembrane region does not overlap with the cystine knot signature as this could disrupt cystine knot formation.
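
The three criteria can be recorded per candidate in a small data structure. The following is a minimal sketch, not the procedure used in the study: the class and field names are hypothetical, spans are assumed to be half-open residue index ranges, and the code deliberately reports each criterion separately rather than combining them into a single verdict, since the text does not state how the criteria were weighed against each other.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Span = Tuple[int, int]  # ASSUMPTION: half-open [start, end) residue indices


@dataclass
class Candidate:
    knot_span: Span                      # region covered by the signature
    conserved_in_subfamily: bool         # criterion 1
    conserved_in_orthologs: bool         # criterion 2
    transmembrane_spans: List[Span] = field(default_factory=list)


def overlaps(a: Span, b: Span) -> bool:
    """True if two half-open spans share at least one residue."""
    return a[0] < b[1] and b[0] < a[1]


def evaluate(c: Candidate) -> Dict[str, bool]:
    """Report which of the three criteria a candidate satisfies."""
    clear_of_tm = not any(overlaps(c.knot_span, tm) for tm in c.transmembrane_spans)
    return {
        "subfamily_conservation": c.conserved_in_subfamily,
        "ortholog_conservation": c.conserved_in_orthologs,
        "knot_clear_of_transmembrane": clear_of_tm,
    }
```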

The cystine knot family members have conserved cystine knot signatures but show low homology of their overall amino acid sequences. As a result, some of the potential homologs could not be found based on pairwise alignments alone. Therefore, an additional parallel approach was applied to the sequences found to have the 10-membered cystine knot signature. The cystine knot was analyzed according to the length of the amino acid chain between cysteines 2 and 6. The sequences were subsequently grouped and aligned according to the size of this signature and the position of additional cysteine residues. Analysis of the cystine knot size revealed that proteins in each subgroup had a similarly sized cystine knot motif.
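
Grouping sequences by the distance between cysteines 2 and 6 can be sketched as follows. The code is illustrative: the gap bounds in the pattern are placeholder values, the length is counted as the number of residues strictly between the two cysteines (the text does not say whether the cysteines themselves are included), and the function names are hypothetical.

```python
import re
from collections import defaultdict
from typing import Dict, List, Optional

# ASSUMPTION: gap bounds of 1-100 residues are placeholders. The two capture
# groups mark cysteine 2 and cysteine 6 of the signature.
KNOT = re.compile(
    r"C[A-Z]{1,100}?(C)[A-Z]G[A-Z]C[A-Z]{1,100}?C[A-Z]{1,100}?C[A-Z](C)"
)


def cys2_to_cys6_length(seq: str) -> Optional[int]:
    """Residues strictly between cysteines 2 and 6, or None if no signature."""
    m = KNOT.search(seq)
    if m is None:
        return None
    return m.start(2) - m.start(1) - 1


def group_by_knot_length(seqs: Dict[str, str]) -> Dict[int, List[str]]:
    """Map each observed Cys2-Cys6 length to the names of matching sequences."""
    groups: Dict[int, List[str]] = defaultdict(list)
    for name, seq in seqs.items():
        length = cys2_to_cys6_length(seq)
        if length is not None:
            groups[length].append(name)
    return dict(groups)
```

In practice one would probably bin similar lengths together rather than require exact equality; exact grouping is used here only to keep the sketch short.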

As expected, the TGFβ, GPH, and PDGF subfamily members showed a conservation of the cystine knot signature among family members in the three model organisms (nematode, fly, and human) analyzed (Fig. 3A, subgroups 1–3 and Table 1). In addition to these proteins known to contain a cystine knot structure, additional human proteins, which were proposed to have a potential cystine knot structure, were also found. These include proteins of the mucin-like, slit-like, and jagged-like subgroups. The mucin-like subgroup includes the NDP as well as several BMP antagonists, mucin-related proteins, and vWf.



 
Figure 3. Alignments of Identified Subgroups of Cystine Knot Orthologs

The cysteines involved in cystine knot formation are shown as white letters on black background. The consensus (cons.) sequence for all proteins in each subgroup is given below each alignment. The stars at the amino acid residues immediately flanking cysteine 5 indicate the conserved hydrophilic residues in all known cystine knot proteins. Stop codons are indicated by dollar signs. Species: w, worm; f, fly; h, human. A, Known cystine knot proteins are divided into three subgroups (TGFβ, GPH, and PDGF subfamilies). Although several TGFβ members from a given species are very similar to each other, they only share the cysteines as conserved residues when all sequences are included in the alignments. B, Putative cystine knot proteins: Subgroup 4 containing proteins described in the literature as potential cystine knot candidates and subgroup 5 containing novel cystine knot proteins. Subgroup 4 includes the three paralogs of BMP antagonists found only in vertebrates as well as the human NDP, the mucin-like genes, and the human vWf with its fly ortholog hemolectin (hml). This subgroup of proteins shows an additional conserved spacing of three cysteines in a stretch of 18 amino acid residues that is not found in the other subgroups (boxed). Subgroup 5: The worm homolog of the slit-like proteins contains a stretch of 15 additional amino acids inserted between cysteines 4 and 5 that is missing in the fly and human orthologs (boxed). C, Probable cystine knot proteins (subgroup 6). In contrast to the other subgroups, the jagged-like proteins have up to 10 additional cysteine residues among those potentially forming a cystine knot, thus rendering the analysis of these proteins difficult. Note that the amino acid residues between cysteines 5 and 6 in fly and human jagged orthologs are hydrophobic (I, L), which is different from the preference of less hydrophobic residues at this position in the members of other subgroups. 
In addition, the sixth cysteine is followed by a proline, which is not found in any of the other subgroups.

 
Mucin-Like Subgroup
In addition to the cystine knot motif, the human mucins displayed a conserved CXXCX{13}C signature at a comparable position (Fig. 3B, gray shading), thus representing a new common functional motif identified here. This motif was also present in vWf, NDP, and the BMP antagonists, but not in the subgroups of the TGFβ, GPH, and PDGF subfamily members (Fig. 3A). Due to their similar structure, the former were grouped together in subgroup 4 as mucin-like proteins (Fig. 3B). Furthermore, the human mucin genes aligned well with two smaller fly proteins of unknown function and showed conservation of the cystine knot signature with them. The additional cysteines unique to this subgroup of proteins are probably involved in intrachain folds of the subunits. As a result, these proteins likely display loops and exposed residues different from those of the TGFβ, GPH, and PDGF subgroups.
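
The CXXCX{13}C signature translates directly into a regular expression spanning 18 residues. The sketch below is a hypothetical helper, not code from the study. It uses a lookahead so that overlapping occurrences in cysteine-rich regions are all reported.

```python
import re
from typing import List

# C, two residues, C, thirteen residues, C: 18 residues in total.
# The lookahead lets overlapping occurrences be found as well.
MUCIN_MOTIF = re.compile(r"(?=(C[A-Z]{2}C[A-Z]{13}C))")


def find_motif_starts(seq: str) -> List[int]:
    """Return the 0-based start index of every CXXCX{13}C occurrence."""
    return [m.start() for m in MUCIN_MOTIF.finditer(seq)]
```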

Mucins were originally described as the glycoprotein components of epithelial mucus secretions and defined by their high content of O-linked oligosaccharides (32). Based on their functions, mucins can be grouped into the membrane mucins, gel-forming mucins, and small, soluble mucins. Four human gel-forming mucins were identified as cystine knot proteins, with a high similarity to vWf and NDP as well as the BMP antagonists. The C-terminal cystine knot in these mucins has been described to be involved in the formation of covalently linked dimers (33) and multimeric insoluble gels through cross-linking of cysteine-rich domains in their C- and N-termini (34, 35).

Three BMP antagonists belonging to the mucin-like subgroup were identified only in vertebrates. No receptor for this subgroup is known, consistent with the hypothesis that these proteins inhibit BMP signaling by binding to the TGFβ subfamily ligands (36, 37). Thus, the additional consensus cysteine motif in these proteins could be important for BMP binding and/or inhibition of signaling. The similarity between mucins and other members of this subgroup (NDP, vWf, and BMP antagonists) and the potential of dimer formation in all these proteins have been postulated (25, 38, 39, 40, 41).

The homology between NDP and other mucin-like proteins indicates that NDP could be a component of the extracellular matrix. Its mutations in the human, causing congenital blindness and often sensorineural hearing loss and mental retardation, might be due to a disturbance in the extracellular matrix composition and the resultant disruption of cell-cell interaction. This is underscored by the phenotype of NDP mutant mice, which show snowflake-like opacities within the vitreous (42). Alternatively, NDP might act as a functional antagonist to other cystine knot superfamily signaling ligands.

Another member of the mucin-like subgroup that was identified is the vWf. This factor acts as a blood-clotting agent by propagating agglutination of platelets and their adhesion to the vessel surface. Its function is highly dependent on the formation of complex multimers. The carbohydrate portion of this glycoprotein seems to be essential for its interaction with the platelets and the vessel wall (43). The cystine knot region might be essential for dimerization of these proteins, thus stabilizing the formation of multimers (25). Our study identified hml as the fly homolog of vWf.

Slit-Like and Jagged-Like Subgroups
Two additional novel subgroups, previously not known to form a cystine knot, were identified as potential candidates for forming a cystine knot structure. These include the human homologs of the fly slit protein and the jagged proteins. These proteins contain EGF signatures and are cysteine rich, rendering the analysis of their potential for forming a cystine knot structure difficult. The slit-like proteins are characterized by a stop codon shortly after the sixth cysteine and only two members were found in the human. The two human jagged proteins contain several additional conserved cysteines and other residues (Fig. 3, B and C).


    PROTEINS EXCLUDED TO HAVE A POTENTIAL CYSTINE KNOT STRUCTURE
 
Several proteins with a cystine knot signature were not included as cystine knot-containing proteins because they showed a high similarity to paralogs without conservation of the cysteine arrangement. In the IGFBP superfamily, IGFBP-6 had a cystine knot signature, but cysteines 2 and 3 were not conserved among family members (Fig. 4A, black shading). However, several other adjacent cysteine residues were highly conserved and the potential of forming a cystine knot structure, different from that of the TGFβ superfamily, cannot be ruled out (gray shading).



 
Figure 4. Proteins with a Cystine Knot Signature but Unlikely to Contain a Cystine Knot Structure

A, Exclusion of proteins as potential cystine knot superfamily members due to a lack of cysteine conservation among paralogs. Although IGFBP-6 has a characteristic cysteine arrangement, this signature is not conserved in its paralogs, IGFBP-5 or IGFBP-3. The putative cysteines 2 and 3 involved in the 10-membered cystine knot formation of IGFBP-6 are shown as white letters on a black background. However, a different type of cystine knot might be formed by these proteins as shown by the gray shading of conserved cysteine residues among these paralogs. B, Exclusion of proteins as potential cystine knot superfamily members due to a lack of cysteine conservation among orthologs. Alignment of the human parkin protein and its fly ortholog. The typical cystine knot signature found in the human parkin protein is not conserved in its otherwise highly homologous fly ortholog as the central glycine residue between cysteines 2 and 3 (boxed area) is missing in the fly protein.

 
Multiple other proteins with a cystine knot signature could be excluded as potential cystine knot-forming proteins because their fly and/or worm orthologs did not show conservation of the cystine knot signature (e.g. human parkin, Fig. 4B, boxed area). Although dihydropyrimidine dehydrogenase showed an unusually high homology (65%) between human, fly, and worm orthologs, additional cysteine residues were not conserved and no human paralogs could be found.


    UNIQUE FEATURES IDENTIFIED IN CYSTINE KNOT PROTEINS
 
Due to the high number of known cystine knot proteins studied here, it was possible to identify a new conserved pattern that might be essential for cystine knot formation. Analysis of the amino acid residues adjacent to and between cysteines 2, 3, 5, and 6 of the cystine knot signature among all known cystine knot proteins (subgroups 1–3) revealed that none of the highly hydrophobic amino acids, Trp, Phe, Tyr, Ile, Leu, Val, and Met, are present at the residues before and after cysteine 5 (Fig. 5, shaded areas). Overall, amino acid residues surrounding the knot-forming cysteines did not show high conservation among these proteins. As more than 50 proteins known to form a cystine knot structure were investigated, and the neighboring residues that were studied varied highly, it is unlikely that the lack of hydrophobicity flanking cysteine 5 is due to coincidence. Although it is unclear how hydrophobic residues at these positions might interfere with cystine knot formation, the present finding facilitates the analysis and search for new cystine knot members.
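
The flanking-residue observation can be turned into a simple screening check. The following sketch assumes that the position of cysteine 5 is already known (for example from a signature match) and is passed in as a 0-based index; the function name and error handling are hypothetical choices, and the check covers only the seven highly hydrophobic residues listed above.

```python
# The seven highly hydrophobic residues named in the text:
# Trp, Phe, Tyr, Ile, Leu, Val, Met.
HIGHLY_HYDROPHOBIC = frozenset("WFYILVM")


def cys5_flanks_ok(seq: str, cys5_index: int) -> bool:
    """True if neither neighbor of cysteine 5 is highly hydrophobic.

    cys5_index is the 0-based position of cysteine 5 in seq (an assumption
    of this sketch; the caller must determine it beforehand).
    """
    if not 0 < cys5_index < len(seq) - 1:
        raise ValueError("cysteine 5 needs a residue on both sides")
    if seq[cys5_index] != "C":
        raise ValueError("cys5_index does not point at a cysteine")
    before, after = seq[cys5_index - 1], seq[cys5_index + 1]
    return before not in HIGHLY_HYDROPHOBIC and after not in HIGHLY_HYDROPHOBIC
```

A candidate with Ile or Leu immediately after cysteine 5 fails this check, whereas one flanked by, say, Ser and Glu passes.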



 
Figure 5. Conservation of Hydrophilicity at the Amino Acid Residues Flanking Cysteine 5 of the Cystine Knot

AA, Amino acids; *, Hopp-Woods hydrophilicity values. The amino acids are ranked in descending order of their hydrophilicity. The + indicates whether this specific amino acid is found at this residue in any of the more than 50 investigated known cystine knot proteins. None of the highly hydrophobic amino acids can be found at the residues flanking cysteine 5 (shaded areas).

 
In proteins of both the mucin-like and the slit-like subgroups, the lack of hydrophobicity at the residues before and after cysteine 5 was also found, consistent with a consensus property in the known cystine knot proteins. However, in the jagged-like subgroup, the residue between cysteines 5 and 6 is Ile or Leu. Furthermore, in this subgroup, the residue after cysteine 6 is a conserved Pro, which cannot be found in any of the other subgroups of proteins. As a result of the presence of this Pro residue, the protein chain adopts a unique bend at this position. Thus, the jagged subgroup is less likely to form a cystine knot with structural features similar to those of the other family proteins. Therefore, only the slit-like proteins were classified as putative cystine knot hormones (Fig. 3B, subgroup 5), and the jagged-like proteins were grouped as probable cystine knot proteins (Fig. 3C, subgroup 6).


    THE CYSTINE KNOT SIGNATURE HAS A SIMILAR SIZE IN PROTEINS OF VARIANT SIZES
 
As expected, many of the proteins retrieved contained cysteine-rich regions, which increases the likelihood of finding a false-positive cystine knot. In addition, the chances of finding a cystine knot-like signature in larger proteins are higher. Analysis of total protein size revealed striking variations among the retrieved proteins. Many had a size of about 400 amino acids, which is typical for the TGFß subfamily, but others ranged from 55 to 5,400 amino acids in length (Fig. 6). Although variable sizes for the cystine knot signature were found in some proteins, all proteins selected as containing a cystine knot structure based on the presented criteria had a similar size in their cystine knot signature regardless of the protein size. The length between cysteines 2 and 6 of the cystine knot varied from 42 amino acids in the slit-like and mucin-like subgroups to 80 amino acids in the TGFß family members.
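The size criterion above can be illustrated with a short check on the span between cysteines 2 and 6. This is a sketch under stated assumptions: the function names are hypothetical, the six knot cysteine positions are taken as given, and the span is counted inclusively (from cysteine 2 through cysteine 6), which is one possible reading of "length between cysteines 2 and 6".

```python
from typing import Sequence

# Range reported in the text for the stretch between cysteines 2 and 6.
MIN_SPAN = 42
MAX_SPAN = 80


def knot_span(cys_positions: Sequence[int]) -> int:
    """Residues from cysteine 2 through cysteine 6, counted inclusively.

    `cys_positions` holds the 0-based positions of the six knot cysteines
    in sequence order (hypothetical input from a prior signature search).
    """
    if len(cys_positions) != 6:
        raise ValueError("expected exactly six cysteine positions")
    if any(b <= a for a, b in zip(cys_positions, cys_positions[1:])):
        raise ValueError("positions must be strictly increasing")
    return cys_positions[5] - cys_positions[1] + 1


def span_is_typical(cys_positions: Sequence[int]) -> bool:
    """True if the span falls within the 42 to 80 residue range from the text."""
    return MIN_SPAN <= knot_span(cys_positions) <= MAX_SPAN


# Invented positions for illustration only.
print(knot_span([10, 20, 24, 50, 59, 61]), span_is_typical([10, 20, 24, 50, 59, 61]))
print(knot_span([5, 10, 14, 60, 97, 99]), span_is_typical([5, 10, 14, 60, 97, 99]))
```

A candidate whose span falls outside the range is flagged by `span_is_typical` as atypical in size; the thresholds would need adjusting if a different counting convention is used.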



Figure 6. Comparison between the Length of Cystine Knot Signature and the Total Protein Size for More Than 250 Proteins Retrieved by the Pattern Search Approach

The circles indicate the cystine knot signature size for each individual protein. The length of the cystine knot signature is not related to the total protein size. The cystine knot signature identified in all proteins selected as containing a cystine knot structure (triangles), has a comparable size regardless of the total protein size, thus suggesting the presence of a functional motif rather than a random arrangement of cysteines.

 
Figure 7 shows the position and size of the cystine knot in relation to the entire protein of all the subgroups under investigation. Except for the jagged-like subgroup, the cystine knot signature is located in the C-terminal region of the protein. The jagged-like proteins are the only subgroup with a transmembrane region downstream of the cystine knot signature. The human mucins display a cystine knot motif that aligns to two fly genes, but vary greatly in size (ranging from 400 to 3,000 amino acids). These proteins also align to the cystine knot region of the fly hml and human vWf protein. The fly mucin-like genes are much smaller, sharing only the C-terminal cystine knot region with the human mucins and lacking the N-terminal carbohydrate-rich region characteristic of the human mucins. The vWf and the fly hml both contain an extended N-terminal region. The vWf is shorter than hml due to a deletion of sequences between the conserved N-terminal region and the cystine knot motif.



Figure 7. Position and Size of the Cystine Knot in Relation to the Total Protein Size in Known and Putative Cystine Knot Proteins

Species: w, worm; f, fly; h, human. Except for the jagged-like proteins, the cystine knot signature is found in the C-terminal segment of all proteins. The jagged-like proteins have a transmembrane region flanking the cystine knot signature. In the TGFß, as well as the PDGF subfamilies, the cystine knot-containing C-terminal segment is cleaved from a precursor. In one of the human mucins, the C-terminal cystine knot is also cleaved from the N-terminal mucin-like region during intracellular processing. Due to this cleavage the remaining C-terminal segments with the cystine knot signature have a similar size among members of different subgroups. Intracellular cleavage does not occur in the larger proteins of the slit-like or jagged-like subgroups, or fly hml and human vWf.

 
Because the size of the cystine knot signature in these proteins is not related to the total protein size, this motif likely represents a structural entity and not a random arrangement of cysteines. These findings suggest that gene fusion events took place during evolution, and the same motif was used in diverse proteins with different functions.


    EVOLUTION OF THE 10-MEMBERED CYSTINE KNOT STRUCTURE
Family trees were constructed using multiple alignments of characteristic members of the known and putative 10-membered cystine knot protein subfamilies in nematode, fly, and human. As shown in Fig. 8, one distinct branch is formed by the TGFß subfamily, which indicates early evolutionary divergence of this large family. A second branch includes four subgroups: GPH, PDGF, mucin-like, and slit-like. The inclusion of the mucin-like sequences revealed that GPH as well as PDGF family members are more closely related to the cystine knot-containing region of the extracellular mucin-like proteins than to that of the TGFß subfamily. This branch also contains NDP, vWf, and the BMP antagonists. The slit-like subgroup forms a distinct subbranch and thus shows less similarity to other subgroups. This phylogenetic relationship of extracellular signaling molecules indicates a common origin for hormones (e.g. TGFß) and extracellular matrix proteins (e.g. mucins).



Figure 8. Phylogenetic Tree of the Known and Putative Cystine Knot Subgroups Together with Their Dimerization/Oligomerization and Receptor Characteristics

Pairwise alignments of human genes with fly and nematode genes were done by utilizing the BLAST servers of the Berkeley Flybase and the Sanger Center using low stringency criteria (E < 1,000, identity > 10%). To determine whether any of these potential homologs also contained the above mentioned cystine knot signature, a subroutine was used to check for the presence of the cystine knot signature in subsets of genes. For pairwise alignments of the human gene with the respective nonmammalian homologs, the Baylor College of Medicine search launcher (http://dot.imgen.bcm.tmc.edu:9331/seq-search/alignment.html) was used. Multiple protein sequence alignments were done using a pattern construction algorithm (51, 52). Gaps in the alignments were modified to minimize the number of mutations required to explain all differences between the sequences. Representative sequences for each subgroup were used to infer phylogenetic relationships using the protein sequence parsimony method available in PHYLIP (http://evolution.genetics.washington.edu/phylip.html) (53). Furthermore, the phylogenies were analyzed using the Dayhoff PAM matrix (54, 55). Results of this matrix were further analyzed using the Fitch-Margoliash algorithm (56). The degree of confidence for each branchpoint was obtained using the bootstrap method (1,000 replications) (57, 58). *, The fly PDGF and GPH orthologs were identified by manual inspection of genomic data; they are not present in the set of predicted proteins.

Two main branches can be identified. The first one includes the TGFß subfamily and lefty, whereas the mucin-like proteins, GPH, PDGF subfamily, and slit-like proteins form a separate branch. The NDP as well as the BMP antagonists show close relationship to the mucin-like genes and the hemolectins. Each subgroup was shown to bind to a different type of receptor, which is given on the right, except for proteins of the mucin-like branch, which has no known receptors.
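The bootstrap step mentioned in this legend resamples alignment columns with replacement and rebuilds a tree from each replicate. The sketch below illustrates only the resampling part; it is not PHYLIP's implementation, and the function names, the dictionary-based alignment format, and the fixed random seed are assumptions made for illustration.

```python
import random
from typing import Dict, List


def bootstrap_alignment(alignment: Dict[str, str], rng: random.Random) -> Dict[str, str]:
    """Build one bootstrap replicate by resampling columns with replacement.

    `alignment` maps a sequence name to its aligned sequence; all aligned
    sequences must have the same length.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    n_cols = lengths.pop()
    # Draw as many column indices as the alignment has columns.
    cols = [rng.randrange(n_cols) for _ in range(n_cols)]
    return {name: "".join(seq[c] for c in cols) for name, seq in alignment.items()}


def bootstrap_replicates(
    alignment: Dict[str, str], n_replicates: int = 1000, seed: int = 0
) -> List[Dict[str, str]]:
    """Generate `n_replicates` resampled alignments (1,000 as in the legend)."""
    rng = random.Random(seed)
    return [bootstrap_alignment(alignment, rng) for _ in range(n_replicates)]


# Toy alignment, invented for illustration.
toy = {"seqA": "CAGC", "seqB": "CTGC"}
replicates = bootstrap_replicates(toy, n_replicates=5, seed=1)
print(len(replicates), len(replicates[0]["seqA"]))
```

In a full analysis, a tree would be inferred from each replicate, and the fraction of replicate trees containing a given branch would serve as its confidence value.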

 
The low confidence probability found for some of the branches is due to the low similarity between the different groups, which are mainly conserved at the cysteine residues. This low homology and the fact that the nematode has proteins present in most subgroups make it difficult to trace the common ancestor for these proteins. It is possible that convergent evolution led to the formation of a similar knot structure in the different subfamilies. As shown in Fig. 9, all major cystine knot protein groups, including the TGFß, GPH, PDGF, and slit-like subgroups, with the exception of the mucin-like subgroup, contain worm and fly homologs, indicating that these proteins evolved before the separation of the arthropod phylum about 1.2 billion years ago. The vWf and the fly hml homolog could be derived by a gene fusion event before the emergence of the fly, thus combining the cystine knot with other functional motifs. This fusion event took place more than 1 billion, but less than 1.2 billion, years ago, as no ortholog can be found in the worm. A second fusion event combined the cystine knot with the gel-forming N terminus of the human mucin proteins. This fusion occurred later, because the fly mucin-like genes of unknown function do not have the extended N terminus typical for vertebrate mucins. These findings suggest that the characteristic mucins evolved only in the vertebrate phylum. In addition, no homolog of the human BMP antagonists was found in lower species, suggesting the recent evolution of these genes in the chordate phylum. On the other hand, the cystine knot of the TGFß, GPH, and PDGF subfamilies is well conserved in worm, indicating their origin to be more ancient than that of the mucin-like proteins.



Figure 9. Schematic Time Scale of the Evolution of Cystine Knot-Containing Proteins in the Five Species Studied

On the right, the respective proteins found in each animal phylum are shown. Mucin-like genes developed only in the arthropod phylum whereas the BMP antagonists developed during vertebrate evolution. All other subgroups are present in worm but no cystine knot protein ortholog was found in available sequences from the slime mold. As no cystine knot proteins were found in yeast, the ancient cystine knot-containing proteins likely evolved in multicellular organisms before the emergence of nematodes.

 

    THE ABSENCE OF CYSTINE KNOT PROTEINS IN YEAST
As most identified cystine knot-containing proteins had worm orthologs, proteins with a cystine knot signature were also screened using sequences from lower species including the entire yeast genome and the available slime mold data. Two candidate proteins were found in yeast—the putative maltose permease (a transmembrane protein, GenBank accession no. 1346493) and the metallothionein-like protein CRS5 (an intracellular protein, accession no. 1169097). Because their cystine knot signatures were not conserved in paralogs of the same species, they were excluded as cystine knot proteins. As no cystine knot structure can be identified in yeast, it is unlikely that this structure could have evolved in unicellular organisms. However, the use of four cysteines to form a cystine ring is widely practiced and has been described in protozoan as well as metazoan species (44, 45).

The slime mold revealed three potential sequences, two prestalk proteins and one predicted open reading frame, all with a signal peptide and without transmembrane helices. One of the prestalk proteins aligned well to other family members in the same species but the cysteine residues were not conserved. Pairwise alignment of the two remaining proteins with proteins from other species showed sequence similarity and cystine knot conservation between DIF-induced prestalk pDd63 precursor (accession no. 84105) and an as yet unnamed worm protein (accession no. 6580331) with an EGF signature that was not found to have a cystine knot. The size of the cystine knot signature in this slime mold protein is 90 amino acids and therefore longer than that in the other known cystine knot proteins (up to 80). Thus, it is questionable that this protein represents the ancient ancestor of the cystine knot proteins.


    MOST CYSTINE KNOT PROTEINS ARE EXTRACELLULAR LIGANDS: INTERACTIONS WITH DIVERSE RECEPTORS
The importance of the cystine knot signature for secreted proteins has been demonstrated based on studies of cysteine mutants in different family members (46, 47). The cystine knot constrains the folding of these proteins to avoid a typical globular shape preferred by proteins in aqueous solutions. The folding leads to a structure that exposes specific hydrophobic residues to the exterior of the molecule, thus facilitating protein-protein interactions. In addition to serving as ligands for different cell surface receptors, these cystine knot proteins are known to form homo- and heterodimers as well as multimers. Of interest, the function of the cystine knot motif in human mucin-2 appears to be the facilitation of dimer formation because its cystine knot-containing C terminus is cleaved away from the secreted protein after dimerization (48).

All families identified in this study contained exclusively extracellular proteins, which corroborates the importance of this signature for extracellular signaling and cell-cell communication. Most of the proteins identified are ligands to known receptors. The phylogenetic grouping of the genes reflects their interaction with different types of receptors (Fig. 8). The TGFß family forms a distinct subgroup and the members are known to interact with serine/threonine kinase, single-transmembrane receptors. Of interest is the alignment of the lefty gene, which is more closely related to the TGFß subfamily than to the other subgroups. In contrast to the TGFß subgroup, the sequence of lefty is not terminated at the second residue after the sixth cysteine. The position of this stop codon is, however, highly conserved among the TGFß members and seems to be essential for receptor interaction, as mutants with an extended C terminus display a dominant negative behavior (49). Although the functional pathway of lefty is not known, it could constitute a potential antagonist to the TGFs. The GPH and PDGF subfamily members signal through seven-transmembrane G protein-coupled receptors and tyrosine-kinase single-transmembrane receptors, respectively. Members of the mucin-like subgroup are extracellular matrix proteins and have not been found to interact with specific cell surface receptors. Furthermore, the BMP antagonists likely interact with other ligands, whereas members of the slit-like subgroup interact with single-transmembrane protein receptors involved in axon guidance (50).


    CONCLUSION
A combination of bioinformatic approaches was used to exhaustively survey the genomic data of five different model species and identify potential members of the cystine knot superfamily. In addition to demonstrating relationships between different family members, new orthologs in different species were recognized. Furthermore, by a comparison of the large number of known cystine knot members, new consensus residues in the cystine knot signature were discovered, thus improving future identification of cystine knot proteins. The spectrum of members of the cystine knot superfamily was expanded by including the mucin-like subfamily and the slit-like proteins. Because all the subfamilies that were found contained exclusively extracellular proteins, the present findings underscore the importance of the cystine knot structure for ligand-receptor interaction and cell-cell communication. This is corroborated by the absence of cystine knot structures in unicellular yeast and the presence of multiple subfamily members in nematode, indicating that this ancient structure evolved in parallel with the development of multicellular life.

With the availability of the complete genomes of more than 60 organisms, including the human genome, bioinformatic analyses of extracellular signaling molecules are essential to provide a global perspective on the evolution and structural features of different protein hormone families. The present minireview represents an initial attempt in this direction to provide the basis for discovering new human protein hormone paralogs and for understanding the structural characteristics important for hormone function.


    ACKNOWLEDGMENTS
 
We thank Caren Spencer for editorial support and John Carlsson for assistance in computational tasks.


    FOOTNOTES
 
Address requests for reprints to: Aaron J. W. Hsueh, Division of Reproductive Biology, Department of Gynecology and Obstetrics, Stanford University School of Medicine, Stanford, California 94305-5317. E-mail: aaron.hsueh{at}stanford.edu

U.A.V. is supported by a fellowship from the Lalor Foundation.

Received for publication January 12, 2001. Revision received February 12, 2001. Accepted for publication February 20, 2001.


    REFERENCES

  1. Tamaoki H, Kobayashi Y, Nishimura S, Ohkubo T, Kyogoku Y, Nakajima K, Kumagaye S, Kimura T, Sakakibara S 1991 Solution conformation of endothelin determined by means of 1H-NMR spectroscopy and distance geometry calculations. Protein Eng 4:509–518
  2. Kobayashi Y, Takashima H, Tamaoki H, Kyogoku Y, Lambert P, Kuroda H, Chino N, Watanabe TX, Kimura T, Sakakibara S 1991 The cystine-stabilized α-helix: a common structural motif of ion-channel blocking neurotoxic peptides. Biopolymers 31:1213–1220
  3. Tamaoki H, Miura R, Kusunoki M, Kyogoku Y, Kobayashi Y, Moroder L 1998 Folding motifs induced and stabilized by distinct cystine frameworks. Protein Eng 11:649–659
  4. McDonald NQ, Hendrickson WA 1993 A structural superfamily of growth factors containing a cystine knot motif. Cell 73:421–424
  5. Isaacs NW 1995 Cystine knots. Curr Opin Struct Biol 5:391–395
  6. Sun PD, Davies DR 1995 The cystine-knot growth-factor superfamily. Annu Rev Biophys Biomol Struct 24:269–291
  7. Scheufler C, Sebald W, Hulsmeyer M 1999 Crystal structure of human bone morphogenetic protein-2 at 2.7 Å resolution. J Mol Biol 287:103–115
  8. Griffith DL, Keck PC, Sampath TK, Rueger DC, Carlson WD 1996 Three-dimensional structure of recombinant human osteogenic protein 1: structural paradigm for the transforming growth factor ß superfamily. Proc Natl Acad Sci USA 93:878–883
  9. Butler SA, Laidler P, Porter JR, Kicman AT, Chard T, Cowan DA, Iles RK 1999 The beta-subunit of human chorionic gonadotrophin exists as a homodimer. J Mol Endocrinol 22:185–192
  10. Prestrelski SJ, Arakawa T, Duker K, Kenney WC, Narhi LO 1994 The conformational stability of a non-covalent dimer of a platelet-derived growth factor-B mutant lacking the two cysteines involved in interchain disulfide bonds. Int J Pept Protein Res 44:357–363
  11. Hui JO, Woo G, Chow DT, Katta V, Osslund T, Haniu M 1999 The intermolecular disulfide bridge of human glial cell line-derived neurotrophic factor: its selective reduction and biological activity of the modified protein. J Protein Chem 18:585–593
  12. Zimmerman CM, Padgett RW 2000 Transforming growth factor ß signaling mediators and modulators. Gene 249:17–30
  13. Mather JP, Moore A, Li RH 1997 Activins, inhibins, and follistatins: further thoughts on a growing family of regulators. Proc Soc Exp Biol Med 215:209–222
  14. Wozney JM, Rosen V 1998 Bone morphogenetic protein and bone morphogenetic protein gene family in bone formation and repair. Clin Orthop 346:26–37
  15. Massague J, Blain SW, Lo RS 2000 TGFß signaling in growth control, cancer, and heritable disorders. Cell 103:295–309
  16. Roelen BA, Mummery CL 2000 Transforming growth factor-ßs in pre-gastrulation development of mammals (minireview). Mol Reprod Dev 56:220–226
  17. Gaddy-Kurten D, Tsuchida K, Vale W 1995 Activins and the receptor serine kinase superfamily. Recent Prog Horm Res 50:109–129
  18. Melton DA 1991 Pattern formation during animal development. Science 252:234–241
  19. Roberts AB 1998 Molecular and cell biology of TGF-ß. Miner Electrolyte Metab 24:111–119
  20. Josso N, Racine C, di Clemente N, Rey R, Xavier F 1998 The role of anti-Mullerian hormone in gonadal development. Mol Cell Endocrinol 145:3–7
  21. Hart CE, Bowen-Pope DF 1990 Platelet-derived growth factor receptor: current views of the two-subunit model. J Invest Dermatol 94:53–57
  22. Ryu KS, Ji I, Chang L, Ji TH 1996 Molecular mechanism of LH/CG receptor activation. Mol Cell Endocrinol 125:93–100
  23. Stanley E, Biben C, Kotecha S, Fabri L, Tajbakhsh S, Wang CC, Hatzistavrou T, Roberts B, Drinkwater C, Lah M, Buckingham M, Hilton D, Nash A, Mohun T, Harvey RP 1998 DAN is a secreted glycoprotein related to Xenopus cerberus. Mech Dev 77:173–184
  24. Meitinger T, Meindl A, Bork P, Rost B, Sander C, Haasemann M, Murken J 1993 Molecular modelling of the Norrie disease protein predicts a cystine knot growth factor tertiary structure. Nat Genet 5:376–380
  25. Katsumi A, Tuley EA, Bodo I, Sadler JE 2000 Localization of disulfide bonds in the cystine knot domain of human von Willebrand factor. J Biol Chem 275:25585–25594
  26. Ren P, Lim CS, Johnsen R, Albert PS, Pilgrim D, Riddle DL 1996 Control of C. elegans larval development by neuronal expression of a TGF-ß homolog. Science 274:1389–1391
  27. Hoffmann FM 1992 TGF-ß family factors in Drosophila morphogenesis. Mol Reprod Dev 32:173–178
  28. Kawabata M, Imamura T, Miyazono K 1998 Signal transduction by bone morphogenetic proteins. Cytokine Growth Factor Rev 9:49–61
  29. Chervitz SA, Aravind L, Sherlock G, Ball CA, Koonin EV, Dwight SS, Harris MA, Dolinski K, Mohr S, Smith T, Weng S, Cherry JM, Botstein D 1998 Comparison of the complete protein sets of worm and yeast: orthology and divergence. Science 282:2022–2028
  30. Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ 1997 Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25:3389–3402
  31. Reese MG, Kulp D, Tammana H, Haussler D 2000 Genie–gene finding in Drosophila melanogaster. Genome Res 10:529–538
  32. Van den Steen P, Rudd PM, Dwek RA, Opdenakker G 1998 Concepts and principles of O-linked glycosylation. Crit Rev Biochem Mol Biol 33:151–208
  33. Kim YS, Gum Jr JR 1995 Diversity of mucin genes, structure, function, and expression. Gastroenterology 109:999–1001
  34. Axelsson MA, Asker N, Hansson GC 1998 O-glycosylated MUC2 monomer and dimer from LS 174T cells are water-soluble, whereas larger MUC2 species formed early during biosynthesis are insoluble and contain nonreducible intermolecular bonds. J Biol Chem 273:18864–18870
  35. Perez-Vilar J, Eckhardt AE, DeLuca A, Hill RL 1998 Porcine submaxillary mucin forms disulfide-linked multimers through its amino-terminal D-domains. J Biol Chem 273:14442–14449
  36. Belo JA, Bachiller D, Agius E, Kemp C, Borges AC, Marques S, Piccolo S, De Robertis EM 2000 Cerberus-like is a secreted BMP and nodal antagonist not essential for mouse development. Genesis 26:265–270
  37. Piccolo S, Agius E, Leyns L, Bhattacharyya S, Grunz H, Bouwmeester T, De Robertis EM 1999 The head inducer Cerberus is a multifunctional antagonist of Nodal, BMP and Wnt signals. Nature 397:707–710
  38. Live DH, Williams LJ, Kuduk SD, Schwarz JB, Glunz PW, Chen XT, Sames D, Kumar RA, Danishefsky SJ 1999 Probing cell-surface architecture through synthesis: an NMR-determined structural motif for tumor-associated mucins. Proc Natl Acad Sci USA 96:3489–3493
  39. Aksoy N, Thornton DJ, Corfield A, Paraskeva C, Sheehan JK 1999 A study of the intracellular and secreted forms of the MUC2 mucin from the PC/AA intestinal cell line. Glycobiology 9:739–746
  40. Bell SL, Khatri IA, Xu G, Forstner JF 1998 Evidence that a peptide corresponding to the rat Muc2 C-terminus undergoes disulphide-mediated dimerization. Eur J Biochem 253:123–131
  41. Asker N, Baeckstrom D, Axelsson MA, Carlstedt I, Hansson GC 1995 The human MUC2 mucin apoprotein appears to dimerize before O-glycosylation and shares epitopes with the ‘insoluble’ mucin of rat small intestine. Biochem J 308:873–880
  42. Richter M, Gottanka J, May CA, Welge-Lussen U, Berger W, Lutjen-Drecoll E 1998 Retinal vasculature changes in Norrie disease mice. Invest Ophthalmol Vis Sci 39:2450–2457
  43. Gralnick HR, Coller BS, Sultan Y 1976 Carbohydrate deficiency of the factor VIII/von Willebrand factor protein in von Willebrand’s disease variants. Science 192:56–59
  44. Di Noia JM, Sanchez DO, Frasch AC 1995 The protozoan Trypanosoma cruzi has a family of genes resembling the mucin genes of mammalian cells. J Biol Chem 270:24146–24149
  45. Craik DJ, Daly NL, Waine C 2001 The cystine knot motif in toxins and implications for drug design. Toxicon 39:43–60
  46. Marker PC, Seung K, Bland AE, Russell LB, Kingsley DM 1997 Spectrum of Bmp5 mutations from germline mutagenesis experiments in mice. Genetics 145:435–443
  47. Brunner AM, Lioubin MN, Marquardt H, Malacko AR, Wang WC, Shapiro RA, Neubauer M, Cook J, Madisen L, Purchio AF 1992 Site-directed mutagenesis of glycosylation sites in the transforming growth factor-ß 1 (TGFß 1) and TGF ß 2 (414) precursors and of cysteine residues within mature TGFß 1: effects on secretion and bioactivity. Mol Endocrinol 6:1691–1700
  48. Xu G, Bell SL, McCool D, Forstner JF 2000 The cationic C-terminus of rat Muc2 facilitates dimer formation post translationally and is subsequently removed by furin. Eur J Biochem 267:2998–3004
  49. Kishimoto Y, Lee K-H, Zon L, Hammerschmidt M, Schulte-Merker S 1997 The molecular nature of zebrafish swirl: BMP2 function is essential during early dorsoventral patterning. Development 124:4457–4466
  50. Yuan W, Zhou L, Chen JH, Wu JY, Rao Y, Ornitz DM 1999 The mouse SLIT family: secreted ligands for ROBO expressed in patterns that suggest a role in morphogenesis and axon guidance. Dev Biol 212:290–306
  51. Smith RF, Smith TF 1990 Automatic generation of primary sequence patterns from sets of related protein sequences. Proc Natl Acad Sci USA 87:118–122
  52. Smith RF, Smith TF 1992 Pattern-induced multi-sequence alignment (PIMA) algorithm employing secondary structure-dependent gap penalties for comparative protein modelling. Protein Eng 5:35–41
  53. Fitch WM 1971 Toward defining the course of evolution: minimum change for a specified tree topology. System Zool 20:406–416
  54. Dayhoff MO 1979 Atlas of Protein Sequence and Structure. National Biomedical Research Foundation, Washington, DC, vol 5, pt 3, p 134
  55. Felsenstein J 1993 PHYLIP (Phylogeny Inference Package) version 3.5c. Department of Genetics, University of Washington, Seattle, WA
  56. Fitch WM, Margoliash E 1967 Construction of phylogenetic trees. Science 155:279–284
  57. Felsenstein J 1985 Confidence limits on phylogenies: an approach using the bootstrap. Evolution 39:783–791
  58. Efron B, Halloran E, Holmes S 1996 Bootstrap confidence levels for phylogenetic trees. Proc Natl Acad Sci USA 93:13429–13434