Molecular Biophysics Unit, Indian Institute of Science, Bangalore-560 012, India
Abstract
Keywords: interfaces/legume lectin/phyletic tree from crystal structures/quaternary structure/sequence analysis
Introduction
Legume lectins of known three-dimensional structure, their oligomeric state and carbohydrate specificity are listed in Table I. As illustrated in Figure 1, the protomer of each is made up of a six-stranded nearly flat back β-sheet, a seven-stranded curved front β-sheet, a short five-membered β-sheet at the top of the molecule and several loops that connect the sheets. All of them are dimers or tetramers that can be considered as dimers of dimers. Each tetramer has three types of interfaces. These interfaces resemble those found in the dimeric proteins to varying degrees, from very close to only broadly similar. All these interfaces involve the six-stranded back β-sheet of the monomer in one way or another, and each can be described in terms of the mutual disposition of the sheets in the two participating subunits. The observed modes of quaternary association have been rationalized in terms of hydrophobic surface buried on oligomerization, interaction energy and shape complementarity (Prabu et al., 1999).
Dimerization in a majority of instances involves a side-by-side arrangement, resulting in a contiguous 12-stranded β-sheet with the dyad axis perpendicular to the β-sheet. This kind of association, first observed in ConA (Hardman and Ainsworth, 1972), may be described as II-type (Jones and Thornton, 1995). All abbreviations of lectin names are listed in Table II. Dimerization in other instances involves different kinds of back-to-back association of the six-stranded β-sheets (named X1, X2, X3 and X4 types). The various types of dimeric association observed in the structures of legume lectins are shown schematically in Figure 2. Among the dimeric lectins, PSL (Einspahr et al., 1986), Favin (Reeke and Becker, 1986), LOLI (Bourne et al., 1990), LENL (Loris et al., 1993) and UEAI (Audette et al., 2000) associate in II-type fashion (Figures 2 and 3a). In all of the 10 tetramers of known structure except PNA (that is, in ConA, AZD, DIAB, DGL, SBA, PHAL, UEAII, DBL and MAL), subunits 1 and 2, and 3 and 4, associate in a side-by-side fashion (II-type) (Figures 4 and 5). All these tetramers can be considered as resulting from II-type associations of X-type dimers. The 1–4 and the 2–3 interfaces in SBA (Dessen et al., 1995), PHAL (Hamelryck et al., 1996), UEAII (Dao-Thi et al., 1998), DBL (Hamelryck et al., 1999) and MAL (Imberty et al., 2000) are of one kind of back-to-back type (X1-type), while those of ConA, AZD (Sanz-Aparicio et al., 1997), DIAB (Protein Data Bank code: 1QMO) and DGL (Rozwarski et al., 1998) form another type (X2-type). DB58 (Hamelryck et al., 1999), a lectin closely related to DBL but dimeric in nature, exhibits the X1-type back-to-back interface (Figures 2 and 3b).
PNA represents a unique case of a tetramer without 4-fold or 222 symmetry (Banerjee et al., 1994, 1996). Consequently, the 1–2 and the 3–4 interfaces are not equivalent. It is believed that the 3–4 interface is an incidental consequence of the presence of two dimers with an X4-type interface (1–4 and 2–3) associating with one II-type interface (1–2). Although 1–2 is a side-by-side interface, the two six-membered sheets do not form a contiguous 12-stranded β-sheet, but are connected through a number of interfacial water molecules. The dimeric lectins EcorL (Shaanan et al., 1991), WBAI (Prabu et al., 1998) and WBAII (Manoj et al., 2000) exhibit one kind of back-to-back association (X3-type), while GS4 (Delbaere et al., 1993) exhibits an interface (X4-type) similar to that in PNA (Figures 2 and 3d, e). Thus, all the oligomerization modes observed so far in legume lectins can be explained in terms of the formation of two classes of dimers (II-type and X-type) and further association of two of these dimers into tetramers. It is interesting that, although there are four different kinds of X-type interfaces, in all cases the majority of the inter-subunit contacts come from the same fourth, fifth and sixth strands of the back β-sheet, with a few additional contacts in each type coming from residues elsewhere in the back β-sheet.
Materials and methods
The list of all the legume lectin structures whose coordinates are available was obtained from the 3D Lectin Data Bank on the World Wide Web (URL: http://www.cermav.cnrs.fr/databank/lectine). The coordinates were obtained from the Protein Data Bank (Berman et al., 2000) and the sequences from the SWISSPROT data bank (Bairoch and Apweiler, 1997).
Comparisons of legume lectins available in the Protein Data Bank
Multiple alignment of sequences.
The multiple sequence alignment was performed using the program MULTALIGN from the AMPS suite of programs (Barton, 1990). This program uses the Needleman and Wunsch algorithm (Needleman and Wunsch, 1970; Barton and Sternberg, 1987) with a fixed gap penalty of 8 and Dayhoff's mutation data matrix (Schwartz and Dayhoff, 1978). There were 19 sequences of legume lectins whose coordinates were available in the Protein Data Bank (Table I). The program ORDER was used to perform cluster analysis and ordering of the sequences by similarity, and to construct a dendrogram from the output of a MULTALIGN pairwise run. The cluster analysis generates a tree file from the significance score of each pairwise alignment, which is calculated from the match score and the mean and standard deviation (SD) of the scores obtained on randomization. The lengths of the branches were adjusted to reflect the pairwise alignment scores.
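The significance score described above can be illustrated with a minimal sketch. It is not the AMPS implementation: the match and mismatch values are toy stand-ins for Dayhoff's matrix, the gap penalty of 8 is charged per gapped position (an assumption about how the penalty is applied), and the function names `nw_score` and `sd_score` are hypothetical.

```python
import random
import statistics


def nw_score(a, b, match=10, mismatch=-2, gap=8):
    """Needleman-Wunsch global alignment score.

    Toy scoring (assumption): identity match/mismatch instead of Dayhoff's
    matrix, and a constant penalty for every gapped position.
    """
    prev = [-gap * j for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [-gap * i]
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            cur.append(max(prev[j - 1] + sub, prev[j] - gap, cur[j - 1] - gap))
        prev = cur
    return prev[-1]


def sd_score(a, b, n_shuffles=100, seed=0):
    """Score of the real alignment in SD units above the shuffled mean."""
    rng = random.Random(seed)
    real = nw_score(a, b)
    shuffled = []
    for _ in range(n_shuffles):
        # Shuffling keeps length and composition but destroys the order.
        sa, sb = list(a), list(b)
        rng.shuffle(sa)
        rng.shuffle(sb)
        shuffled.append(nw_score(sa, sb))
    mean = statistics.mean(shuffled)
    sd = statistics.pstdev(shuffled)
    return float("inf") if sd == 0 else (real - mean) / sd
```

Shuffling each sequence preserves its length and amino acid composition, so the resulting score distribution plays the role of the background for unrelated sequences against which the real alignment is measured.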
Multiple alignment of sequences based on structures
The alignment of sequences based on three-dimensional structures was performed using the program STAMP (STructural Alignment of Multiple Proteins) (Russell and Barton, 1992). STAMP makes use of the rigid-body least-squares superposition of Cα positions (Rossmann and Argos, 1975) to express the probability of structural equivalence of residues. A preliminary multiple sequence alignment is performed using sequence information, which then determines an initial superposition of the structures. A structure comparison algorithm is applied to all pairs of proteins in the superimposed set and a similarity tree is calculated. Multiple sequence alignments are then generated by following the tree from the branches to the root. At each branch point of the tree, a structure-based sequence alignment and coordinate transformations are output, with the multiple alignment of all structures output at the root. The tree topologies and branch lengths for the phylogenetic tree were determined from the sequence distance matrices using the program KITSCH from the PHYLIP suite of programs (Felsenstein, 1985). This method accounts for unequal rates of change among the proteins by adjusting distances so that the tips of all the leaves are equidistant from the root of the tree.
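A tree in which every tip is equidistant from the root is called ultrametric. The sketch below builds one from a distance matrix using UPGMA, which is a simpler stand-in chosen for illustration and is not the least-squares fitting that KITSCH performs. The function name `upgma` and the input format are hypothetical.

```python
def upgma(names, pair_dist):
    """Build an ultrametric tree by UPGMA (a stand-in, not KITSCH itself).

    pair_dist maps (name_a, name_b) tuples to distances.
    Returns a Newick string with branch lengths and the root height.
    """
    dist = {frozenset(p): float(v) for p, v in pair_dist.items()}
    height = {n: 0.0 for n in names}   # distance from node down to its tips
    size = {n: 1 for n in names}       # number of tips under each node
    newick = {n: n for n in names}
    active = set(names)
    while len(active) > 1:
        # Join the closest pair of clusters (ties broken by name).
        a, b = min(
            ((x, y) for x in active for y in active if x < y),
            key=lambda p: (dist[frozenset(p)], p),
        )
        h = dist[frozenset((a, b))] / 2.0
        node = f"{a}|{b}"
        newick[node] = (
            f"({newick[a]}:{h - height[a]:.2f},{newick[b]}:{h - height[b]:.2f})"
        )
        # Distance to the new cluster is the size-weighted average.
        for c in active - {a, b}:
            dist[frozenset((node, c))] = (
                size[a] * dist[frozenset((a, c))]
                + size[b] * dist[frozenset((b, c))]
            ) / (size[a] + size[b])
        height[node] = h
        size[node] = size[a] + size[b]
        active -= {a, b}
        active.add(node)
    root = active.pop()
    return newick[root] + ";", height[root]
```

For three invented taxa with distances A–B = 2, A–C = 6 and B–C = 6, the path from the root to each tip sums to the same length, which is the property the text describes.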
Comparisons of legume lectins available in the sequence database
An analysis was also performed of the legume lectin sequences in the sequence databases for which some information about carbohydrate specificity and/or quaternary structure is available. The list of 33 lectins and their sources and specificities are given in Table II. These sequences were aligned and their phylogenetic tree constructed using the programs available in the Wisconsin Sequence Analysis package [Wisconsin Package Version 9.1, Genetics Computer Group (GCG), Madison, WI]. The program PILEUP was used for the multiple sequence alignment and the program PAUPSEARCH (Phylogenetic Analysis Using Parsimony) was used to construct, from the aligned sequences, a phylogenetic tree that is optimal according to parsimony criteria. The program constructs a neighbour-joining tree and the best tree is the one with the minimum sum of branch lengths based on a corrected distance matrix calculated from the aligned sequences. Confidence values were evaluated using bootstrap replications and a consensus bootstrap tree was obtained. The program PAUPDISPLAY was used to plot the tree.
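The bootstrap step can be sketched as resampling alignment columns with replacement and asking how often a grouping survives. The example below is a simplified proxy under stated assumptions, not the PAUPSEARCH procedure: it uses uncorrected p-distances and counts a pair as supported when the two sequences are closer to each other than to any other sequence, rather than rebuilding a tree for each replicate. All function names are hypothetical.

```python
import random


def resample_columns(alignment, rng):
    """Draw alignment columns with replacement (one bootstrap replicate)."""
    n_cols = len(next(iter(alignment.values())))
    cols = [rng.randrange(n_cols) for _ in range(n_cols)]
    return {name: "".join(seq[c] for c in cols) for name, seq in alignment.items()}


def p_distance(s, t):
    """Fraction of differing positions, ignoring columns with a gap."""
    pairs = [(x, y) for x, y in zip(s, t) if x != "-" and y != "-"]
    return sum(x != y for x, y in pairs) / len(pairs) if pairs else 0.0


def pair_support(alignment, a, b, n_reps=200, seed=0):
    """Fraction of replicates in which a and b are strictly closest to each other."""
    rng = random.Random(seed)
    others = [n for n in alignment if n not in (a, b)]
    hits = 0
    for _ in range(n_reps):
        rep = resample_columns(alignment, rng)
        d_ab = p_distance(rep[a], rep[b])
        if all(
            d_ab < p_distance(rep[a], rep[o]) and d_ab < p_distance(rep[b], rep[o])
            for o in others
        ):
            hits += 1
    return hits / n_reps
```

A full analysis would build a tree from each replicate and tabulate how often each clade recurs; the pairwise test here only conveys the resampling idea.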
Results and discussion
The alignment scores calculated by MULTALIGN, expressed in units of SD above the mean background obtained for comparison of unrelated sequences of identical length and amino acid composition, range from 15.1 to 26.6 SD for all the pairs of sequences. The total number of residues conserved in all sequences is 69, out of which 21 residues are identical. There are nine sites where significant gaps have been introduced. Figure 6 shows the unrooted dendrogram obtained on the basis of the sequence alignment. The lengths of the branches represent the approximate evolutionary distance between the sequences. The tree has six major divisions: (1) X1-type, (2) X3-type, (3) II-type, (4) II,X2-type, (5) II,X4-type and (6) X4-type. The first includes the tetrameric lectins SBA, PHAL, DBL, UEAII and MAL and the dimeric lectin DB58. All the lectins in this branch are made up of X1-type dimers and most of them are specific to Gal/GalNAc at the monosaccharide level. One of the sub-branches consists of UEAII and MAL, which have unique specificities. The second major branch includes the Gal/GalNAc-binding lectins WBAI, WBAII and EcorL. These lectins are dimers of the X3-type. Branch 3 consists of the lectins LENL, Favin, LOLI and PSL, which are Man/Glc binding and are dimers of the II-type. Branch 4 has tetrameric lectins made up of X2-type dimers. Again, all these lectins are Man/Glc specific. DIAB would be expected to lie closer to the ConA-type tetramers but, probably because it has a two-chain subunit unlike the others, it is placed slightly away. Branch 5 consists of PNA, a Gal-specific tetramer of a unique type made up of two X4-type dimers. Branch 6 consists of GS4, which is an X4-type dimer and binds to complex carbohydrates by accommodating a GalNAc in the primary binding site. These two branches are evolutionarily as distant from each other as they are from the others. The major branches are related to the classification of quaternary structures.
Further, the branches also appear to reflect the functional divergence of these lectins in terms of their carbohydrate specificity. The clustering we have derived thus shows a strong correlation with the structural classification, which most likely evolved to reflect the biological activity of legume lectins.
Role of amino acid residues in determining the oligomerization in legume lectins
As an extension of the above analysis, the sequences at the various interfaces were examined in detail using the clustering information obtained from the structure-based sequence alignment. The objective was to pinpoint residues that are conserved within each cluster and may be crucial either for the formation of certain types of interfaces or for preventing the formation of other types. The role of the identified residues in the formation of an interface was then examined in the three-dimensional structure by generating the relevant interfaces. Interface residues are defined as those residues whose accessible surface area decreases by more than 1 Å² on oligomerization. All numbers referred to hereafter correspond to the numbering given in Figure 8. According to this numbering, the residue ranges in the third, fourth, fifth and sixth strands in the back β-sheet are 80–86, 199–205, 210–216 and 224–230, respectively.
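The interface-residue criterion can be written down directly. In this minimal sketch the per-residue accessible surface areas are assumed to have been computed beforehand by some external program for the isolated subunit and for the oligomer; the function name and the dictionary input format are hypothetical.

```python
def interface_residues(asa_free, asa_complex, cutoff=1.0):
    """Residues whose accessible surface area drops by more than `cutoff` Å².

    asa_free / asa_complex map residue number -> ASA (Å²) in the isolated
    subunit and in the oligomer. Residues missing from asa_complex are
    treated as unchanged (an assumption of this sketch).
    """
    return sorted(
        res
        for res, free in asa_free.items()
        if free - asa_complex.get(res, free) > cutoff
    )
```

With invented values, a residue whose area falls from 45.0 to 3.0 Å² is reported, while one that falls from 30.2 to 29.5 Å² is not, because the decrease is below the 1 Å² threshold.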
II-Type interfaces
In those lectins that do not form a II-type interface, the amino acid residue at position 66 is charged (Lys, Lys, Glu, Glu in EcorL, WBAI, WBAII, GS4, respectively). In the case of PNA, which does not have a strict II-type interface, the site is occupied by Met, a large hydrophobic residue. Indeed, modelling of these lectins into a II-type interface shows severe short contacts and burial of these amino acid residues. The amino acids at this position in all the other lectins with a II-type interface are those with small polar or non-polar side chains (Ser, Thr or Ala) (Figure 11a). They are also involved in van der Waals or hydrogen-bonded interactions at the II-type interface. There are other sequence differences between the II-type and X-type classes in this stretch of sequence. A conserved charged residue in the WBA group (with X3-type interface) at position 14 (Glu, His, Glu in WBAII, WBAI and EcorL, respectively) comes in close contact with the charged amino acid at position 3 (Glu, Lys and Glu in WBAII, WBAI and EcorL, respectively) making unfavourable interactions in a II-type interface. Similarly in PNA and GS4, charged residues (Arg and Lys, respectively) at position 241 make short contacts and get buried between amino acid residues at positions 17 and 21 in a II-type interface. From the analysis, it appears that the residue at position 66 is completely discriminatory and can act as a switch preventing the formation of a II-type interface. This information can be used to predict whether a II-type interface is possible for a lectin. For example, it can be predicted from a sequence alignment that BPL which has an Arg at this position will not form a II-type interface while LTA or GS2 which have Thr/Ser will probably form a II-type interface.
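The position-66 rule stated above amounts to a small lookup. The sketch below encodes only the residue types named in the text and returns "undetermined" for anything else; the function name and the returned labels are hypothetical, and the input is the three-letter code of the residue found at position 66 in the alignment numbering.

```python
# Residue types taken from the text: small side chains are seen in lectins
# with a II-type interface; charged residues or Met in those without one.
PERMISSIVE = {"Ser", "Thr", "Ala"}
BLOCKING = {"Lys", "Glu", "Arg", "Met"}


def predict_ii_type(residue_66):
    """Classify the likelihood of a II-type interface from the residue at position 66."""
    if residue_66 in PERMISSIVE:
        return "probable"
    if residue_66 in BLOCKING:
        return "unlikely"
    return "undetermined"
```

For instance, an Arg at this position (as in BPL) gives "unlikely", while Thr or Ser (as in LTA or GS2) gives "probable".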
Prabu et al. have shown that the X2-, X3- and X4-type dimers can be generated from each other by a rotation of one subunit with respect to the other about an axis perpendicular to the plane of the dyads (Prabu et al., 1999). A comparison of the residues involved in inter-subunit contacts of each X-type interface and the corresponding regions in the other lectins was performed. The sequence alignment revealed residues that are unique to interfaces of a particular type. For example, in the X3-type interface (WBAI, EcorL and WBAII), the amino acids Arg and Lys at positions 84 and 203 are unique to this group of lectins and, in fact, both these residues make strong hydrogen-bonding interactions across the interface. Arg84 and Lys203 belong to the third and fourth β-strands of the back β-sheet, respectively, and possibly facilitate the formation of this kind of interface. Conversely, the presence of an Arg at 210 in all the lectins of the ConA group (X2-type) could prevent the formation of an X3-type interface by this group. Modelling of an X3-type interface using lectins of this group resulted in short contacts between two Arg210 residues related by a 2-fold axis (Figure 11b).
The X4-type interface provides for a large number of inter-subunit contacts. In fact, residues from all six β-strands of the back β-sheet participate in the dimer formation. Within this group, comprising PNA and GS4, the interfaces are not exactly identical; there is a small rotation of the subunits relative to each other (Prabu et al., 1999). A comparison of the relevant stretches of sequence and modelling of a GS4 type of interface using PNA revealed that Leu82 in PNA leads to severe short contacts with Ile223, which are relieved in the actual PNA interface, while in GS4 the corresponding residues are Tyr and Asp, which are involved in good van der Waals interactions. A comparison of the sequences of the SBA group (X1-type) showed that at position 210 a Leu is present in four of the six sequences (SBA, PHAL, DBL and DB58). Generation of an X4-type dimer using this group of lectins showed short contacts of this residue with its 2-fold related one (Figure 11c). The fifth lectin in this group, UEAII, has a Ser at this position, but an Arg at position 205 that is unique to this lectin gets buried in an X4-type interface. In the ConA group (X2-type), the residue at position 210 is an Arg that gets buried when an X4-type interface is generated (Figure 11d). This residue of the ConA group also gets buried when an X1-type interface is generated. The corresponding region in the WBA group reveals a unique Lys at position 203 that makes unacceptable steric contacts and gets buried in an X4-type (Figure 11e) or X1-type (Figure 11f) interface. As discussed earlier, this residue is also responsible for making favourable interactions in the X3-type interface of the WBA group. Thus, it appears that the location of Lys at this position could be responsible for the formation of the native dimer and for preventing the formation of other kinds of interfaces. Similarly, it appears that the Arg at position 210 in the ConA group is most probably responsible for this group of lectins not forming an X1-, X3- or X4-type interface. All the residues discussed above belong to one of the three strands of the back β-sheet that are common to X-type interfaces.
Obviously, it is not possible to pinpoint from the sequence alignment the particular amino acid residues that are involved in the formation or prevention of all four types of X-type interfaces. Although the formation of an interface is the result of the cumulative effect of all the residues present in the interface, the above analysis shows that, at least in some cases, crucial residues responsible for oligomerization can probably be identified from the alignment of sequences. This information can provide a basis for mutational studies to evaluate the role of key amino acid residues responsible for variations in modes of oligomerization.
References
Bajaj,M. and Blundell,T. (1984) Annu. Rev. Biophys. Bioeng., 13, 453–492.
Bairoch,A. and Apweiler,R. (1997) Nucleic Acids Res., 25, 31–36.
Banerjee,R., Mande,S.C., Ganesh,V., Das,K., Dhanaraj,V., Mahanta,S.K., Suguna,K., Surolia,A. and Vijayan,M. (1994) Proc. Natl Acad. Sci. USA, 91, 227–231.
Banerjee,R., Das,K., Ravishankar,R., Suguna,K., Surolia,A. and Vijayan,M. (1996) J. Mol. Biol., 259, 281–296.
Barton,G.J. (1990) Methods Enzymol., 183, 403–428.
Barton,G.J. (1993) Protein Eng., 6, 37–40.
Barton,G.J. and Sternberg,M.J. (1987) Protein Eng., 1, 89–94.
Berman,H.M., Westbrook,J., Feng,Z., Gilliland,G., Bhat,T.N., Weissig,H., Shindyalov,I.N. and Bourne,P.E. (2000) Nucleic Acids Res., 28, 235–242.
Bouckaert,J., Hamelryck,T., Wyns,L. and Loris,R. (1999) Curr. Opin. Struct. Biol., 9, 572–577.
Bourne,Y., Abergel,C., Cambillau,C., Frey,M., Rouge,P. and Fontecilla-Camps,J.C. (1990) J. Mol. Biol., 214, 571–584.
Cheng,W., Bullit,E., Bhattacharyya,L., Brewer,C.F. and Makowski,L. (1998) J. Biol. Chem., 273, 35016–35022.
Dao-Thi,M.H., Rizkallah,P., Wyns,L., Poortmans,F. and Loris,R. (1998) Acta Crystallogr., D54, 844–847.
Dessen,A., Gupta,D., Sabesan,S., Brewer,C.F. and Sacchettini,J.C. (1995) Biochemistry, 34, 4933–4942.
Delbaere,L.T.J., Vandonselaar,M., Prasad,L., Quail,J.W., Wilson,K.S. and Dauter,Z. (1993) J. Mol. Biol., 230, 950–965.
Einspahr,H., Parks,E.H., Suguna,K., Subramanian,E. and Suddath,F.L. (1986) J. Biol. Chem., 261, 16518–16527.
Felsenstein,J. (1985) Evolution, 39, 783–791.
Hardman,K.D. and Ainsworth,C.F. (1972) Biochemistry, 11, 4910–4919.
Hamelryck,T.W., Dao-Thi,M.H., Poortmans,F., Chrispeels,M.J., Wyns,L. and Loris,R. (1996) J. Biol. Chem., 271, 20479–20485.
Hamelryck,T.W., Loris,R., Bouckaert,J., Dao-Thi,M.H., Strecker,G., Imberty,A., Fernandez,E., Wyns,L. and Etzler,M.E. (1999) J. Mol. Biol., 286, 1161–1177.
Imberty,A., Gautier,C., Lescar,J., Pérez,S., Wyns,L. and Loris,M. (2000) J. Biol. Chem., 275, 17541–17548.
Jones,S. and Thornton,J.M. (1995) Prog. Biophys. Mol. Biol., 63, 31–65.
Kraulis,P. (1991) J. Appl. Crystallogr., 24, 946–950.
Lis,H. and Sharon,N. (1998) Chem. Rev., 98, 637–674.
Livingstone,C.D. and Barton,G.J. (1993) CABIOS, 9, 745–756.
Loris,R., Steyaert,J., Maes,D., Lisgarten,J., Pickersgill,R. and Wyns,L. (1993) Biochemistry, 32, 8772–8781.
Loris,R., Hamelryck,T., Bouckaert,J. and Wyns,L. (1998) Biochim. Biophys. Acta, 1383, 9–36.
Manoj,N., Srinivas,V.R., Surolia,A., Vijayan,M. and Suguna,K. (2000) J. Mol. Biol., 302, 1129–1137.
Needleman,S.B. and Wunsch,C.D. (1970) J. Mol. Biol., 48, 443–453.
Prabu,M.M., Sankaranarayanan,R., Puri,K.D., Sharma,V., Surolia,A., Vijayan,M. and Suguna,K. (1998) J. Mol. Biol., 276, 787–796.
Prabu,M.M., Suguna,K. and Vijayan,M. (1999) Proteins: Struct. Funct. Genet., 13, 1–12.
Reeke,G.N. and Becker,J.W. (1986) Science, 234, 1108–1111.
Rossmann,M.G. and Argos,P. (1975) J. Biol. Chem., 250, 7525–7532.
Rozwarski,D.A., Swami,B.M., Brewer,C.F. and Sacchettini,J.C. (1998) J. Biol. Chem., 273, 32818–32825.
Russell,R.B. and Barton,G.J. (1992) Proteins: Struct. Funct. Genet., 14, 309–323.
Sanz-Aparicio,J., Hermoso,J., Grangeiro,T.B., Calvete,J.J. and Cavada,B.S. (1997) FEBS Lett., 405, 114–118.
Schwartz,R.M. and Dayhoff,M.O. (1978) In Dayhoff,M.O. (ed.), Atlas of Protein Sequence and Structure, Vol. 5. National Biomedical Research Foundation, Washington DC, pp. 353–358.
Shaanan,B., Lis,H. and Sharon,N. (1991) Science, 254, 862–866.
Sharma,V. and Surolia,A. (1997) J. Mol. Biol., 267, 433–445.
Swamy,M.J., Sastry,M.V.K. and Surolia,A. (1985) J. Biosci., 9, 203–212.
Vijayan,M. and Chandra,N. (1999) Curr. Opin. Struct. Biol., 9, 707–714.
Weis,W.I. and Drickamer,K. (1996) Annu. Rev. Biochem., 65, 441–473.
Young,N.M. and Oomen,R.P. (1992) J. Mol. Biol., 228, 924–934.
Received November 21, 2000; revised July 19, 2001; accepted July 31, 2001.