1Department of Physiology and Biophysics and 2Institute for Computational Biomedicine, Weill Medical College of Cornell University, 1300 York Avenue, New York, NY 10021, USA
3 To whom correspondence should be addressed. E-mail: haw2002{at}med.cornell.edu
Abstract
Keywords: electron microscopy/membrane proteins/MFS transporter/molecular modeling
Introduction
Recent progress in electron microscopy (EM) image collection and data processing has allowed the structure determination of several membrane proteins (MPs) at intermediate resolution (7 Å), thus establishing this approach as a valuable substitute for data from X-ray crystallography for such systems (Unger, 2001; Subramaniam et al., 2002). MPs for which EM has permitted structure determination to a resolution where individual transmembrane (TM) domains can be discerned include rhodopsin (Unger et al., 1997), halorhodopsin (Havelka et al., 1995), bacteriorhodopsin (Henderson et al., 1990), aquaporin (Murata et al., 2000; Ren et al., 2001), the protein translocase SecYEG (Breyton et al., 2002), the oxalate transporter OxlT (Hirai et al., 2002; Heymann et al., 2003), the microsomal glutathione transferase MGST1 (Schmidt-Krey et al., 2000), the multidrug transporter EmrE (Ubarretxena-Belandia et al., 2003) and the NhaA transporter (Williams, 2000).
In the absence of structural data at or near atomic resolution, key information available from low-resolution EM maps, such as the location of the TM domains, can serve as a valuable guide for molecular modeling. The successful modeling of the TM domains of G-protein coupled receptors (GPCRs) from a low-resolution (7 Å) EM map of rhodopsin (e.g. see Ballesteros and Weinstein, 1995; Baldwin et al., 1997; Unger et al., 1997; Filizola et al., 1998; Visiers et al., 2002) indicates the utility of this approach; its validity is evidenced by the favorable comparison to the subsequently determined high-resolution structure (Palczewski et al., 2000). The availability of a large body of experimental data and a large number of known sequences of GPCRs strengthened the usefulness of the template, enabling the development of a Cα-carbon template with high similarity to the crystal structure (Baldwin et al., 1997).
With the current availability of both low-resolution EM maps and subsequently determined high-resolution crystal structures for several MPs (e.g. rhodopsin, bacteriorhodopsin, aquaporin, MFS transporters, EmrE), it is timely to evaluate the potential of several molecular modeling techniques to complement low-resolution EM maps in the construction of 3D models of MPs. Here, we evaluated a protocol for EM data-supported modeling of MPs that consists of three steps: (1) the identification of TM domains from sequence, (2) the assignment of buried and lipid-exposed faces of the TM domains and (3) the assembly of the TM domains into a bundle based on geometric restraints obtained from the EM data. Many methods are available for step 1 and the parsing of an MP sequence into TM segments and loops is possible with 90% accuracy (Nilsson et al., 2000; Krogh et al., 2001; Bertaccini and Trudell, 2002; Chen et al., 2002). For step 2, we have recently reported an algorithm that predicts whether residues are located in the interior or on the lipid-exposed surface of the MP with 70–80% accuracy (Beuming and Weinstein, 2004). This can serve to orient the TM domains based on the predictions and on the estimation of the helical axis from the EM map (step 3).
We report here that the application of the three protocol steps to model the TM domains of the 7-TM proteins rhodopsin, bacteriorhodopsin and halorhodopsin based on 7 Å resolution maps resulted in models within 3.0, 3.9 and 3.1 Å r.m.s.d., respectively, from the corresponding crystal structures. These results for the 7-TM MPs correspond well to those obtained with a similar method described recently (Fleishman et al., 2004).
To test whether the protocol could be successfully applied to MPs with a different topology (for which low- and high-resolution structures were not both available simultaneously), we generated geometric constraints artificially from crystal structures, equivalent to those obtained from the EM data. Models of 12-TM MPs (ACRB, LacY and GlpT) generated from these artificial restraints using the three-step protocol were all found to be between 3.7 and 4.7 Å r.m.s.d. from the crystal structures.
Finally, we predicted the structure of the oxalate transporter of Oxalobacter formigenes. This is a member of the major facilitator superfamily (MFS) and has been the subject of both functional and structural studies (Fu et al., 2001; Kim et al., 2001; Hirai et al., 2002, 2003; Ye and Maloney, 2002; Heymann et al., 2003). A projection structure of 3.4 Å was reported recently (Heymann et al., 2003) that allowed a proposed assignment of the 12 electron densities in the map to the specific transmembrane domains in the sequence (Hirai et al., 2003). This proposed topology for MFS transporters has been confirmed by the recent crystal structures of the lactose permease LacY (Abramson et al., 2003) and the glycerol-3-phosphate transporter GlpT (Huang et al., 2003). The progress in obtaining a high-resolution structure for OxlT, evident in recent publications (Heymann et al., 2003; Hirai and Subramaniam, 2004), suggests that the anticipated structural data will enable an unbiased assessment of the quality of the modeling protocol proposed here and we therefore present a model for the TM domains of OxlT constructed with this protocol. Given the recent progress in structure determination with EM, such a validated modeling protocol should have wide utility in modeling the many MPs for which atomic resolution structures are not yet available.
Materials and methods
EM densities are typically visualized in publications by a series of cross-sections normal to the membrane plane. The center of the helices can be determined at a particular position relative to the center of the map and connecting these centers can approximate the helix axis. For rhodopsin, bacteriorhodopsin and halorhodopsin, the original publications (Henderson et al., 1990; Havelka et al., 1995; Unger et al., 1997) contain images of cross-sections separated by 20 Å and only these two cross-sections were used to extract distance restraints for estimating the helix axis.
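As a minimal sketch of how two cross-sections can define a helix axis, the following assumes that the density center of one helix has been read off as an (x, y) point on each of two sections 20 Å apart. The function name `helix_axis` and the coordinate convention (membrane normal along z) are illustrative assumptions, not part of the published protocol.

```python
import math

def helix_axis(center_top, center_bottom, separation=20.0):
    """Estimate a helix axis from two density centers.

    center_top, center_bottom: (x, y) positions in Å of the same helix on two
    cross-sections that are `separation` Å apart along the membrane normal (z).
    Returns the unit axis vector and the tilt from the membrane normal (degrees).
    """
    dx = center_top[0] - center_bottom[0]
    dy = center_top[1] - center_bottom[1]
    length = math.sqrt(dx * dx + dy * dy + separation * separation)
    axis = (dx / length, dy / length, separation / length)
    tilt = math.degrees(math.acos(axis[2]))
    return axis, tilt

# A helix whose center moves 20 Å sideways over 20 Å of depth is tilted by 45 degrees.
axis, tilt = helix_axis((20.0, 0.0), (0.0, 0.0))
```

With only two sections per helix, any curvature or kink between them is invisible, so the axis is necessarily that of a straight helix.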
Because the number of MPs for which both low- and high-resolution structures are available is still limited, we also extracted restraints equivalent to those available from the EM maps from crystal structures. To obtain these artificial restraints, the membrane spanning part of the MP was taken to be the 25 Å part of the structure with the highest average hydrophobicity [using the Kyte and Doolittle (1982) scale]. The centers of mass of the first and last four residues were determined for the segment of the helix located within this 25 Å slab and these were subsequently used as the distance restraints. Applying this procedure to the crystal structures of rhodopsin, bacteriorhodopsin and halorhodopsin results in distance restraints similar to those obtained from the corresponding EM maps, as the average distance between corresponding restraint points is 2.1 Å. The models obtained using either the EM or X-ray derived restraints are very similar (r.m.s.d. <2.2 Å), as is the accuracy of the models, measured as the r.m.s.d. to the crystal structure (see Table I).
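The slab selection and end-point extraction described above can be sketched as follows. This is an illustration under stated assumptions: the structure is taken to be already oriented with the membrane normal along z, the window is scanned in 1 Å steps, and each residue is represented by its Cα position, so the "center of mass" is approximated by a Cα centroid. The function names `best_slab` and `end_restraints` are hypothetical.

```python
# Kyte and Doolittle (1982) hydropathy values, one-letter amino acid codes.
KD = {"A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5,
      "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9,
      "M": 1.9, "F": 2.8, "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9,
      "Y": -1.3, "V": 4.2}

def best_slab(residues, thickness=25.0, step=1.0):
    """residues: list of (z, one_letter_code). Returns (z_min, z_max) of the
    slab of the given thickness with the highest average hydropathy."""
    zs = [z for z, _ in residues]
    best, best_score = None, float("-inf")
    z0 = min(zs)
    while z0 + thickness <= max(zs) + step:
        inside = [KD[aa] for z, aa in residues if z0 <= z < z0 + thickness]
        if inside:
            score = sum(inside) / len(inside)
            if score > best_score:
                best, best_score = (z0, z0 + thickness), score
        z0 += step
    return best

def end_restraints(helix_ca, slab):
    """helix_ca: Cα coordinates (x, y, z) of one helix in sequence order.
    Returns the centroids of the first and last four residues inside the slab."""
    inside = [p for p in helix_ca if slab[0] <= p[2] < slab[1]]
    if len(inside) < 4:
        raise ValueError("fewer than four residues of this helix lie in the slab")
    def centroid(points):
        return tuple(sum(c) / len(points) for c in zip(*points))
    return centroid(inside[:4]), centroid(inside[-4:])
```

The two centroids returned for each helix play the same role as the two density centers read from the EM cross-sections.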
The orientation of the TM domains relative to the membrane normal (and to each other) is an essential component in modeling helix pair interfaces. To this end, it is more useful to predict which residues are located at the center of the lipid bilayer, rather than the limits of the TM domains. For an ideal helix the z-coordinate of each residue in the TM is automatically determined if the central residue in a TM is correctly identified and the tilt value of the TM is known from EM data.
A number of TM algorithms were tested for the present protocol [TOPPRED2 (von Heijne, 1992), ORIENTM (Liakopoulos et al., 2001), HMMTOP (Tusnady and Simon, 2001), TMHMM (Sonnhammer et al., 1998) and MEMSAT (Jones et al., 1994)], evaluating their ability to predict the central residue for each of the TMs in six MPs: rhodopsin, bacteriorhodopsin, halorhodopsin, GlpT, LacY and the ACRB transporter (a total of 57 TMs). Consensus in prediction has previously been shown to be a successful approach to TM topology prediction (Nilsson et al., 2000; Ikeda et al., 2002) and a consensus system is used here to predict the central residue in each TM from the algorithms mentioned above. Comparison with the crystal structures revealed that the average error in the predicted location of the central residue is between 2.1 and 2.8 residues, for all five tested methods (TOPPRED2, 2.8; ORIENTM, 2.6; HMMTOP, 2.2; TMHMM, 2.7; MEMSAT, 2.1). Optimal consensus results were obtained by calculating the average location of the centers predicted with each of the five methods. This reduced the average error to 2.0 residues, corresponding to an average error of 3 Å, based on the translation of 1.5 Å per residue in an α-helix.
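The consensus step and the resulting z-coordinates can be illustrated with a short sketch. It assumes that each of the five methods has already been reduced to a single predicted central residue number, and that the helix is ideal with a rise of 1.5 Å per residue along its axis, so the rise along the membrane normal is scaled by the cosine of the tilt. The function names and the example residue numbers are hypothetical.

```python
import math

def consensus_center(predicted_centers):
    """Average the central residue predicted by several methods and
    round to the nearest residue number."""
    return round(sum(predicted_centers) / len(predicted_centers))

def z_coordinates(first, last, center, tilt_deg, rise=1.5):
    """z-coordinate (Å, relative to the bilayer center) of each residue of an
    ideal helix, given its central residue and its tilt from the membrane normal."""
    cos_tilt = math.cos(math.radians(tilt_deg))
    return {i: (i - center) * rise * cos_tilt for i in range(first, last + 1)}

# Hypothetical predictions from five methods for one TM segment.
center = consensus_center([100, 102, 104, 101, 103])
z = z_coordinates(center - 12, center + 12, center, tilt_deg=20.0)
```

In this picture an error of two residues in the predicted center displaces an untilted helix by 3 Å along the membrane normal, which is the figure quoted above.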
Prediction of the interior and exposed faces
A method for predicting interior and exterior surfaces has been described in detail in a recent publication (Beuming and Weinstein, 2004). Briefly, the bias of certain residue types to be preferentially buried or exposed in MPs (expressed as the surface propensity, SP) is combined with a conservation criterion that is based on the observation that conserved residues are mostly found in the interior of MPs. The prediction method involves calculation of the conservation index (CI) and the average value of the SP (SPav) for each position in a multiple sequence alignment and the definition of the interior facing probability (Pinside) as the average value of the CI and (1 − SPav) (Beuming and Weinstein, 2004).
The definition of a face (e.g. interior face) is based on the partition of the orientations in an α-helix according to Figure 1. The seven possible faces on the helix correspond to the number of residues per turn. If the 7 residues are labeled A–G, then seven (overlapping) faces can be defined as AEB, EBF, BFC, FCG, CGD, GDA and DAE. In a 21-residue helix, there are three positions A, three positions B, etc., and each face is composed of nine residues. The most probable interior face is taken as the face with the highest average Pinside for the nine positions.
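A minimal sketch of the face selection follows. It assumes that CI and SPav have both been scaled to the range 0 to 1, and that residue i of the helix carries the label "ABCDEFG"[i % 7]; both are illustrative assumptions, and the function names are hypothetical. The full definition of CI and SP is given in Beuming and Weinstein (2004) and is not reproduced here.

```python
FACES = ["AEB", "EBF", "BFC", "FCG", "CGD", "GDA", "DAE"]
LABELS = "ABCDEFG"

def p_inside(ci, sp_av):
    """Interior-facing probability as the average of CI and (1 - SPav)."""
    return 0.5 * (ci + (1.0 - sp_av))

def best_face(p_values):
    """p_values: Pinside for consecutive helix residues (e.g. 21 values).
    Returns the face with the highest average Pinside and all face scores."""
    scores = {}
    for face in FACES:
        members = [p for i, p in enumerate(p_values) if LABELS[i % 7] in face]
        scores[face] = sum(members) / len(members)
    best = max(scores, key=scores.get)
    return best, scores

# Hypothetical 21-residue helix in which positions B, F and C are conserved and buried.
profile = [0.9 if LABELS[i % 7] in "BFC" else 0.1 for i in range(21)]
face, scores = best_face(profile)
```

Because neighboring faces share two of their three labels, the scores change smoothly around the helix, and a near tie between adjacent faces corresponds to only a modest rotational uncertainty.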
Bundles of ideal helices (25 residues long), representing the TM segments of these MPs, were assembled using the distance restraints obtained as described above. The predicted centers of the TMs were aligned and the helices were rotated around their axis to align the face with the highest interior probability to coincide with the inward-facing bisector of the angle formed between the center of a TM and those of its two closest neighbors. In the case of completely buried TMs, the interior face was oriented towards the center of the bundle. Large clashes between residues were removed by modifying torsion angles of side chains or by small manual translations and rotations of the complete TM domains. The structures were further relaxed by a brief minimization (1000 steps), using the Powell algorithm, as implemented in the SYBYL package (SYBYL 6.9; Tripos, St Louis, MO).
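The orientation target used in this step can be sketched in two dimensions. The code below assumes that helix centers are given as (x, y) points in the membrane plane and that the current direction of the predicted interior face is known as an angle in that plane; the function names are hypothetical. It covers only the bisector construction, not the clash removal or the minimization.

```python
import math

def inward_bisector(center, other_centers):
    """Direction (degrees, in the membrane plane) of the bisector of the angle
    formed at `center` by its two closest neighboring helix centers."""
    nearest = sorted(other_centers, key=lambda p: math.dist(center, p))[:2]
    ux = uy = 0.0
    for p in nearest:
        d = math.dist(center, p)
        ux += (p[0] - center[0]) / d
        uy += (p[1] - center[1]) / d
    if math.hypot(ux, uy) < 1e-9:
        # Neighbors lie on opposite sides: the bisector is undefined.
        raise ValueError("neighbors are collinear with the helix center")
    return math.degrees(math.atan2(uy, ux))

def rotation_needed(face_angle, target_angle):
    """Signed rotation (degrees, in the range -180 to 180) about the helix
    axis that brings the face direction onto the target direction."""
    return (target_angle - face_angle + 180.0) % 360.0 - 180.0

target = inward_bisector((0.0, 0.0), [(10.0, 0.0), (0.0, 10.0), (50.0, 50.0)])
turn = rotation_needed(350.0, target)
```

For a completely buried helix the text instead points the face at the center of the bundle, which would replace `inward_bisector` by the direction toward the mean of all helix centers.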
Results and discussion
The model structures of rhodopsin, bacteriorhodopsin and halorhodopsin obtained with the protocol outlined in Materials and methods (using restraints from the EM maps), reproduced their crystallographic structures with 3.0, 3.9 and 3.1 Å r.m.s.d., respectively. Using the artificial restraints, derived from the crystal structures as described in the section Restraints, the protocol was applied to the glycerol-3-phosphate transporter (GlpT), the lactose permease (LacY) and the multidrug efflux transporter ACRB. Using these parameters, the crystallographic structures were reproduced within 3.7, 4.7 and 4.1 Å r.m.s.d., respectively. Details of the modeling accuracy are summarized in Table I. The models with the highest (rhodopsin) and lowest (LacY) accuracy, superimposed on the crystal structures, are shown in Figure 2.
Errors in topology predictions, in the prediction of interior and lipid-exposed faces and the inaccurate location of the helical axis in the TM bundle can all contribute to the aggregate error of the model. The error due to topology predictions is termed Z-shift, the error in prediction of faces is termed XY-rotation and the inaccurate location of the helical axis in the bundle is termed XY-shift. Values calculated for these individual contributions to prediction errors are listed in Table I for the six test cases.
Despite the high accuracy of current topology prediction methods, there is still a substantial Z-shift for most TMs in the models (average 2.2 Å for all TMs). There is a very large Z-shift for a few TMs (four TMs >6 Å) and these outliers have a strong effect on the overall r.m.s.d. For instance, the ACRB transporter has 4.0 Å r.m.s.d. to the crystal structure, but this value is reduced to 3.3 Å if TM12 (which has a Z-shift of 8.3 Å) is ignored. The dependence of the error of the models on the magnitude of the Z-shift is evidenced by the high correlation between Z-shift and r.m.s.d. values for individual TMs (see Figure 3). Correspondingly, removing Z-shift of all the TMs in all the models lowers the average r.m.s.d. for all TM domains from 3.6 to 2.8 Å.
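The disproportionate weight of a single outlier follows from the squared deviations in the r.m.s.d. A small sketch, assuming that per-TM r.m.s.d. values were all measured after one common superposition and can therefore be pooled by atom count, illustrates this. The per-TM numbers below are hypothetical and are not taken from Table I.

```python
import math

def pooled_rmsd(per_tm):
    """per_tm: list of (n_atoms, rmsd) pairs measured after one common
    superposition. Returns the r.m.s.d. over all atoms pooled together."""
    total_atoms = sum(n for n, _ in per_tm)
    sum_sq = sum(n * r * r for n, r in per_tm)
    return math.sqrt(sum_sq / total_atoms)

# Hypothetical bundle: eleven helices at 3.0 Å and one outlier at 8.0 Å.
tms = [(25, 3.0)] * 11 + [(25, 8.0)]
with_outlier = pooled_rmsd(tms)
without_outlier = pooled_rmsd(tms[:-1])
```

In this made-up example one helix out of twelve raises the pooled value from 3.0 Å to about 3.7 Å, the same kind of effect as reported for TM12 of ACRB.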
The prediction of interior and lipid-exposed faces is accurate only to 70–80% (Beuming and Weinstein, 2004), but provides a satisfactory orientation of most helices in the XY-plane. In Figure 2, the predicted interior faces in the models of rhodopsin and LacY (blue/purple) and the corresponding faces in the structures (yellow/orange) are identified as closed spheres. For these models, all TMs have good overlap between the observed and predicted faces, except for TM4 in LacY, where the predicted interior face is in fact located at the lipid interface. In contrast to the effect of Z-shift on the r.m.s.d. values, the correlation between the XY-rotation and the overall accuracy is low (see Figure 3) and removing the XY-rotation manually from the model reduces the average r.m.s.d. only slightly, from 3.6 to 3.4 Å.
The XY-shift can result from the inaccurate extraction of the set of distances from the EM maps or from deviation of the TM from ideality (i.e. structural distortions in the helix). The average values of the XY-shift for the models generated from the EM data derived restraints (Table Ia) or from the crystal structure derived restraints (Table Ib) are similar (1.3 vs 1.1 Å), indicating that inaccuracy in the distance restraints does not contribute much to the error. However, the difference in XY-shifts between distorted helices and ideal helices is large. For instance, a comparison of the structures of homologs LacY and GlpT reveals that most of the TMs of LacY are unusually distorted, whereas the helices of GlpT are relatively ideal and this difference is reflected in higher XY-shifts for LacY (1.6 Å) than for GlpT (1.0 Å).
Taken together, the results of the present test show that the combination of the three-step protocol with an EM map generates models of MPs with an average of 3.6 Å r.m.s.d. from the crystal structure. An obvious limitation of the model produced by this protocol remains the modeling of TM domains as ideal α-helices. The functional significance of distortions in TM helices is well documented (e.g. see Sansom and Weinstein, 2000) and meaningful modeling studies must incorporate information about distortions. To test the impact of neglecting the deviations from ideality on the modeling accuracy, the exact distortions of the TM domains as observed in the crystal structures were introduced into the models. Surprisingly, including these distorted helices lowered the overall r.m.s.d. only slightly (from 3.6 to 3.5 Å). However, for a number of helices there was a significant effect of introducing these distortions on the modeling accuracy. We observed that for the affected helices, the buried and exposed faces had been predicted with much higher accuracy (average XY-rotation value of 20°) than those for which inclusion of distortions had no effect (average XY-rotation value of 40°). Notably, no such difference was observed for Z-shift values. It appears, therefore, that a correct prediction of the buried face of a helix is a strong prerequisite for successful refinement of the model by introducing distortions. A striking example of this effect is the difference between the prediction of TM7 in the structural homologs bacteriorhodopsin and halorhodopsin (see Figure 4). The nature of the TM7 distortion in both proteins is identical (a centrally located pi-bulge). Introducing the distortion in the bacteriorhodopsin model does not improve the model (r.m.s.d. 3.3 Å instead of 3.4 Å for TM7), whereas introducing the same distortion in the halorhodopsin model increases the accuracy substantially (r.m.s.d. 2.0 Å instead of 2.8 Å for TM7). Our observations indicated that this difference can be directly attributed to the incorrect prediction of the interior face of TM7 in bacteriorhodopsin and the correct prediction in halorhodopsin. This is evident in Table Ia and Figure 4; note the correct alignment of buried faces in the model of halorhodopsin compared with the structure (Figure 4c), but their misalignment in bacteriorhodopsin (Figure 4a).
Following the procedure outlined in the section Restraints and tested as described above, we generated a model for the TM domains of OxlT based on the OxlT density map (Hirai et al., 2002). The alignment used for the prediction of interior and exterior faces according to Beuming and Weinstein (2004) consists of 21 sequences, all with a sequence identity >25% to the OxlT target sequence. Figure 5 shows the central 18 residues of each predicted TM on a helical wheel, with the central residue in a black-lined box. To show the details of predicted faces, the five residues in each TM that are closest to its TM neighbor are indicated in the color code of that particular neighboring TM. For instance, the face of TM3 that packs against TM6 is indicated with green boxes and the face of TM3 that packs against TM4 is colored magenta.
Clearly, the model presented here is in excellent agreement with these experimental observations and should be very useful for obtaining further inferences that can guide the design and interpretation of various experiments in a structural context. We expect the quantitative accuracy of the model to be revealed by a direct comparison to a high-resolution structure, when it becomes available.
Conclusions
The three-step protocol presented here, in combination with available structural data from EM experiments, was shown to produce accurate atomic models of MPs. As a validation of the approach, the TM domains of six MPs were modeled and shown to be within 3.0–4.7 Å r.m.s.d. of their crystal structures.
Analysis of the source of errors implicated incorrect orientation along the membrane normal (Z-shift), as well as incorrect prediction of interior and surface exposed faces (XY-rotation), a displacement of the helical axis (XY-shift) and the neglect of structural distortions within TMs. Z-shift is the source of relatively large prediction errors for all test cases, whereas errors due to XY-rotation or neglect of distortions were smaller. However, a correct prediction of the interior and surface exposed faces appears to be a prerequisite for successful refinement of the model by including distorted helices. Thus, improvements of different aspects of the model were found to have a synergistic effect on the overall accuracy and are not merely additive. To allow further improvement of the results from such a modeling protocol, it will therefore be necessary to improve specific component predictions. This can be achieved by utilizing additional structural constraints as they become available, e.g. new data about specific residue–residue interactions. Notably, we showed that when the same three-step protocol was used to model the TM domains of OxlT, for which there currently is no structure available at atomic resolution, the results were most encouraging because of the remarkable agreement between the model and all experimental observations about this molecule, including a large number of functional inferences that were not at all considered in the construction of the model. Only the anticipated high-resolution structure of OxlT will allow an unbiased assessment of the performance of the current modeling method through direct structural comparison. To this end, the coordinates of the OxlT model are given as Supplementary data (available at PEDS Online).
Acknowledgements
References
Baldwin,J.M., Schertler,G.F. and Unger,V.M. (1997) J. Mol. Biol., 272, 144–164.
Ballesteros,J.A. and Weinstein,H. (1995) Methods Neurosci., 25, 366–428.
Bertaccini,E. and Trudell,J.R. (2002) Protein Eng., 15, 443–454.
Beuming,T. and Weinstein,H. (2004) Bioinformatics, 20, 1822–1835.
Breyton,C., Haase,W., Rapoport,T.A., Kuhlbrandt,W. and Collinson,I. (2002) Nature, 418, 662–665.
Chen,C.P., Kernytsky,A. and Rost,B. (2002) Protein Sci., 11, 2774–2791.
Filizola,M., Perez,J.J. and Carteni-Farina,M. (1998) J. Comput-Aid. Mol. Des., 12, 111–118.
Fleishman,S.J., Harrington,S., Friesner,R.A., Honig,B. and Ben-Tal,N. (2004) Biophys. J., 87, 3448–3459.
Fu,D., Sarker,R.I., Abe,K., Bolton,E. and Maloney,P.C. (2001) J. Biol. Chem., 276, 8753–8760.
Havelka,W.A., Henderson,R. and Oesterhelt,D. (1995) J. Mol. Biol., 247, 726–738.
Henderson,R., Baldwin,J.M., Ceska,T.A., Zemlin,F., Beckmann,E. and Downing,K.H. (1990) J. Mol. Biol., 213, 899–929.
Heymann,J.A., Hirai,T., Shi,D. and Subramaniam,S. (2003) J. Struct. Biol., 144, 320–326.
Hirai,T. and Subramaniam,S. (2004) Biophys. J., 87, 3600–3607.
Hirai,T., Heymann,J.A., Shi,D., Sarker,R., Maloney,P.C. and Subramaniam,S. (2002) Nat. Struct. Biol., 9, 597–600.
Hirai,T., Heymann,J.A., Maloney,P.C. and Subramaniam,S. (2003) J. Bacteriol., 185, 1712–1718.
Huang,Y., Lemieux,M.J., Song,J., Auer,M. and Wang,D.N. (2003) Science, 301, 616–620.
Ikeda,M., Arai,M., Lao,D.M. and Shimizu,T. (2002) In Silico Biol., 2, 19–33.
Jones,D.T., Taylor,W.R. and Thornton,J.M. (1994) Biochemistry, 33, 3038–3049.
Kim,Y.M., Ye,L. and Maloney,P.C. (2001) J. Biol. Chem., 276, 36681–36686.
Krogh,A., Larsson,B., von Heijne,G. and Sonnhammer,E.L. (2001) J. Mol. Biol., 305, 567–580.
Kyte,J. and Doolittle,R.F. (1982) J. Mol. Biol., 157, 105–132.
Liakopoulos,T.D., Pasquier,C. and Hamodrakas,S.J. (2001) Protein Eng., 14, 387–390.
Murata,K., Mitsuoka,K., Hirai,T., Walz,T., Agre,P., Heymann,J.B., Engel,A. and Fujiyoshi,Y. (2000) Nature, 407, 599–605.
Nilsson,J., Persson,B. and von Heijne,G. (2000) FEBS Lett., 486, 267–269.
Palczewski,K. et al. (2000) Science, 289, 739–745.
Ren,G., Reddy,V.S., Cheng,A., Melnyk,P. and Mitra,A.K. (2001) Proc. Natl Acad. Sci. USA, 98, 1398–1403.
Sansom,M.S. and Weinstein,H. (2000) Trends Pharmacol Sci., 21, 445–451.
Schmidt-Krey,I., Mitsuoka,K., Hirai,T., Murata,K., Cheng,Y., Fujiyoshi,Y., Morgenstern,R. and Hebert,H. (2000) EMBO J., 19, 6311–6316.
Sonnhammer,E.L., von Heijne,G. and Krogh,A. (1998) Proc. Int. Conf. Intell. Syst. Mol. Biol., 6, 175–182.
Subramaniam,S., Hirai,T. and Henderson,R. (2002) Philos. Trans., Ser. A, 360, 859–874.
Tusnady,G.E. and Simon,I. (2001) Bioinformatics, 17, 849–850.
Ubarretxena-Belandia,I., Baldwin,J.M., Schuldiner,S. and Tate,C.G. (2003) EMBO J., 22, 6175–6181.
Unger,V.M. (2001) Curr. Opin. Struct. Biol., 11, 548–554.
Unger,V.M., Hargrave,P.A., Baldwin,J.M. and Schertler,G.F. (1997) Nature, 389, 203–206.
Visiers,I., Ballesteros,J.A. and Weinstein,H. (2002) Methods Enzymol., 343, 329–371.
von Heijne,G. (1992) J. Mol. Biol., 225, 487–494.
Williams,K.A. (2000) Nature, 403, 112–115.
Ye,L. and Maloney,P.C. (2002) J. Biol. Chem., 277, 20372–20378.
Received July 14, 2004; revised November 22, 2004; accepted February 11, 2005.
Edited by Harold Scheraga