(Received for publication, August 15, 1995; and in revised form, November 11, 1995)
The membrane topology of the human Na+/glucose
cotransporter SGLT1 has been probed using N-glycosylation
scanning mutants and nested truncations. Functional analysis proved
essential for establishment of signal-anchor topology. The resultant
model diverges significantly from previously held suppositions of
structure based primarily on hydropathy analysis. SGLT1 incorporates 14
membrane spans. The N terminus resides extracellularly, and two
hydrophobic regions form newly recognized membrane spans 4 and 12; the
large charged domain near the C terminus is cytoplasmic. This model was
evaluated further using two advanced empirically-based algorithms
predictive of transmembrane helices. Helix ends were predicted using
thermodynamically-based algorithms known to predict x-ray
crystallographically determined transmembrane helix ends. Several
considerations suggest the hydrophobic C terminus forms a 14th
transmembrane helix, differentiating the eukaryotic members of the
SGLT1 family from bacterial homologues. Our data inferentially indicate
that these bacterial homologues incorporate 13 spans, with an
extracellular N terminus. The model of SGLT1 secondary structure and
the predicted helix ends signify information prerequisite for the
rational design of further experiments on structure/function
relationships.
SGLT1 is one of a family of homologous Na+-dependent membrane transport proteins currently comprising six eukaryotic and two bacterial homologues and several bacterial genomic open reading frames (1, 2). This family has been assigned membership in a larger superfamily of (nonhomologous) families sharing a common general function, the sodium-coupled transport of solutes, and a common predicted structure of 12 membrane spans (2). The secondary and tertiary structures that underlie the efficient and specific transmembrane shuttling of substrates are wholly unknown.
One step in transporter structural analysis is the determination of membrane topology of hydrophilic and hydrophobic regions. Knowledge of structural domain topology is a prerequisite for the rational design of experiments intended to localize sodium and substrate binding sites. Plots of hydrophobicity (3, 4) provide good indications of transmembrane regions but are subject to interpretation; e.g. analyses of the Escherichia coli lac permease predict between 8 and 14 membrane spans (5).
SGLT1 was originally proposed to form 11 transmembrane domains (6). Subsequent models have deleted or added spans (7, 8, 9). We report the physical determination of SGLT1 membrane topology by N-glycosylation scanning mutagenesis, a method used to map several membrane proteins, including hydroxymethylglutaryl-CoA reductase (10), cystic fibrosis transmembrane conductance regulator (11), and the Glut 1 facilitative glucose transporter (12).
The nested N-glycosylation sequence NNSS was inserted into wild-type SGLT1 at strategic locations, co-introducing an XhoI restriction site (NNSS mutants, see Fig. 1 and Fig. 2). Insertion sites were chosen such that flanking amino acids were among those found at N-glycosylated sites (15). Site-directed mutagenesis used a variation (16) of the Taq DNA polymerase-mediated megaprimer method (17). Mutant clones were screened for Taq-introduced errors by dideoxy sequencing (Sequenase 2.0 kit, U. S. Biochemical Corp.), and NNSS-bearing restriction fragments were swapped into the N248Q mutant, yielding the NNSS mutant series.
Figure 1: N-Glycosylation mutant scanning of the secondary structure model (9) of human SGLT1. N-glycosylation consensus sequences were inserted at the points numbered 1 to 12 in a mutant (N248Q) of SGLT1 that lacked the native N-glycosylation site. Insertion sites were chosen to reside between two putative membrane spans or between a span and one of two hydrophobic regions labeled H1 and H2.
Figure 2: Insertion sites of two sets of N-glycosylation scanning mutants of SGLT1. A, the residues flanking the 12 NNSS insertions generated by site-directed mutagenesis of the mutant N248Q (human SGLT1, which lacks the native N-glycosylation site) are listed. The 12 NG mutants were made from the NNSS mutants by splicing the 42-amino acid hydrophilic loop of SGLT1 bearing the native N-glycosylation consensus into the introduced XhoI site. B, the DNA sequence of the two nested N-glycosylation consensus sequences of the NNSS mutant series, and the 42-amino acid sequence of the NG mutant series as it appears after splicing into the XhoI site of the NNSS mutants. The introduced XhoI site is underlined in the NNSS insert; two XhoI sites flanking the NG insert are shown. The NG mutants also retained amino acids (underlined) introduced in the parent NNSS mutant. The resultant NG mutants had N-glycosylation consensus sequences in the two regions highlighted in boldface.
Native glycosylation (NG) mutants bearing a transposed copy of the natively N-glycosylated loop of SGLT1 (see Fig. 2) were constructed by splicing into the XhoI site of each NNSS mutant. This 42 amino acid-encoding loop of human SGLT1 was synthesized by polymerase chain reaction, incorporating flanking XhoI sites. The XhoI fragment was ligated into each NNSS mutant, and transformants were screened for orientation and sequence fidelity.
Figure 4: Experiments elucidating the topology of the N-terminal region. A, immunostained Western blots of detergent extracts, either treated (+) or not treated (-) with the glycopeptidase PNGase F, of Xenopus oocytes expressing native SGLT1, the nonglycosylated mutant N248Q, or the N-glycosylation mutant NNSS13 (NNSS sequence after Asp2). B, schematic representation of anticipated peptides resulting from in vitro translation of mRNA synthesized from native SGLT1 cDNA templates cleaved with one of three indicated restriction enzymes. C, autoradiogram of SDS-PAGE of peptides synthesized in the presence of [35S]methionine in vitro from progressively truncated mRNAs of native SGLT1 cDNA. mRNAs were synthesized from native SGLT1 cDNA templates cleaved with BamHI, BsmI, or MscI (lanes 1-3, respectively). Microsomes were pelleted after protein translation and directly applied to the gel (lanes a), or were washed at high pH to remove secreted or adsorbed translation products. Lanes b and c represent pelleted membrane sheets, or soluble protein in the alkaline supernatant, respectively.
Previous experiments demonstrated that only the first (Asn-248) of two N-glycosylation consensus sequences found in native SGLT1 is glycosylated (13), indicating their extra- and intracellular dispositions, respectively. To extend these findings, insertion sites in human SGLT1 for introduction of N-glycosylation consensuses (NX(T/S)) were selected between putative membrane spans (9) and two other hydrophobic domains (Fig. 1, H1 and H2).
NNSS mutants were expressed in Xenopus oocytes. Glycosylation status was assessed by mobility comparisons on immunostained Western blots, of peptidyl N-glycanase F-treated and nontreated oocyte detergent extracts. Despite measures to maximize consensus recognition, only NNSS3 and NNSS7 were glycosylated (Fig. 3A; NNSS7 not shown). Glycosylation of NNSS7 verified this domain's putative extracellular location, whereas NNSS3 glycosylation contradicted this consensus' putative intracellular disposition. Glycosylation profiles of NNSS mutants expressed in vitro in reticulocyte lysate with pancreatic microsomes (not shown) paralleled those obtained from oocytes. Presumably, steric factors such as membrane proximity prevented consensus recognition and glycosylation of most NNSS consensuses presented to the endoplasmic reticulum lumen.
Figure 3: Immunostained Western blots of detergent extracts of Xenopus oocytes expressing N-glycosylation scanning mutants. A, oocytes expressing native SGLT1, the nonglycosylated mutant N248Q, or one of four NNSS N-glycosylation scanning mutants. NNSS mutants lacked the wild-type N-glycosylation site. The extracts were either treated (+) or not treated (-) with the glycopeptidase PNGase F. Arrows denote N-glycosylated species. B, oocytes expressing 12 NG N-glycosylation scanning mutants. NG mutants lacked the wild-type N-glycosylation site and were derived from NNSS mutants by ligation of the natively glycosylated 42-amino acid external loop of SGLT1 into the XhoI site at the NNSS insertion, increasing the mutant's Mr by 5.5 kDa above that of N248Q. N-I indicates noninjected oocytes.
Immunoblots of oocyte-expressed NG mutants, ± PNGase F treatment, are shown in Fig. 3B. Results were as follows: (i) the domains glycosylated in the NNSS3 and NNSS7 mutants were also glycosylated in their NG mutant counterparts (NG3 and NG7); (ii) NG2 was also glycosylated, however, suggesting no membrane span between NG2 and NG3 (but see below); (iii) NG5 was glycosylated, and NG4 was not, indicating that the intervening hydrophobic region H1 forms a transmembrane span, consistent with concurrent extracellularity of the NG3 and wild-type glycosylation sites; (iv) NG11 was glycosylated and NG10 was not, indicating the intervening hydrophobic region H2 forms a newly identified membrane span, as predicted in one model (8); (v) NG12 was not glycosylated, indicating the large ionized C-terminal region is cytoplasmic; and (vi) the remaining three putatively external hydrophilic regions, topologically mandated by the sum of all previous evidence to reside extracellularly, were each glycosylated (NG5, NG9, and NG11). The 48-amino acid insertion of the NG mutants was thus, as intended, sterically recognizable for N-glycosylation when presented to the endoplasmic reticulum lumen.
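The inferences in items (ii) to (iv) rest on a parity argument: two probed loops on the same side of the membrane are separated by an even number of spans (possibly zero), and loops on opposite sides by an odd number. A minimal sketch of that reasoning follows; the function name is hypothetical, and the reading of a non-glycosylated site as cytoplasmic is an assumption that can fail when a luminal site escapes glycosylation.

```python
def spans_between(glyc_a: bool, glyc_b: bool) -> str:
    """Parity of membrane crossings between two probed loops.

    A glycosylated site is taken to face the ER lumen (extracellular side).
    A non-glycosylated site is taken to be cytoplasmic, which is the weaker
    inference because a luminal site may also go unglycosylated.
    """
    return "even" if glyc_a == glyc_b else "odd"


# Glycosylation calls for the NG series as reported in the text.
ng = {"NG2": True, "NG3": True, "NG4": False, "NG5": True,
      "NG10": False, "NG11": True}

print(spans_between(ng["NG4"], ng["NG5"]))    # odd: H1 crosses the membrane
print(spans_between(ng["NG10"], ng["NG11"]))  # odd: H2 crosses the membrane
print(spans_between(ng["NG2"], ng["NG3"]))    # even: no span implied
```

Parity alone cannot distinguish zero spans from two, nor detect an insertion that has itself perturbed topology, which is why the NG2 result required the truncation experiments.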
Mutant NG1 was predominantly nonglycosylated, but unexpectedly showed an N-glycosylated component (Fig. 3B), implying that the N terminus resides extracellularly, inconsistent with the glycosylation of NG2. The topogenic orientation imposed on the N terminus and adjacent hydrophobic signal-anchor of integral membrane proteins is evidently determined by the distribution of positive charge near the signal anchor (30, 31). Insertion of the large NG sequence (three positive, three negative charges) in mutants NG1 and NG2 near the signal-anchor MS1 possibly perturbed the initial translocation/orientation event.
N-glycosylations of NNSS13, NG2, and NG3 were topologically inconsistent with two membrane spans flanking NG2, suggesting that the 48-amino acid insertion in NG2 had perturbed the membrane topology, or that spans MS1 and MS2 were actually extracellular domains. Therefore, three truncated mRNA species were synthesized (Fig. 4B), from SGLT1 templates cut with MscI, BsmI, or BamHI, yielding peptides of 82, 103, and 141 amino acids, respectively, upon translation in vitro (Fig. 4C). Microsomes of translation reactions were pelleted and resuspended, and aliquots were washed with pH 11.5 carbonate to open microsomes and remove secreted and adsorbed translation products (20). All three truncated peptides remained associated with the pelleted membrane fraction after alkaline washing, although a fraction of the smallest peptide (MscI), bearing two putative membrane spans, appeared in the supernatant. Significantly, the next truncation (BsmI) was fully retained in the membrane sheets, although no additional putative membrane spans were translated (cf. Fig. 4B). We conclude that the domain probed by NG2 resides intracellularly in native SGLT1, and that the 48-amino acid insertion in NG2 had perturbed translocation of the first transmembrane span, inverting local topology.
Fig. 5B schematizes derivation of a corrected secondary structure model from that in Fig. 1.
Figure 5: Summary and alignment of results of N-glycosylation mutant scanning of SGLT1, predictions of transmembrane helices by the neural network PredictProtein (25) and the program MEMSAT (26), and the results of a hydrophobicity/reverse-turn analysis of the transmembrane helix ends. The N-glycosylation status, after expression in Xenopus oocytes, of the mutants NNSS1-NNSS12 and NG1-NG12 is indicated in the paired boxes, with the NG status appearing above the NNSS status. Mutants that were N-glycosylated are indicated by a +. The two natively occurring N-glycosylation consensus sequences and their natural N-glycosylation status are indicated by the two triangles. Membrane spans predicted by the PredictProtein neural network are indicated by the hatched bars. The spans indicated by the upper hatched bars were predicted from the sequence alignment S6 of one isoform each of the six eukaryotic SGLT1 homologues, including human SGLT1, pSGLT2 (39), the rabbit Na+/nucleoside cotransporter (40), the human Na+/myoinositol cotransporter (41), and two sequences of unknown function, RK-D (42) and ST1 (43). The spans indicated by the lower hatched bars were predicted from a second sequence alignment S6B5, which includes the six eukaryotic homologues and five bacterial homologues, the Na+/proline cotransporter (38), the Na+/pantothenate cotransporter (44), and three hypothetical proteins known from genomic sequencing, with SwissProt numbers P32705 and P31448 and GenPept number X86084. MEMSAT predictions are shown for SGLT1 and the Na+/myoinositol cotransporter SMIT1 (upper and lower gradient-filled bars, respectively) using the following parameters: minloop, 2 amino acids; minhelix, 19 amino acids; maxhelix, 31 amino acids; and minhelixscore = -100. MEMSAT predictions of helix orientations are indicated by the gray-to-white (intracellular-to-extracellular) shading. Black bars show predictions of SGLT1 transmembrane helix ends (methodologies of White and Jacobs (27), applied as outlined in Fig. 6).
Figure 6: IFH and RT propensity analysis of human SGLT1. A, the IFH averages of a scanning 19-amino acid window are shown for three assumptions of side chain-side chain satisfaction of hydrogen bonds ranging from none (h = 0) to complete (h = 1). Helix centers of the transmembrane spans (MS1-MS13) correspond to the amino acid residue at the peak in a broad hydrophobic peak and are indicated by filled cross-marks 19 amino acids wide. Intracellular polar interhelix regions are indicated by a star. B, the RT propensity averages of a scanning window 19 amino acids wide. Helix centers of the transmembrane spans correspond to the amino acid of the minimum value in a broad RT minimum, and are indicated by unfilled cross-marks 19 amino acids wide. C, the IFH averages of a scanning trapezoidal window 5 amino acids wide are shown for h = 0, 0.5, and 1. The 5-amino acid window averages were iteratively scan-averaged using a 3-amino acid window, thereby trapezoidally weighting the central amino acid values of the resulting 7-amino acid window to smooth the curve. The window's nominal width at half-height remains 5 amino acids (see Ref. 27). Helix centers from A above are superimposed. Helix end zones are indicated by the gray boxes and were determined from the h = 0.5 curve. They are defined to lie between the edgemost secondary peak on a broad hydrophobic peak and the adjacent secondary peak. End zones that overlap completely are indicated by black boxes. D, RT averages of a scanning trapezoidal window 5 amino acids wide are shown. Superimposed are the helix centers (cross-marks) localized in A and B above, and the helix end zones localized in C. The RT peak falling within each end zone was used to estimate the helix end, and is marked by a short vertical line.
Ten of 13 NNSS mutants were devoid of cotransport activity. Only NNSS13, NNSS1, and NNSS11 retained activity, at levels of 100, 13, and 7% that of native SGLT1, respectively. It is interesting that a 4-amino acid insertion (2 amino acids in NNSS12) usually eliminated function. In contrast, four of four cystic fibrosis transmembrane conductance regulator scanning mutants (11) bearing small insertions were fully functional in mammalian cells, and 10 of 15 glucose carrier Glut 1 mutants (12) bearing a 41-amino acid insertion retained 2.5-30% deoxyglucose transport activity in oocytes. SGLT1 sensitivity to even small insertions likely reflects complex structural constraints imposed by the requirements of transport-coupling three solutes.
Application of MEMSAT individually to six eukaryotic SGLT1 homologues (Fig. 5A) gave the highest scores to 13-, 14-, and 15-span topologies. The 13- and 15-span models were ruled out because they predicted an intracellular N terminus. Gradient-filled bars of Fig. 5A indicate the 14 transmembrane helices and orientations predicted by MEMSAT for human SGLT1 and, positioned below, those for the eukaryotic homologue of least identity, SMIT1. Predicted and experimental transmembrane orientations accord.
Experimental results and computer predictions agreed, except for MS2, which PredictProtein failed to predict (Fig. 5A). The two computer algorithms provided complementary information. The PredictProtein neural network utilized the evolutionary information content of multiple sequence alignments without assumptions of helix length, hydrophobicity, etc. MEMSAT's five-state scoring basis permitted predictions of transmembrane helical orientations.
The SGLT1 transmembrane helix centers were estimated from a plot of 19-amino acid-wide sliding-window averages of the amino acid interfacial hydrophobicities, IFH(h), of Jacobs and White (28), where h represents the extent (range of 0-1) to which side chains capable of forming hydrogen bonds do so with one another rather than with water (IFH plots resemble other hydrophobicity plots). Helix centers correspond to the peak amino acid within each broad hydrophobic peak. The IFH plot for h = 0.5 was used, because 50% of side chain potential H-bonds in the hydrophobic interiors of globular proteins are satisfied by side-chain/side-chain interactions (27, 32). Fig. 6A shows human SGLT1 hydrophobicity plots of IFH(h) for h = 0, 0.5, and 1. Helix centers are marked by solid cross-marks 19 amino acids wide. (Broad IFH peaks corresponding to MS4-MS6 were more clearly indicated using an 11-amino acid window.)
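The sliding-window estimate of a helix center can be sketched as follows. This is an illustration under stated assumptions rather than the analysis performed here: the Kyte-Doolittle hydropathy scale is swapped in for the IFH(h) values of Jacobs and White, which are not reproduced in this text, and the function names and toy sequence are hypothetical. The sketch also returns only the single highest peak, whereas the analysis above takes the peak within each broad hydrophobic region.

```python
# Kyte-Doolittle hydropathy values, used only as a stand-in for IFH(h).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
    "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
    "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_averages(seq, width=19, scale=KD):
    """Mean scale value of every full-length window along the sequence."""
    vals = [scale[aa] for aa in seq]
    return [sum(vals[i:i + width]) / width
            for i in range(len(vals) - width + 1)]


def helix_center(seq, width=19, scale=KD):
    """1-based residue at the middle of the highest-scoring window."""
    avgs = window_averages(seq, width, scale)
    best = max(range(len(avgs)), key=avgs.__getitem__)
    return best + width // 2 + 1


# Toy sequence (hypothetical): 19 leucines flanked by charged residues.
toy = "DKDEKRDKDE" + "L" * 19 + "DKDEKRDKDE"
print(helix_center(toy))  # middle of the leucine stretch
```

Passing `width=11` reproduces the narrower window mentioned for MS4-MS6; a reverse-turn propensity table could be passed as `scale` with `max` replaced by `min` to locate RT valleys.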
Fig. 6B shows a plot of average RT propensity using a 19-amino acid window and the RT preferences from Table V of Levitt (29). The plot inversely mirrors that of IFH hydrophobicity: broad valleys correspond to membrane spans, and the amino acid at each valley's lowest point indicates the helix center (hollow cross-marks, Fig. 6B). The correspondence between helix centers identified from the IFH(0.5) and RT plots (Table 1) was good, differing by 8 amino acids or less (average 3.5 amino acids).
Span MS2 is very unusual. Its helix center was not apparent, since MS2 lacks a broad IFH peak or RT valley. The other eukaryotic SGLT1 homologues also lack clear IFH and RT indications of MS2. The presence of MS2 is, however, apparent in IFH and RT plots of the bacterial Na+/pantothenate cotransporter (SwissProt P16256) and hypothetical 62-kDa protein (SwissProt P31448) (not shown). The IFH maxima and RT minima of MS2 for both bacterial homologues all occur at the same amino acid position in aligned sequences. This position aligns with Asn-78 of SGLT1. The MS2 helix center of SGLT1, calculated as the midpoint between the helix ends (see below), corresponds to Ala-76, 2 amino acids upstream of Asn-78. This near coincidence of apparent helix centers restores some credibility to the MS2 helix end determination described below.
Helix "end zones" were next determined for each helix center indicated by the IFH(0.5) plot. Helix centers from IFH and RT plots were useful for determining the helix end zones of closely spaced helices. Averages of IFH(h) were replotted (Fig. 6C) using a centrally weighted trapezoidal window of 5 amino acids (27). Transmembrane helices appear as broad hydrophobic peaks with smaller secondary peaks superimposed on, and flanking, them. In the photosynthetic reaction center (PSRC), each helix end fell between an edgemost secondary peak of the main IFH(0.5) peak and an adjacent secondary flanking peak. SGLT1 helix "end zones" are indicated by the gray boxes in Fig. 6C and are listed in Table 1.
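The centrally weighted trapezoidal window can be made explicit. Averaging over 5 residues and then averaging those averages over 3 residues is equivalent to a single 7-residue window whose weights are the convolution of the two flat windows. The sketch below (helper and variable names are illustrative) computes those weights and checks that the width at half-height is 5 residues, as stated in the legend to Fig. 6.

```python
def convolve(a, b):
    """Full discrete convolution of two weight lists."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


box5 = [1 / 5] * 5  # flat 5-residue average
box3 = [1 / 3] * 3  # flat 3-residue average of the 5-residue averages
weights = convolve(box5, box3)

# Relative weights across the resulting 7-residue window.
print([round(w * 15) for w in weights])

# Number of positions weighted at least half as much as the central residue.
half_height_width = sum(w >= max(weights) / 2 for w in weights)
print(half_height_width)
```

The relative weights come out as 1, 2, 3, 3, 3, 2, 1 (in units of 1/15), so the three central residues dominate while the outermost two contribute only slightly.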
Each experimentally determined PSRC helix end was reliably indicated by an amino acid cluster with high RT propensity falling within the helix end zone. Fig. 6D shows a plot of the averaged RT propensities determined with a centrally weighted trapezoidal 5-amino acid window. Helix centers and end zones previously determined are superimposed above the RT plot in Fig. 6D. An RT maximum falls within each end zone or just outside it. In PSRC, the RT maxima fall on average one amino acid before N ends and three amino acids after C ends. SGLT1 end zones that did not overlap an RT maximum were nonetheless associated with one nearby that, after the +1 and -3 rules were applied, assigned the helix end to a point within the end zone (Table 1). Two exceptions were the helix C ends of MS3 and MS12, which lacked associated RT maxima; these ends were instead defined by the helix end zone edge closest to a nearly appropriate RT maximum (27). Helix midpoints defined by two ends were calculated (Table 1), comparing well with the helix centers indicated by the extrema of the IFH and RT plots.
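The +1 and -3 rules amount to simple offsets applied to the RT maxima associated with each end zone. A minimal sketch follows; the function name and the residue numbers in the example are hypothetical and are not taken from Table 1.

```python
def helix_ends(rt_max_n: int, rt_max_c: int) -> tuple[int, int, float]:
    """Apply the +1 / -3 offsets to the RT maxima flanking one helix.

    rt_max_n, rt_max_c: residue numbers of the RT maxima associated with
    the N and C end zones. Returns (N end, C end, helix midpoint).
    """
    n_end = rt_max_n + 1  # the maximum lies about 1 residue before the N end
    c_end = rt_max_c - 3  # the maximum lies about 3 residues after the C end
    return n_end, c_end, (n_end + c_end) / 2


# Hypothetical RT maxima at residues 100 and 125.
print(helix_ends(100, 125))
```

The midpoint returned here is the quantity compared against the helix centers read from the 19-amino acid IFH and RT plots.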
Clearly, the SGLT1 helix end predictions do not obviate the need for further physical experiments to refine our understanding of SGLT1 secondary structure, nor should overemphasis be placed on their presumed accuracy. Rather, the model that the helix ends predict can be used wisely to generate testable hypotheses of structure and function.
The IFH(h) plots of the PSRC L subunit reflect the topological orientation of the extra- and intracellular interhelical domains; the valleys of the IFH(h) plots representing the extracellular domains were characteristically shallower and more h sensitive(27) . This was interpreted to indicate that the interhelical domains that must cross the membrane during topogenesis could decrease the unfavorable free energy change during translocation by (i) replacing water H-bonded to side chains with inter-side chain H-bonds (h sensitivity) and (ii) by being less polar overall than cytoplasmic domains. The IFH(h) plots of SGLT1 show a general trend for the intracellular domains, indicated by asterisks in Fig. 6A, to correspond to the deepest valleys, but exceptions are apparent. No consistent difference was apparent between extra- and intracellular domain h-sensitivity. This disparity in IFH(h) behavior between SGLT1 and the PSRC L subunit may reflect different mechanisms of membrane protein topogenesis utilized by eukaryotes and bacteria (33) .
We note that the helix end zones of MS7 were difficult to call. The mid to C-terminal portion of MS7 is extremely h sensitive, more so than any other span. At h = 1, a prominent IFH secondary peak appears, presenting as only a shallow shoulder at h = 0.5 (Fig. 6C). If in fact these side chain H-bonding functional groups are co-satisfying in vivo, the predicted C end zone shifts N-terminally, and a more N-terminally located RT peak defines the helix C end. Conservation of adequate helix length then forces the predicted N end to be defined by the more N-terminally located of the two RT peaks that fall within the N end zone. The helix consequently shifts upstream by 6 and 12 amino acids at the N and C ends, respectively. This may indicate two quasi-stable states of MS7 that could correlate with a movement of the MS7 helix normal to the membrane accompanying Na+/glucose cotransport.
The N end of helix MS7 is adjacent to the highest and widest peak of reverse-turn propensity of any membrane span. This, and the high polarity of the h-sensitive C-terminal portion may explain why the PredictProtein neural network assigned only modest transmembrane helical probability to 10-11 amino acids near the helix center. The MS7 transmembrane span is clearly atypical. The C-terminal portion of the MS7 helix contains the most highly conserved sequence of amino acids (WYWCXDQVIVQR) found in 13 eukaryotic family members. Residues within and flanking MS7 have recently been shown to account for five of the 16 severest SGLT1 missense mutations, found in patients diagnosed with glucose/galactose malabsorption (34) .
The N end zone of MS11 was similarly problematic. Secondary IFH peaks defining the N end zone, and thus the bracketed RT peak defining the N end itself, were selected from two alternatives such that the calculated helix midpoint best corresponded with helix centers indicated by IFH and RT extrema (Table 1).
The revised secondary structure model of human SGLT1 incorporates topological information from all the foregoing analyses (Fig. 7). Two intrahelix salt bridges may be present, one in MS4 (Lys-157:Asp-161) and one in MS12 (Arg-499:Glu-503) (one helical turn separation per pair). Intrahelix salt bridges would decrease the unfavorable free energy change upon translocation into the membrane (35). The SGLT1 mutants R499H (34) and K157A display impaired transport in oocytes. An interhelix salt bridge is suggested between Asp-294 in MS7 and Lys-321 in MS8, consistent with the short interhelical domain (Fig. 7).
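Candidate intrahelix salt bridges of the kind proposed for MS4 and MS12 pair oppositely charged residues four positions apart, roughly one helical turn. The sketch below scans a sequence for such i, i+4 pairs. It is an illustration only: the function name is hypothetical, the helix fragment is invented (numbered so that its lysine falls at position 157) rather than taken from SGLT1, and histidine is left out of the positive set as a simplifying assumption.

```python
POSITIVE = set("KR")
NEGATIVE = set("DE")


def i_plus_4_pairs(seq: str, first_residue: int = 1) -> list[tuple[str, str]]:
    """List oppositely charged residue pairs spaced i, i+4 along a helix."""
    pairs = []
    for i in range(len(seq) - 4):
        a, b = seq[i], seq[i + 4]
        if (a in POSITIVE and b in NEGATIVE) or (a in NEGATIVE and b in POSITIVE):
            pairs.append((f"{a}{first_residue + i}",
                          f"{b}{first_residue + i + 4}"))
    return pairs


# Hypothetical helix fragment, numbered from residue 155.
print(i_plus_4_pairs("LLKAALDLL", first_residue=155))
```

Spacing alone identifies only candidates; whether a given pair forms a salt bridge depends on side chain geometry and environment, which a sequence scan cannot assess.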
Figure 7: Secondary structure model of human SGLT1 determined by N-glycosylation mutant scanning and incorporating predictions of the transmembrane helix ends. The native N-glycan appears between MS5 and MS6. The probable SGLT1 helix ends as determined in Fig. 6 and tabulated in Table 1 have been incorporated into the model. The hydrocarbon interior is shown 30 Å thick, equivalent in length to a 20-amino acid α-helix, and the interfacial layer 9 Å thick (28).
MS10 and MS11 show near-mirror image profiles in the 19- and 5-amino acid-window IFH plots (Fig. 6, A and C). The extracellularly-oriented halves of their
helices are similarly quite polar and h sensitive, suggesting
close apposition in the bilayer and coordinated movement during
cotransport, consistent with the short 10-amino acid interhelical
domain. Most of the 14 spans show high h dependence, and of these most also show a high α-hydrophobic moment (not shown), implying contact with transported solutes and water.
The 31-amino acid region immediately upstream of MS3 deserves special mention. It resides partly extracellularly and comprises most of MS2. Nearly 50% of the residues are either Gly or Ala, the two smallest amino acid species; inclusion of Ser residues accounts for >60%. This intriguing pattern persists in a multiple sequence alignment of 13 known eukaryotic species (not shown). Enrichment in small amino acids of high RT propensity partly explains the lack of an IFH peak or RT valley typical of transmembrane helices, and PredictProtein's failure to identify MS2. The small amino acid size (low hydrophobic effect) and low polarity of the MS2 region could enable MS2 "slippage" transverse to the membrane with minimal free energy change. The upstream adjacent FFLAGRS sequence appears to be the domain best conserved between eukaryotic and bacterial homologues. Is the association of the two least well defined atypical transmembrane helices, MS2 and MS7, with the two most well conserved sequence motifs a coincidence?
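The compositional statement above is a simple residue count over a window. A minimal sketch follows; the function name is hypothetical, and the 31-residue toy sequence is invented to have roughly similar proportions of small residues. It is not the SGLT1 sequence.

```python
def fraction_of(seq: str, residues: str) -> float:
    """Fraction of positions in seq occupied by any of the given residues."""
    return sum(aa in residues for aa in seq) / len(seq)


# Toy 31-residue region (hypothetical): 15 Gly/Ala, 4 Ser, 12 others.
toy_region = "GA" * 7 + "G" + "SSSS" + "LVIFTLVIFTLV"

print(len(toy_region))
print(round(fraction_of(toy_region, "GA"), 2))   # Gly + Ala
print(round(fraction_of(toy_region, "GAS"), 2))  # Gly + Ala + Ser
```

Applied across the columns of a multiple sequence alignment, the same count would show whether the enrichment persists among homologues.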
Major revisions to the secondary structure model (Figs. 5B and 7) include (i) extracellular reassignment of the N terminus, (ii) two newly recognized membrane spans MS4 and MS12, (iii) antiparallel reorientation of helices MS1, MS2, MS3, MS13, and MS14, (iv) estimates of the helix ends, (v) contraction of several interhelical domains, particularly between MS5/MS6 (to 4 amino acids) and MS11/MS12 (to 3 amino acids), and (vi) cytoplasmic reassignment of the extremely polar region proximal to the C terminus, consistent with its immunohistochemical localization (36). Antibody studies place the C terminus of the E. coli Na+/proline transporter putP in the cytoplasm (37).
We infer from SGLT1 topology that the bacterial Na+/proline and Na+/pantothenate transporters incorporate a periplasmic N terminus and not 12 but 13 transmembrane spans; only one was assigned in the MS2/MS3 region (38). Eukaryotic homologues end abruptly with a string of hydrophobic amino acids, which two algorithms predict form a 14th transmembrane helix. Nascent strand adsorption to the endoplasmic reticulum interface would hydrophobically precipitate backbone H-bonding (α-helix formation; the interface satisfies 60% of the hydrophobic effect) followed by partition into the hydrocarbon layer (28); MS14's low α-hydrophobic moment obviates a surface helix. The C terminus may either fully span the membrane with an emergent C-terminal carboxylate, or reside within the hydrocarbon layer as a helical buoy with depth of insertion determined by interhelical polar interactions. We are currently investigating whether MS14 spans the bilayer. Comparison with bacterial homologues suggests the 14th span evolved by extension of the C terminus.
In overview, a perusal of simple hydropathy plots directed the strategic placement of N-glycosylation consensuses in SGLT1. Glycosylation profiles of mutants elucidated the membrane topology, which was evaluated computationally. An important caveat bearing upon scanning mutagenesis emerged from this work: insertions near the signal anchor perturbed N-terminal topology, which advanced computational methods failed to correct. Functional expression of one glycosylated mutant contributed indispensably to the establishment of N-terminal extracellularity, augmented by truncation experiments. Knowledge of correct SGLT1 membrane topology and the helix end predictions help focus future experimental designs directed at uncovering substrate binding sites, helix-helix bundling within the membrane, posttranslational modifications, and other structure/function relationships.