(Received for publication, February 21, 1995; and in revised form, June 19, 1995)
We report a novel outer membrane lipoprotein of Escherichia
coli. DNA sequencing between ampC and sugE at
the 94.5 min region of the E. coli chromosome revealed an open
reading frame specifying 177 amino acid residues. Primer extension
analysis demonstrated that the promoter is activated at the transition
between exponential and stationary growth phases under control of the rpoS sigma factor gene, and this was confirmed in vivo by monitoring expression of β-galactosidase activity from a lacZ translational fusion. The amino acid sequence exhibited
31% identity with human apolipoprotein D (apoD), which is a component
of plasma high density lipoprotein and belongs to the eukaryotic family
of lipocalins. The bacterial lipocalin (Blc) contained a short deletion
of 7 amino acid residues corresponding to a hydrophobic surface loop
that is thought to facilitate the physical interaction between apoD and
high density lipoprotein. However, Blc exhibited a typical prokaryotic
lipoprotein signal peptide at its amino terminus. Overexpression,
membrane fractionation, and metabolic labeling with
[³H]palmitate demonstrated that Blc is indeed a
globomycin-sensitive outer membrane lipoprotein. Blc represents the
first bacterial member of the family of lipocalins and may serve a
starvation response function in E. coli.
The lipocalin superfamily consists of widely distributed,
primarily extracellular, eukaryotic proteins that bind and transport
small hydrophobic ligands (1). The molecular structures of four
lipocalins (plasma retinol-binding protein (2), bilin-binding
protein (3), insecticyanin (4), and
β-lactoglobulin (5)) revealed a common structural motif that
consists of an eight-stranded antiparallel β-barrel, arranged as
two stacked orthogonal sheets, with a COOH-terminal α-helix.
Despite the common lipocalin fold, only 25-30% amino acid
sequence identity exists between lipocalins of known
structure (6). The cup-shaped three-dimensional structure of
the lipocalins, which forms a central hydrophobic binding pocket for
the ligand, is also characteristic of the fatty acid-binding proteins.
These represent another recently identified protein family that also
binds small hydrophobic molecules but, by contrast, contains a
10-stranded antiparallel β-barrel and is almost exclusively
intracellular. Because of their similarities of structure and function,
Flower and co-workers (1) have proposed the classification of
lipocalins and fatty acid-binding proteins into a larger structural
superfamily termed calycins.
Although lipocalins are generally
soluble proteins, apolipoprotein D (apoD) was originally
identified as a component of the plasma high density lipoprotein (HDL)
particle, leading to the suggestion that apoD may transport a component
of the lecithin-cholesterol acyltransferase reaction (7).
Whereas the classical apolipoproteins are embedded in the
lipoprotein surface by extended amphipathic α-helical structures,
homology modeling of apoD against the atomic coordinates of
bilin-binding protein suggested that apoD associates with the HDL
particle by a hydrophobic surface loop. This modeling study also postulated that a
heme-related compound may be the preferred ligand for apoD (8).
However, apoD has also been identified as a progesterone- and
pregnenolone-binding protein isolated from breast fluid, suggesting a
role in the transport of steroid hormones in human mammary
tissue (9). In the cyst fluid of women with gross cystic
disease of the breast, apoD can exceed the concentration found in
plasma by about 1000-fold (10), and apoD induction by both
retinoic acid (11) and interleukin-1α (12) has
been demonstrated in human breast cancer cells, suggesting that apoD
may be a marker of hormonal alterations. Additionally, apoD accumulates
in regenerating and remyelinating peripheral nerve, suggesting a role
in lipid transport within extravascular
compartments (13, 14). Like other members of the
lipocalin superfamily, apoD appears to be able to transport a variety
of ligands in a number of different contexts.
Despite the presence
of lipocalins in a wide range of eukaryotic organisms, no lipocalin has
ever been identified in bacteria (15). Additionally, the
apolipoprotein components of plasma lipoproteins are unrelated to
bacterial lipoproteins, which are anchored to membranes by a
lipid-modified amino-terminal cysteine residue (16). In this
report, we describe an outer membrane lipoprotein of Escherichia
coli, which is clearly homologous to apoD. This protein, which we
term Blc (bacterial lipocalin), is encoded by the blc gene at
94.5 min on the E. coli chromosome, immediately downstream of
the ampC β-lactamase operon. The blc promoter is
expressed at the onset of stationary growth phase under control of the rpoS sigma factor gene, which directs expression of genes
necessary for adaptation to starvation conditions. Blc is the first
lipocalin identified in a bacterial species and may provide an
evolutionary link between bacterial and plasma lipoproteins.
Figure 1: Nucleotide sequence of the blc gene from E. coli CS520. The sequence of the blc gene from pAmpAC extends over 660 base pairs beginning with the final four codons of the ampC gene and its rho-independent terminator (underlined), followed by the -35 and -10 regions of the blc promoter, the blc transcriptional start (+1) and ribosomal binding (rbs) sites, through the 177 amino acid residues of the Blc open reading frame (bold type), and ending with the final 9 codons of the convergent sugE gene. The PstI, KpnI, and SmaI restriction enzyme sites are underlined, and the cysteine residue at the lipoprotein cleavage site is outlined.
A single open reading frame was revealed,
which specified 177 amino acid residues (19,853 Da) and exhibited a
consensus prokaryotic lipoprotein cleavage site (16) predicting
a mature protein of 159 residues (18,043 Da). The initiating methionine
codon was separated from the rho-independent terminator of the ampC gene by 60 base pairs, within which -35 and -10 hexamers
corresponding weakly to an E. coli σ⁷⁰ promoter could be
distinguished. A reasonable ribosome binding sequence was appropriately
positioned upstream of the initiating methionine codon. The open
reading frame converged upon sugE such that the two genes
shared overlapping translational termination codons; no rho-independent
terminator structure could be discerned between them. A chromosomal
deletion in E. coli MI1443 encompasses the blc locus;
this strain is capable of growth under both aerobic (34) and
anaerobic (35) conditions, indicating that the blc gene is dispensable in E. coli.
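The prediction of a 159-residue mature protein from a 177-residue precursor implies an 18-residue signal peptide removed immediately before the lipid-modified cysteine. The following is a minimal sketch of how such a cleavage site could be located, assuming a simplified lipobox pattern ([LVI][ASTVI][GAS]C) that is an illustrative approximation and not necessarily the consensus applied in this study (16). The function name, the search window, and the toy precursor are hypothetical; the toy sequence is not the Blc sequence.

```python
import re

# Simplified lipobox pattern. This is an illustrative assumption, not the
# exact consensus of reference 16.
LIPOBOX = re.compile(r"[LVI][ASTVI][GAS]C")

# Signal peptide length implied by the reported precursor and mature sizes.
IMPLIED_SIGNAL_LENGTH = 177 - 159


def predict_lipoprotein_cleavage(precursor, search_window=40):
    """Return (signal_peptide, mature_protein), or None if no lipobox is found.

    Cleavage is placed immediately before the lipobox cysteine, so the
    mature protein begins with the cysteine that becomes lipid-modified.
    Only the first `search_window` residues are searched (hypothetical limit).
    """
    match = LIPOBOX.search(precursor[:search_window])
    if match is None:
        return None
    cys_index = match.end() - 1  # 0-based index of the conserved cysteine
    return precursor[:cys_index], precursor[cys_index:]


# Hypothetical 30-residue precursor for illustration only (NOT Blc).
toy_precursor = "MKKLLAVAALSLLLAGCSSPTPPKGVTVVN"
result = predict_lipoprotein_cleavage(toy_precursor)
```

A pattern match of this kind only nominates a candidate site; experimental evidence such as globomycin sensitivity and palmitate labeling is still needed to establish that a protein is a lipoprotein.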
While this manuscript was in preparation, the DNA sequence of the E. coli MG1655 chromosomal region from 92.8-0.1 min was submitted from the E. coli genome project under GenBank accession number U14003. The sequence from E. coli MG1655 confirms that determined by us for E. coli CS520, except for a C to G transversion at nucleotide 498; this is a silent mutation in the third position of glycine codon 132. We also determined the sequence of the blc homolog from Citrobacter freundii OS60; this DNA sequence, together with neighboring loci, will be described elsewhere. Pertinent to this study was the identification of an open reading frame specifying 177 amino acid residues and displaying 90% amino acid sequence identity with E. coli Blc.
Figure 2: Transcriptional mapping and rpoS control of the blc promoter. The deduced sequence of the blc promoter region derived from the sequencing lanes labeled G, A, T, and C is shown vertically on the left. The primer extension
reactions were performed with total cellular RNA isolated from E.
coli strains MC4100 (rpoS⁺) and RH90
(MC4100 rpoS359::Tn10). RNA samples were isolated
at mid-exponential (1), late exponential (2), early
stationary (3), and stationary (4) growth phases,
which were reached after 200, 270, 340, and 540 min of growth,
respectively. The position of the cDNA corresponding to the blc transcription start site (+1) is marked by an arrowhead.
E. coli MC4100 was
recently reported to be distinct from many other common strains of E. coli by virtue of having a functional allele of rpoS, which controls a program of gene expression induced
under starvation conditions and at the onset of stationary
phase (36, 37). To determine whether the
accumulation of the blc transcript in stationary phase
depended on rpoS, we also performed primer extension
analysis using E. coli RH90 (MC4100 rpoS359::Tn10). The blc transcript was not
detected in E. coli RH90 in any of the observed growth phases (Fig. 2), indicating that the blc gene belongs to the
stationary phase regulon controlled by rpoS, and suggesting
that the Blc protein may serve a function that contributes to the
adaptation of cells to starvation conditions. Although there is no
clearly defined consensus sequence for rpoS-dependent
promoters, they are generally similar in structure to those controlled
by σ⁷⁰ (38, 39).
Figure 3: Overexpression and membrane localization of the Blc protein. 12% SDS-PAGE analysis of E. coli MC4100 transformed with either pBlcEH, expressing blc (+), or pBlcHE, unable to express blc (-). Cells were induced with IPTG and subjected to French pressure lysis. The lysates (L) were divided into soluble (S) and membrane (M) fractions by ultracentrifugation. The positions of bands present during Blc expression are marked by arrowheads. Each lane corresponds to 40 µg of protein stained with Coomassie Blue dye. The indicated molecular mass standards are expressed in kDa.
Figure 4: Membrane fractionation of precursor and mature species of Blc. 15% SDS-PAGE analysis of E. coli MC4100 transformed with either pBlcEH, expressing blc (+), or pBlcHE, unable to express blc (-). Cells were induced with IPTG and subjected to French pressure lysis. Membranes were isolated by ultracentrifugation and separated into light (L), medium (M), and heavy (H) fractions by sucrose density gradient centrifugation. The pBlcEH fractions shown at right are aligned to show the precursor and mature forms of Blc. The positions of bands present during Blc expression are marked by arrowheads. Each lane corresponds to 40 µg of protein stained with Coomassie Blue dye. The indicated molecular mass standards are expressed in kDa.
Figure 5: Metabolic labeling of Blc with [³H]palmitate. 15% SDS-PAGE analysis of E. coli MC4100 transformed with either pBlcEH, expressing blc (+), or pBlcHE, unable to express blc (-). Cells were induced with IPTG in the presence of [³H]palmitate, and globomycin was added when indicated (+). Proteins were prepared by boiling the cells in SDS, and samples corresponding to 800,000 cpm were resolved by 15% SDS-PAGE and visualized by fluorography. The positions of bands present during Blc expression are marked by arrowheads. The indicated molecular mass standards are expressed in kDa.
Figure 6: Homologies between apolipoprotein D and bacterial lipocalins. The bacterial lipocalins are aligned with three apoD species; see Table 2 for the source of each sequence. Residues are shaded when at least 5 of the 6 aligned residues are identical. The positions of cysteine residues forming the two disulfide bonds in the apoD species are underlined in bold type and one pair distinguished by italics. Additional cysteine residues are shown in bold type and include the positions of the lipoprotein cleavage site near the amino terminus of the Vlp and Blc species; the amino-terminal glutamine residues of the mature apoD species are also shown in bold type. The homologies were initially identified using the TFASTA algorithm (Genetics Computer Group) and optimized manually by comparison with the secondary structural elements of human apoD, which are shown under the alignment as β-strands (b) and α-helix (h).
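The shading rule used in this alignment (at least 5 of 6 aligned residues identical) and a pairwise percent identity can be expressed compactly. The sketch below is illustrative only: the function names are hypothetical, the toy alignment is not the Figure 6 data, and the choice to compute identity over columns in which both sequences carry a residue is an assumption, since the denominator convention is not stated here.

```python
from collections import Counter


def shaded_columns(aligned, min_identical=5):
    """Return indices of columns where at least `min_identical` sequences
    carry the same residue. Gap characters ('-') never count as identical."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must have equal length")
    shaded = []
    for i in range(length):
        counts = Counter(seq[i] for seq in aligned if seq[i] != "-")
        if counts and counts.most_common(1)[0][1] >= min_identical:
            shaded.append(i)
    return shaded


def percent_identity(a, b):
    """Percent identity over columns where both sequences have a residue.
    This denominator convention is an assumption made for the sketch."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)


# Hypothetical toy alignment of six sequences (NOT the Figure 6 alignment).
toy = [
    "MKW-LG",
    "MKWALG",
    "MRWALG",
    "MKW-LA",
    "MKFALG",
    "MKWSLG",
]
columns = shaded_columns(toy)
identity = percent_identity(toy[0], toy[2])
```

Different denominator conventions (alignment length, shorter sequence length, or gap-free columns) can shift a reported identity by several percentage points, which matters when values fall in the 25-30% range typical of lipocalins.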
We have identified a novel bacterial lipoprotein (Blc), which
exhibits homology with a eukaryotic lipocalin (apoD). Blc is optimally
expressed in stationary phase and is under the control of the RpoS
σ-factor global regulator. In the natural environment, bacteria
spend the majority of their existence in the stationary phase and the
expression of genes during stationary phase is of considerable
interest. Bacteria have developed sophisticated mechanisms to survive
starvation for prolonged periods of time, and it has been proposed that
proteins synthesized at the late stages of growth are important for the
survival of the organism (41, 42). It is now apparent
that several E. coli lipoproteins are expressed under
stationary phase conditions. A starvation-inducible lipoprotein (Slp)
was recently shown to be expressed in stationary phase cultures
independently of rpoS (43), whereas stationary phase
expression of the OsmB lipoprotein was shown to require rpoS (44). The rpoS-dependent expression of
the blc gene suggests that Blc, like the other stationary
phase lipoproteins, may serve an important starvation response function
in E. coli.
A Blc-like protein has been found in E. coli, C. freundii, and V. cholerae and it presumably exists in other Enterobacteriaceae. Although an E. coli strain deleted for the blc gene grows normally in the laboratory, blc may offer some advantage in the pathogenic or natural environment. Based on the hydrophobic ligand binding capacity of apoD, it seems likely that Blc also functions to bind a hydrophobic ligand. The Blc mRNA is very poorly expressed, suggesting that the normal level of Blc in the outer membrane is low and that Blc is not simply sequestering hydrophobic ligands. Most outer membrane lipoproteins are oriented toward the periplasm and we presume that Blc serves to capture its hydrophobic ligand within the periplasm, although we have not eliminated the possibility that the protein serves as an external receptor molecule. Studies are in progress to characterize the orientation of Blc.
The lipocalin proteins are
composed of a common structural motif that consists of an
eight-stranded antiparallel β-barrel, arranged as two stacked
orthogonal sheets, with a COOH-terminal α-helix. A large proportion
of lipocalin residues are exposed to the hydrophobic interior and
reside in the central ligand binding cavity. It has been suggested that
different lipocalins change these residues to accommodate different
ligand specificities, thus explaining the rather low levels of primary
structural homology despite the common lipocalin fold (6). In
this context, the 26-33% amino acid sequence identity that exists
between the three species of apoD and the three bacterial lipocalins
must be regarded as highly significant (Table 2 and Fig. 6). Molecular modeling of apoD against the atomic
coordinates of bilin-binding protein (8), which together share
only 27% amino acid sequence identity, predicted a disulfide bonding
pattern in apoD that has since been confirmed
biochemically (45). Additionally, the apoD model led to the
identification of a hydrophobic surface loop between β-strands 7
and 8, which was postulated to mediate the physical interaction between
apoD and HDL (Fig. 7). It has since been shown that a lone
cysteine residue located immediately adjacent to the hydrophobic loop
in human apoD forms an intermolecular disulfide bond with the
HDL-associated apolipoprotein AII, providing strong support for the
proposed interaction (45). Interestingly, the aforementioned
cysteine residue is absent in the rodent apoD species (Fig. 6),
which is consistent with the observation that rat apoD is not found to
a significant extent on plasma lipoproteins (13).
Figure 7: Molecular model of human apolipoprotein D.
Molecular model of human apoD emphasizing the eight antiparallel
β-strands, which form two orthogonal sheets, followed by a
COOH-terminal α-helix. The hydrophobic surface loop that separates
β-strands 7 and 8 is marked with an arrowhead. This
MOLSCRIPT diagram was generated by Natalie Strynadka using the
coordinates of human apoD obtained from the Brookhaven Protein Data
Bank under accession number 2APD.
A 7-amino
acid deletion in the bacterial lipocalins appears to have eliminated
the corresponding hydrophobic surface loop found in the apoD species (Fig. 6). Additionally, a glycine and a proline residue are
suitably positioned at the leading edge of the deleted loop to
accommodate a β-hairpin turn. A lone cysteine preceding this
putative turn structure is evident in the E. coli and C.
freundii Blc proteins, but not in the V. cholerae Blc,
potentially implicating a species-specific disulfide between Blc and
itself or another component of the bacterial cell envelope. Two
intramolecular disulfide bonds in the apoD species are formed by four
additional cysteine residues, which are absent in the three bacterial
lipocalins. The only other cysteine residue in the Blc proteins is
located at the lipoprotein processing site near the amino terminus. The in vivo localization of E. coli Blc in the outer
membrane is consistent with the presence of a serine residue at the
+2 position of the mature protein (46).
As Blc represents the first lipocalin to be characterized from a bacterium, it raises important questions regarding the origin of this burgeoning class of proteins. Blc is unique among bacterial lipoproteins in that it is a lipocalin, whereas apoD is unique among lipocalins in that it is a plasma lipoprotein. Did the primordial lipocalin originate in bacteria, where it served a function in the Gram-negative cell envelope before it was acquired by eukaryotes and adapted to a number of functions in multicellular organisms, or did the lipocalins originate firstly in eukaryotes from where they were adapted, at least in the case of apoD, to function in the Gram-negative cell envelope? What should be clear from this study is that the function of apoD either originates from or has been adapted toward a basic function in the cell envelope of E. coli. In this regard, it is remarkable that Blc and apoD are both found anchored in asymmetric bilayers; the inner leaflet of the Gram-negative outer membrane and the plasma exposed surface of mammalian lipoproteins are two rare examples of non-phospholipid bilayer membranes in biology. Perhaps the still undefined physiological functions of Blc and apoD may prove to be more closely related than otherwise anticipated. The rpoS-dependent activation of the blc gene at the onset of stationary phase suggests that the Blc protein may somehow serve in adaptation to starvation conditions. We hope that further analysis of Blc structure and function will be relevant to apoD and its pathology.
The nucleotide sequences reported in this paper have been submitted to the GenBank/EMBL Data Bank with accession numbers U21726 and U21727.