(Received for publication, December 8, 1994; and in revised form, March 28, 1995)
The cornified cell envelope (CE) is a 15-nm thick layer of
insoluble protein deposited on the intracellular side of the cell
membrane of terminally differentiated stratified squamous epithelia.
The CE is thought to consist of a complex amalgam of proteins
cross-linked by isodipeptide bonds formed by the action of
transglutaminases, but little is known about how or in which order the
several putative proteins are cross-linked together. In this paper, CEs
purified from human foreskin epidermis were digested in two steps by
proteinase K, which released as soluble peptides about 30% and then
another 35% of CE protein mass, corresponding to approximately the
outer third (cytoplasmic surface) and middle third, respectively.
Following fractionation, 145 unique peptides containing two or more
sequences cross-linked by isodipeptide bond(s) were sequenced. Based on
these data, most (94% molar mass) of the outer third of CE structure
consists of intra- and interchain cross-linked loricrin, admixed with
SPR1 and SPR2 proteins as bridging cross-links between loricrin.
Likewise, the middle third of CE structure consists largely of
cross-linked loricrin and SPR proteins, but is mixed with the novel
protein elafin which also forms cross-bridges between loricrin. In
addition, cross-links involving loricrin and keratins 1, 2e, and 10 or
filaggrin were recovered in both levels. The data establish for the
first time that these several proteins are indeed cross-linked protein
components of the CE structure. In addition, the data support a model
for the intermediate to final stages of CE assembly: the proteins
elafin, SPR1 and SPR2, and loricrin begin to be deposited on a
preformed scaffold; later, elafin deposition decreases as loricrin and
SPR accumulation continues to effect final assembly. The recovery of
cross-links involving keratins further suggests that the subjacent
cytoplasmic keratin intermediate filament-filaggrin network is anchored
to the developing CE during these events.
During the process of terminal differentiation in stratified
squamous epithelia such as the epidermis, a 15-nm thick layer of
protein is deposited on the intra-cellular surface of the cell
periphery. This cornified cell envelope (CE)
However, the design of experiments to prove that these are indeed CE
structural proteins has proven difficult, because the isodipeptide
cross-link itself cannot be hydrolyzed to release the intact proteins
without use of reagents that also cleave peptide bonds. Likewise, few
data are available on the order of assembly of these proteins into the
CE, how they are cross-linked together, and which residues in the
proteins are utilized in cross-linking. Nevertheless, several
approaches have been explored. Indirect mathematical modelling has
provided important clues on the abundance of the proteins assembled in
epidermal CEs(17). This was based on least-squares fitting
methods using the known amino acid compositions (from deduced cDNA
sequences) of most of the identified proteins listed above, and the
amino acid compositions of purified CEs. In this way, it was estimated
that loricrin is the major component (66%), together with smaller
amounts of cysteine-rich protein (14%), filaggrin (10%), SPRs (5%),
involucrin, and cystatin α
A third approach was
predicated on data which showed that low specificity proteases can
digest isolated CEs to their constituent amino acids and the
isodipeptide cross-link within 2-4 days(12, 32, 33, 34). In a recent
study(35), the time course of release of protein from
epidermal CEs during the first 36 h of trypsin and proteinase K
digestion was followed by amino acid analyses, and these data were then
subjected to mathematical modelling to estimate the protein contents
remaining in the insoluble CE remnants(17) . These experiments
were also followed by immunogold electron microscopy of the remnants
using a series of monospecific antibodies(35) . An initial
digestion with trypsin removed primarily keratin and filaggrin epitopes
from one side of the CEs. During 24 h of digestion with proteinase K,
primarily only loricrin, SPR, and the novel protein elafin were
removed, based on both amino acid composition and immunogold criteria.
The remnant was enriched in elafin, cystatin α
In this way, a
three-stage model for CE structure was suggested which is an
elaboration of the existing multi-stage
hypotheses(1, 2, 3, 4, 5, 9, 17) ,
in which: the outer (cytoplasmic surface) third of the CE consists
mostly of loricrin/SPRs/filaggrin; the middle third consists of
elafin/loricrin/SPRs; and the innermost third adjacent or attached to
the lipid envelope consists of involucrin and cystatin α
However, the most rigorous evidence
for the involvement of a protein in CE structure, that circumvents the
concerns of degraded, loosely-associated, or contaminating solubilized
proteins, would be to demonstrate identifiable protein sequences
directly adjoined by isodipeptide cross-links in isolated CEs, as has
been done in preliminary experiments for loricrin(12) . In this
study, the limited proteolytic digestion paradigm has been explored
further. A large number of peptides was harvested from CEs during the
first 9 h of digestion with proteinase K, roughly corresponding to the
outer two-thirds of CE structure, which were then subjected to amino
acid microsequencing. In this way, we show for the first time that
several CE proteins are indeed cross-linked components of the human
epidermal CE. The data also provide robust support for the proposed
modified model for CE structure.
During the course of the work, many minor peptide peaks
were identified which potentially contained useful sequence
information. To recover these in yields sufficient for sequencing (but
not amino acid analysis), the fractions between major collected peaks
from several HPLC runs were pooled and rerun.
Assignment of protein sequences to the released PTH-derivatives from
each cycle was based on the known protein sequences of several human
CE proteins: elafin(24, 25),
filaggrin(40), keratins 1(41), 2e(42), and
10(43), loricrin(12), and
SPRs(8, 10). In other cases, sequences were
identified by computer searches of the Swiss Protein data base.
CEs were
initially digested with trypsin which released about 6% of CE protein
mass. Such peptides were too short for useful sequencing (<8
residues), but based on amino acid analyses, most were likely to be
filaggrin and perhaps some keratin. The virtual absence of Lys
suggested that they contained little or no cross-linked protein
material. Indeed, direct measurement of the total amount of
isodipeptide cross-link in the starting CE preparation revealed that
this fraction contained 0.5 nmol of cross-link/mg of protein,
corresponding to
Figure 1: Fractionation of proteinase K peptides by HPLC. A, 3 h; B, next 6-h digestion time. All
major and minor peaks were collected for analysis and microsequencing.
The acetonitrile gradient is shown.
Of the 197 peptides containing cross-links, the positions of
the Gln and Lys residues involved in the one or more isodipeptide
cross-links could be solved in 187, giving a total of 145 unique
peptides. In 10 other multi-branched peptides involving only loricrin
sequences, there was no unique solution to the assignment of which Gln
and Lys residues were cross-linked to each other in the several likely
cross-links. Of the 145 unique peptides containing one or more
cross-links, 144 contained at least one recognizable loricrin sequence,
of which 98 involved loricrin-loricrin cross-links only, and 46
involved loricrin cross-linked with another protein(s). One peptide
contained cross-links adjoining three non-loricrin sequences (see
below). The sequences of many peptides are presented in Tables I-IV.
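The peptide counts quoted in this paragraph can be cross-checked with a short calculation. The following is only an illustrative sketch: the function name `tally_unique_peptides` is hypothetical, and the inputs are the numbers stated above.

```python
def tally_unique_peptides(loricrin_only: int,
                          loricrin_with_other: int,
                          non_loricrin: int) -> dict:
    """Sum the categories of unique cross-linked peptides."""
    with_loricrin = loricrin_only + loricrin_with_other
    return {
        "with_loricrin": with_loricrin,
        "unique_total": with_loricrin + non_loricrin,
    }

# Counts quoted in the text.
counts = tally_unique_peptides(loricrin_only=98,
                               loricrin_with_other=46,
                               non_loricrin=1)

# Cross-linked peptides whose Gln/Lys pairing was solved or left unsolved.
solved, unsolved = 187, 10
all_crosslinked = solved + unsolved

print(counts, all_crosslinked)
```

The categories add up to the 144 loricrin-containing peptides and the 145 unique peptides reported, and the solved and unsolved peptides add up to the 197 cross-linked peptides.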
Figure 2: Molar utilization of Lys (upper panel) and Gln (middle panel) residues in in vivo cross-links involving loricrin. The data were compiled from all
286 separate loricrin sequences in the peptides sequenced in this
study. The arrows designate those Lys and Gln residues that
were not utilized in the present body of data. The lower panel displays a predicted surface probability structure of
loricrin.
The cumulative data of Fig. 2 show that a total of 75 nmol of lysines and 78 nmol of
glutamines in loricrin are used per mg of the outer (cytoplasmic)
two-thirds of CE protein, of which 81-94% (weighted mass average
86%) or 33 nmol is loricrin (molecular mass 26 kDa(12) ). This
means that from all of the loricrin-containing peptides sequenced here,
there were an average of
The total molar amount of
keratins and filaggrin cross-linked to the isolated CEs corresponds to
The idea that
elafin functions as a cross-bridging protein is further supported by
the recovery of two other peptides (Table 4) (yields 0.1 and
0.08%, respectively) in which elafin spans between keratin 1 (at
Lys
By direct amino acid sequencing of the many peptides recovered
in this way, we present for the first time direct evidence for the
eight proteins elafin, filaggrin, keratins 1, 2e, and 10 of KIF, SPR1,
SPR2, and desmoplakin, in addition to loricrin, as isodipeptide
cross-linked components of the human epidermal CE. Furthermore,
notable differences in the sequences of the peptides released in the
two proteinase K digestion intervals provide important information
about likely CE composition and structure.
The second
6-h proteinase K digestion interval released peptides that probably
originate from the central third of CE structure. While loricrin was
again the quantitatively major component (about 81%), and SPRs
represented 5%, this time interval was notable for the appearance of
many peptides involving the novel protein elafin or its precursor
(about 11%). As for the SPRs, it appears that elafin functions as a
cross-bridging protein among loricrin. Thus the present data support an
earlier hypothesis for the role of elafin in skin cross-linking
reactions(48) . These data provide robust support for our
new model on especially the latter stages of CE assembly (Fig. 3). Unknown early initiation stages of CE assembly
presumably involve deposition of involucrin, cystatin
Figure 3: Model of structure for the outer two-thirds of the human foreskin epidermal CE. The outermost
level (cytoplasmic surface) consists almost entirely of loricrin (large
circles) enmeshed by SPRs (horizontal ovoids). Below the
surface, significant amounts of elafin (vertical ovoids) are
present. Keratin intermediate filaments bound together by filaggrin (small circles) are associated at these levels. These proteins
are deposited over a scaffold of unknown structure likely to consist of
involucrin, cystatin
One implication of this model for the latter
stages of CE assembly and structure is that the admixture of SPRs and
elafin with loricrin would likely alter the flexibility characteristics
of the loricrin array. This might constitute a novel method for
altering the physical properties of the cornified layers of the
epidermis. Indeed, elafin, SPR1, SPR2, and especially SPR3 are induced
in response to epidermal injury (such as by UV light or drugs) or in
hyperproliferative disorders (such as
psoriasis)(7, 8, 21, 22, 23) ,
but it remains to be seen whether the CEs formed in these cases contain
increased levels of the proteins. Further studies on the structure and
expression of each of these proteins will be necessary to test this
hypothesis. The cross-links involving keratin and filaggrin are also
of interest. The total molar amounts recovered are <0.5% of CE
protein mass, which corresponds to <0.05% of the estimated total
amount of keratin/filaggrin in the cornified cell. It is possible that
these have arisen as a result of random cross-linking by
transglutaminases, and do not represent a physiologically important
event in CE assembly; that is, they may have arisen as predicted by the
"dustbin" hypothesis (3, 53). However, we
consider this unlikely for two reasons. First, there was remarkable
specificity in the lysines used in the keratin chains: of the 26 or 27
lysines in the keratin 1 or 2e chains, only the Lys
The implications of the
present identification of a single peptide involving desmoplakin
sequences as a potential CE protein constituent are not yet clear.
Because the sequence to which desmoplakin was cross-linked involved the
commonly used elafin Gln
plays a critical role in barrier function of the
tissue and for the organism(1, 2, 3). In the
case of the epidermal CE, several distinct proteins have now been
identified and characterized as potential CE components, including:
involucrin(4, 5), cystatin α(6, 7), several small proline-rich proteins
(SPR1, SPR2, in epidermis as well as SPR3 in cultured
keratinocytes)(7, 8, 9, 10) ,
loricrin(11, 12, 13, 14) , and
possibly trichohyalin(15) ,
filaggrin(16, 17) , keratin intermediate filaments
(KIF)(17, 18, 19) , and a putative
cysteine-rich protein (20) that may be
elafin(20, 21, 22, 23, 24, 25) .
These proteins are thought to be cross-linked together to assemble the
CE by way of disulfide bonds(12), as well as the Nε-(γ-glutamyl)lysine isodipeptide
cross-link, formed by the action of one or more of the three known
epidermal transglutaminases(1, 2, 3, 4, 5, 9, 12, 17, 26).
(2-5% each), but no detectable
KIF. At the time of these calculations, the exact amino acid
composition of cysteine-rich protein was not known. However, assuming
that the well-characterized protein elafin of known
sequence(24, 25) is in fact cysteine-rich protein,
then the data (of (17) ) can be recalculated to: loricrin 70%,
filaggrin 8%, elafin 6%, SPRs and cystatin α 5% each, and
involucrin and KIF about 2% each. This large amount of loricrin is
consistent with the abundance of its mRNA in the
epidermis(11, 12) . Furthermore, transgenic
experiments (14) have suggested that loricrin is the last or
one of the last components added during CE assembly by addition to a
pre-existing scaffold of such proteins as involucrin and cystatin α.
Another approach to address these questions has been to attack
isolated CEs with chemicals such as CNBr or solvents to release protein
species for characterization by Western blotting or other biochemical
techniques. This has been used to show that involucrin is a component
of the CE of cultured epidermal keratinocytes(27) . This method
also released proteins termed pancornulins that are probably the same
as the SPRs(28, 29, 30) . Likewise,
immunological data with existing protein-specific antibodies have
identified fragments of keratins and filaggrin in isolated CE
preparations(28, 31). Clearly, each of these studies
suggests association with but does not necessarily prove attachment of
these proteins to purified CEs. This is an important problem because of
the difficulty in preparing purified CEs from either cultured
keratinocytes or intact epidermis free of soluble cytoplasmic
proteins(12, 17, 27) , or proteins
solubilized during the isolation procedures.
, and involucrin, of
which involucrin epitopes were still abundantly available on one
side(35) . The appearance and disappearance of gold particles
on only one side of the CE remnants provides strong support for the
idea that the proteases were removing CE structural proteins from the
cytoplasmic surface(36) . Finally, after 36 h of proteinase K
digestion, cystatin α and involucrin comprised most of the remnant,
but in this case, involucrin epitopes were visible on both
sides(35) , consistent with the idea that it may be associated
directly to the lipid envelope(37) .
and
perhaps other as yet unknown proteins. Also KIF may be buried
throughout the CE (35) .
Preparation and Purification of Human Epidermal CEs
CEs were prepared from the stratum corneum of human
foreskins. The epidermis was separated by heat treatment by standard
procedures and then extracted in a buffer of 8 M urea
containing 50 mM Tris-HCl (pH 7.6) and 1 mM EDTA. In
the absence of reducing agent, this buffer dissolves only the inner
living layers and perhaps the transition layer of the epidermis,
leaving the stratum corneum intact (32, 38) . The
extract was then filtered through nylon gauze (39) (mesh size
0.1 mm) to collect the stratum corneum sheets. CEs were then prepared
from this material by exhaustive boiling and sonication in 2% SDS, 20
mM dithiothreitol, 0.1 M Tris-HCl (pH 8.0), and 5
mM EDTA as described
previously(12, 17, 35) . Following
purification by pelleting at 5,000 × g through a 20%
Ficoll solution in the same buffer, the CE fragments were washed three
times in phosphate-buffered saline to remove the bulk of the SDS and
reducing reagent. Amino acid analysis of acid-hydrolyzed samples was
used to confirm the purity of the CEs(16) .
Isolation and Quantitation of the Isodipeptide Cross-link
Aliquots of CE preparations were subjected to total
enzymic digestion over a period of 4 days exactly as described
previously(12, 32, 33, 34) . Control
reactions contained only enzymes. The products were resolved by amino
acid analysis to measure the amount of the Nε-(γ-glutamyl)lysine cross-link (Accurate
Biochemical Corp.), which elutes at 29.0 min on a Beckman 6300
analyzer, immediately after methionine. The molar content of the
cross-link was then calculated from the total amount of CE protein as
determined by acid hydrolysis (in vacuo at 106 °C for
22-48 h). In the combined batches of foreskin epidermal CEs used
in this work this was 89 nmol/mg CE protein.
Proteolytic Digestion of CEs and Fractionation of Peptides
CEs were resuspended (1 mg/ml) in a buffer of 50 mM Tris-HCl (pH 8.0) and 5 mM CaCl2 and first
digested with stirring with trypsin (Sigma, sequencing grade, 1% by
weight, at 37 °C) for 6 h(35) . Following removal of the
solubilized material by centrifugation at 14,000 × g for
10 min, and washing in buffer, the CE remnants were resuspended in the
same buffer and digested with proteinase K (Promega, 3% by weight, at
37 °C) for 3 h. The CE remnant was pelleted, washed, and redigested
for a 6-h time interval. This stepwise digestion procedure was repeated
for a total of five 6-h intervals (33 h digestion in total). The amount
of soluble peptide material released into the supernatant of each time
interval was quantitated by amino acid analysis. Aliquots were resolved
by HPLC using a reverse-phase Ultrasphere ODS column (4.5 × 250
mm) with a gradient of 0-100% acetonitrile containing 0.08%
trifluoroacetic acid. Conditions of digestion with proteinase K were
optimized to release the maximal amount of protein from the CEs
while still yielding cross-linked peptides long enough
to give unambiguous sequence information about the cross-linked
protein(s). The size of peptides was determined empirically from their
times of elution from the HPLC column. In general, peptides which
eluted with <25% acetonitrile (<50 min) were <8 residues long
and few contained cross-links, while those which eluted >30%
acetonitrile contained cross-links and were longer than 10 residues. In
this way, it was found that digestion with 3% proteinase K during the
first two time intervals (3 h followed by 6 h) generated peptides that
provided useful sequence information. Each collected peptide was
neutralized with trimethylamine and concentrated to about 10
pmol/µl.
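The empirical elution rule described above can be expressed as a small classifier. This is a sketch under stated assumptions: the function name `classify_by_elution` is hypothetical, and the "intermediate" label for the 25-30% acetonitrile window is an assumption, since the text makes no statement about peptides eluting in that range.

```python
def classify_by_elution(percent_acetonitrile: float) -> str:
    """Rough peptide class from the acetonitrile percentage at elution.

    Thresholds follow the empirical observations quoted in the text:
    <25% -> shorter than 8 residues, few cross-links;
    >30% -> longer than 10 residues, cross-links present.
    """
    if percent_acetonitrile < 25.0:
        return "short (<8 residues), few cross-links"
    if percent_acetonitrile > 30.0:
        return "long (>10 residues), cross-linked"
    # Assumption: the text does not characterize this window.
    return "intermediate (not characterized in the text)"

# Hypothetical elution positions, for illustration only.
for pct in (12.0, 27.5, 42.0):
    print(f"{pct:5.1f}% acetonitrile: {classify_by_elution(pct)}")
```

Such a rule only prioritizes fractions for sequencing; the presence of a cross-link was established from amino acid composition and sequencing, not from elution position.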
Amino Acid Sequencing
Amino acid compositions of
aliquots of resolved peaks were measured when quantities permitted to
identify those peptide species that contained at least 1 Glx and 1 Lys
residue, which were thus candidates for containing an isodipeptide
cross-link. Candidate peptides (0.5-200 pmol) were then sequenced
for 5-15 Edman degradation cycles in an LF3000 gas-phase sequencer
(Porton) following the manufacturer's specifications. In most
cases, the peptides were first covalently attached to a polyvinylidene
difluoride solid support (Sequelon-AA, Millipore). The released PTH
residues of each cycle were resolved and quantitated by on-line HPLC to
elucidate the sequences (Beckman Instruments, using System Gold
software). The diPTH-isodipeptide species eluted near PTH-Leu. The
PTH-derivative of cysteine is unstable and could not be ascertained
directly, but was inferred from the appearance of PTH-dehydroalanine.
Computer Analyses
Secondary protein structure
predictions were done using a suite of algorithms compiled by the
University of Wisconsin Genetics Computer Group (44, 45) and the IBI Pustell sequence software version
4.0 (International Biotechnologies Inc).
Release and Fractionation of Peptide Material from CEs
The purpose of the present study was to more rigorously test
a recent three-stage elaboration of existing models for the assembly
and structure of the human epidermal CE(35) . Proteinase K
digestion procedures were modified so as to recover peptides suitable
for microsequencing, which thus could yield unambiguous information
about cross-linked protein constituents and CE structure.
0.6% of the total. Because keratin and filaggrin
are the major proteins of the epidermal tissue, it seems possible that
this trypsin digestion is simply removing contaminating solubilized
epidermal proteins(35) . Subsequent release of useful
cross-linked peptide material was done by titration of the time of
proteinase K digestion, followed by generation of an HPLC profile, and
measurement of the amino acid compositions of the peptide products
recovered. In this way, it was found that digestion for a 3-h time
interval released 30% of CE protein mass as peptides that were up to 50
residues long (Fig. 1A). A subsequent 6-h digestion
released another 35% of CE protein mass and these peptides were up to
40 residues long (Fig. 1B). Likewise, measurement of
the total amounts of isodipeptide cross-link revealed that the 3- and
then 6-h digests released 35 nmol/mg total CE protein (39%) and 36.4
nmol/mg (41%), respectively. Further 6-h digestion intervals for a
total of 33 h released an additional 24% of CE protein mass, and 14
nmol of cross-links/mg of CEs, but these peptides were <10 residues
long and were too short for unambiguous sequencing (data not shown).
Together with previous immunogold analyses of the fragments recovered
in these times(35) , these data mean that the peptides released
in the first 3 h arose from roughly the outer (cytoplasmic) third of
the CE, and the second 6-h digestion released material from the middle
third.
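The mass and cross-link yields of the successive digestion steps can be tabulated to check that they are mutually consistent. The sketch below is illustrative: the variable and function names are hypothetical, the trypsin values (6% of mass, 0.5 nmol of cross-link/mg) and the total cross-link content (89 nmol/mg) are taken from elsewhere in this paper, and no attempt is made to account for the unrecovered remainder.

```python
# Total isodipeptide cross-link content of the starting CEs (nmol/mg).
TOTAL_CROSSLINK_NMOL_PER_MG = 89.0

# (step, % of CE protein mass released, nmol cross-link released per mg CE)
steps = [
    ("trypsin, 6 h", 6.0, 0.5),
    ("proteinase K, first 3 h", 30.0, 35.0),
    ("proteinase K, next 6 h", 35.0, 36.4),
    ("proteinase K, further intervals to 33 h", 24.0, 14.0),
]

def crosslink_percent(nmol_per_mg: float,
                      total: float = TOTAL_CROSSLINK_NMOL_PER_MG) -> float:
    """Cross-link released in one step as a percentage of the total."""
    return 100.0 * nmol_per_mg / total

for name, mass_pct, xl_nmol in steps:
    print(f"{name}: {mass_pct:.0f}% of mass, "
          f"{crosslink_percent(xl_nmol):.1f}% of cross-links")

mass_released = sum(mass for _, mass, _ in steps)
crosslinks_released = sum(xl for _, _, xl in steps)
print(f"total: {mass_released:.0f}% of mass, "
      f"{crosslinks_released:.1f} nmol cross-link/mg")
```

On these figures the two informative proteinase K intervals account for about 39% and 41% of the cross-links, and all steps together release about 95% of the CE protein mass.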
Peptides from the Outer (Cytoplasmic) Third of CE Structure Are Mostly Loricrin and SPRs
From the 3-h digests, 133
peptides were identified, recovered, and sequenced from two similar
proteinase K digestion experiments, of which 91 contained one or more
cross-links. These cross-linked peptides varied in amounts from
0.5 to 850 pmol, for a total of 27.5 nmol, and contained 31.5 nmol
of cross-links or an average of 1.15 cross-links/mol. They accounted
for a total of 90% of isodipeptide cross-links in this fraction (31.5
of 35 nmol cross-link/mg of CEs). This means that about 10% of the
cross-links in the 3-h fraction were not found in the harvested and
sequenced peptides. Presumably this small amount was lost as short
peptides that eluted before 50 min on the HPLC column. However,
analyses of the 90% recovered are likely to provide relevant and useful
data on the composition and structure of this level of the CE. When
expressed as a percentage of the 27.5 nmol of total peptides, it was
found that 94% of the molar mass involved loricrin sequences (all 91
peptides), 3% were SPR1 (7 peptides), 2% were SPR2 (4 peptides), 1% was
elafin (1 peptide), and trace amounts were due to keratin 1 (0.05%) (4
peptides) and filaggrin (0.08%) (1 peptide) sequences.
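The recovery figures for the 3-h fraction follow directly from the molar amounts quoted above. A minimal sketch of the arithmetic, with hypothetical variable names:

```python
# Molar amounts quoted for the 3-h proteinase K fraction.
PEPTIDES_NMOL = 27.5                 # cross-linked peptides sequenced
CROSSLINKS_IN_PEPTIDES_NMOL = 31.5   # cross-links found in those peptides
CROSSLINKS_IN_FRACTION_NMOL = 35.0   # cross-links measured in the fraction

# Average number of cross-links per sequenced peptide.
crosslinks_per_peptide = CROSSLINKS_IN_PEPTIDES_NMOL / PEPTIDES_NMOL

# Share of the fraction's cross-links accounted for by sequenced peptides.
recovered_pct = 100.0 * CROSSLINKS_IN_PEPTIDES_NMOL / CROSSLINKS_IN_FRACTION_NMOL
unrecovered_pct = 100.0 - recovered_pct

print(f"{crosslinks_per_peptide:.2f} cross-links per peptide")
print(f"{recovered_pct:.0f}% recovered, {unrecovered_pct:.0f}% not recovered")
```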
Peptides from the Central Third of CE Involve Elafin as Well as Loricrin and SPRs
An additional 143 peptides (118 with cross-links) were characterized from
the subsequent 6-h digestion interval. Of these 118, 64 were identical
to those obtained in the 3-h digestion and 54 were unique. Most of the
64 common peptides involved loricrin-loricrin cross-linked species. In
this case, 33.2 nmol of isodipeptide cross-links (91% of the total
amount) could be accounted for in the harvested and sequenced peptides,
and the total molar mass of characterized peptides was 34.8 nmol
(average of 1.05 mol of cross-links/mol). Of the molar total of
peptides, 81% involved loricrin sequences (117 of 118, 24 new
peptides), 11% were elafin or preproelafin (17 new peptides), 3% were
SPR1 (4 new peptides), 2% were SPR2 (3 new peptides), 0.1% were total
keratins (4 new peptides), 0.1% (2 new peptides) were filaggrin, and a
single peptide involved desmoplakin I or II (0.1%). Thus, while most of
the peptides were still composed of loricrin at this digestion time
interval, a notable difference was the appearance of numerous
cross-linked peptides involving elafin.
Intra- and/or Intermolecular Loricrin Cross-links
The molar amounts of each of the 14 Gln and 7 Lys
residues of loricrin that were used in cross-links were calculated from
the yields of the 286 separate loricrin sequences identified in all of
the peptides solved here (Fig. 2). Of seven possible Lys
residues, the terminal Lys accounted for 57% of the molar
total, while two positions (residues 17 and 296) were not used at all.
In the case of Gln residues, 70% of the molar total was attributable to
residues 215 and 216, and residue position 158 was not used. These data
are consistent with the predicted secondary structure of loricrin (Fig. 2): the most used residues were located in sequences
likely to be exposed on the surface, while unused residues (residues
17, 158, and 296), and infrequently used residues are predicted to be
buried in or near the glycine loop motifs.
2.3 mol of isodipeptide cross-links/mol
of loricrin in the outer portions of the CE. However, the bulk of the
sequence data cannot specify whether the cross-links are intrachain,
interchain, or both, although 5 solved peptides (Table 1) and 10
unsolved peptides clearly involved interchain cross-links since the
same terminal residues were found multiple times.
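The figure of 2.3 cross-links per loricrin can be reproduced from the Lys and Gln usage reported with Fig. 2 (75 and 78 nmol per mg) and from the loricrin content (86% by mass, 26 kDa). The sketch below is one plausible reading of that calculation, not necessarily the authors' exact method: it assumes that each cross-link pairs one Lys with one Gln, so that the number of cross-links is approximated by the mean of the two residue totals. All function and variable names are hypothetical.

```python
# Values quoted in the text, per mg of the outer two-thirds of CE protein.
LYS_USED_NMOL = 75.0
GLN_USED_NMOL = 78.0
LORICRIN_MASS_FRACTION = 0.86        # weighted mass average
LORICRIN_MOLECULAR_MASS_DA = 26_000.0

def protein_nmol_per_mg(mass_fraction: float, molecular_mass_da: float) -> float:
    """nmol of a protein contained in 1 mg of total protein."""
    grams = mass_fraction * 1e-3            # share of 1 mg, in grams
    return grams / molecular_mass_da * 1e9  # mol -> nmol

def crosslinks_per_molecule(lys_nmol: float, gln_nmol: float,
                            protein_nmol: float) -> float:
    # Assumption: one Lys and one Gln per cross-link, so the mean of the
    # two residue totals approximates the number of cross-links.
    return (lys_nmol + gln_nmol) / 2.0 / protein_nmol

loricrin_nmol = protein_nmol_per_mg(LORICRIN_MASS_FRACTION,
                                    LORICRIN_MOLECULAR_MASS_DA)
ratio = crosslinks_per_molecule(LYS_USED_NMOL, GLN_USED_NMOL, loricrin_nmol)

print(f"loricrin: {loricrin_nmol:.0f} nmol/mg")
print(f"cross-links per loricrin: {ratio:.1f}")
```

Under this assumption the loricrin content comes to about 33 nmol/mg and the ratio to about 2.3, in agreement with the values given in the text.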
Cross-links Involving SPR1 and SPR2
A total of 19
peptides cross-linked with loricrin involved interchain cross-links
with the SPR1 and SPR2 proteins (molar amount about 5%) (Table 2), and were approximately equally distributed between the
two digestion time intervals. Since their initial discovery, these
proteins have been predicted to be CE precursors, on the basis of
sequence and immunological
data(7, 8, 9, 10) . This idea is
also supported by a recent immunogold decoration study(35) .
The present data confirm this hypothesis for the first time. In
addition to Gln- and Lys-rich terminal sequences that are homologous to
loricrin(10, 46) , the notable feature of their
sequences is the presence of a conserved octapeptide motif repeated 6
times in SPR1 and a nonapeptide motif repeated 3 times in SPR2. These
repeats also contain Gln and Lys residues (8, 9, 10) which have been postulated to
provide additional cross-linking
sites(9, 10, 46) . In the present data,
however, no peptides involved cross-links with the peptide repeat
sequences: 18 of 19 peptides used either the carboxyl-terminal or
penultimate carboxyl-terminal residue only; in the other peptide, SPR1
Lys was used. As for loricrin, both these sequences are
predicted to be highly exposed for reaction (data not shown). These
data strongly suggest that in fact the SPRs are serving as
cross-bridging proteins among and between the more numerous loricrin
molecules, in confirmation of an earlier prediction(9) .
Cross-links Involving Keratin and Filaggrin
A
quantitatively very minor portion of the total peptides recovered here
involved eight cross-links with keratin 1, 2e, or 10 chains (totalling
about 60 pmol or about 0.15% on a molar basis) (Table 3). The
lysine residue used in the keratin 1 and 2e chains is situated in the
V1 amino-terminal end domain sequences in a 20-residue window highly
conserved among the Type II keratins expressed in stratified squamous
epithelia(47) . The cross-link involving keratin 10 was located
close to its amino terminus. Three cross-links between loricrin and
filaggrin (0.1% molar yield) (Table 3) utilized a Gln-Gln
dipeptide sequence in filaggrin in a narrow conserved sequence window
in a protein of otherwise highly variable
sequence(40, 47) . In the case of both the keratin
chains and filaggrin, these data confirm for the first time their
direct involvement in CE structure.
0.1% each, which means that on a weight basis, <0.5% of the
total CE protein consists of these proteins directly cross-linked by
isodipeptide bonds. Based on mathematical modelling of amino acid
compositions, it was estimated (17, 35) that keratins
and filaggrin constitute about 3 and 7%, respectively, of the total
"purified" CE protein mass. These data mean that most of
the keratin and filaggrin present in the initial CE preparations was
not cross-linked, but presumably was retained as contaminating adherent
protein.
Cross-links Involving Elafin
An additional 18
peptides contained cross-links involving the novel protein elafin (Table 4). They were almost entirely recovered from the second
digestion time interval, corresponding roughly to the middle third of
CE structure (Table 4). Elafin is an abundant differentiation
protein product of the epidermis and
skin(20, 21, 22, 23, 24, 25, 48) ,
and the present data document for the first time its role in the
epidermis as an important CE component. The precursor of elafin,
preproelafin of 11 kDa, contains four conserved 12-residue peptide
repeats, each of which contains Lys and Gln residues. This motif is
similar to the cross-linking region of the seminal vesicle protein
substrate of the prostate TGase 4 enzyme (49) . Also, model
synthetic peptides of this sequence motif can participate in
transglutaminase cross-linking reactions in
vitro(25) . Mature elafin (6 kDa) constitutes the Cys-rich
carboxyl-terminal portion of the precursor, and is produced by cleavage
in the fourth repeat. In most of the peptides discovered here, the
elafin appears to serve as an intermediary or cross-bridging protein
between two other protein sequences, which in fact, is consistent with
its suggested role in cross-linking reactions(48) . Most of the
peptides utilized Gln and Lys residue sites located near the predicted
amino terminus of mature elafin, or at its carboxyl-terminal end.
Interestingly, three peptides were found to involve the precursor form,
using Lys or Gln residue sites in the last 12-residue repeat just prior
to the point of cleavage to generate mature elafin.
) and loricrin molecules, and between keratin 1 and
desmoplakin I/II (at Gln
). In the case of desmoplakin,
this is the last Gln residue, located near its carboxyl
terminus(50) . This is the first report of the linkage of
desmoplakin to the CE. Desmoplakin is a major structural protein
component of desmosomes of many types of cells, including epithelia. It
is a flexible rod >100 nm long which projects into the cytoplasm
such that its carboxyl-terminal domain may be available for direct (50, 51) or indirect (50) association with the
KIF cytoskeleton. The present finding would seem to support the idea of
an indirect association through elafin.
Definition of a CE Constituent Protein
The
functional definition we have used throughout this and earlier studies (12, 17, 35) for the involvement of proteins
in the CE is the documentation of an identifiable protein sequence
cross-linked by the transglutaminase-catalyzed isodipeptide bond. The
use of this rigorous definition is important for two reasons. First, it
has proven very difficult to recover CEs free from soluble cytoplasmic
proteins, or abundant epidermal proteins such as keratins and filaggrin
that are solubilized by the obligatory exhaustive extraction
procedures(12, 17, 27) . Indeed, we have
found that a simple trypsin digestion can remove 6-8% of purified
CE protein mass as short peptides. Based on direct cross-link data
adduced here (Table 3), most of this probably originates from
contaminating filaggrin and keratin. The second reason is an attempt to
confirm rigorously a number of reports in the literature concerning
likely or putative CE protein constituents. Many studies to date have
employed methods such as indirect immunofluorescence at the light
microscope level using antibodies to suspected CE proteins. A cell
peripheral staining pattern was taken as evidence for involvement of
the protein in the CE (3). However, the resolution of this
method is limited to the wavelength of light, that is, roughly 500 nm.
Because this is many times the width of the CE structure itself, there
remains some doubt as to whether the putative protein is in fact
involved in the CE structure, or located nearby on some other structure
instead. Labeling with immunogold-tagged antibodies has far greater
resolution, of course, but so far rigorously controlled studies have
been performed only for involucrin (5, 35),
loricrin (11, 12, 35, 36), and SPR2 (35).
Characterization of Peptide Material from CEs
A previous study from this laboratory which met the above rigorous
criterion was the first to document a protein cross-linked to the human
epidermal CE in vivo (12). Four peptides were
recovered by limited proteolysis that contained two loricrin sequences
adjoined by the isodipeptide cross-link. More recently, the usefulness of
this method for ascertaining CE protein composition and structure has been
extended (35). Limited progressive digestion with proteinase K
for 36 h could release up to 90% of the CE protein mass as soluble
peptides and thereby provided important clues about the possible order
of assembly of the CE constituent proteins. Based on amino acid
compositions of the released material, together with immunogold
labeling of CE fragments and remnants recovered during the
digestions (35), the protein was apparently removed from the
cytoplasmic side rather than the lipid envelope side. That is, the
progressive digestion protocol excavated "deeper" into the
CE structure. The purpose of the present study was to adapt the
digestion procedures so as to release peptides that were suitable for
amino acid microsequencing in order to (i) confirm for the first time
that certain putative proteins are indeed cross-linked components of
the CE; and (ii) provide rigorous information about CE structure and
assembly. Optimized digestion conditions involved three steps: an
initial trypsin digestion removed apparently contaminating
non-cross-linked proteins; a 3-h proteinase K digestion released the
outer (cytoplasmic) 30% of CE protein mass; and a second 6-h digestion
interval released another 35%, corresponding to a middle third of CE
mass.
A Model for the Structure of the Cytoplasmic (Outer) Two-thirds of the CE
Peptides released in the first 3-h
proteinase K digestion interval, corresponding to the outermost 30% of
CE structure, consisted almost exclusively of loricrin (94%) and SPRs
(5%). These data provide the strongest evidence to date that loricrin
is the major component of the human epidermal CE. While almost every
lysine and glutamine residue of loricrin was employed in the identified
cross-links, the majority of the cross-links recovered utilized
Gln, Gln, and Lys (Fig. 2, Table 1) in inter- and/or intrachain bonds.
The intrachain cross-links would necessarily fold the molecule into a
compact form. The predicted secondary surface structural features of
loricrin indicate there are four glycine loop domains, flanked or
interspersed by five Lys- or Gln-Lys-rich segments each of which
harbors multiple utilized cross-linking sites (12) (Fig. 2). Because the glycine loop motifs are
likely to be highly flexible in structure (12, 52),
the present data indicate that the bulk of the CE consists of a
flexible three-dimensional mesh-like array of compact cross-linked
loricrin molecules. In addition, this array is also interspersed by
smaller amounts of SPR1 and -2 (2-3% each). Interestingly, all of
the 19 peptides involving SPRs utilized Lys or Gln residues located on
or near their termini, rather than the multiple internal residues of
each protein (Table 2). This data set supports the notion (9) that these proteins function as intermediary cross-bridges
between and among the much larger amounts of loricrin.
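As a rough bookkeeping aid, the mass fractions quoted above can be tallied in a few lines of Python. This is only a sketch under stated assumptions: the function and variable names are illustrative and not part of the study, the 30% and 35% figures are treated as fractions of the same total CE protein mass, and the quoted 94% and 5% values are treated as simple shares of the first released fraction.

```python
# Percent of CE protein mass released in each proteinase K interval,
# as quoted in the text. The trypsin-sensitive material is treated here
# as contaminant and left out (an assumption of this sketch).
released = [
    ("proteinase K, first 3 h (outer third)", 30.0),
    ("proteinase K, next 6 h (middle third)", 35.0),
]

# Composition of the first (outer) fraction as quoted: loricrin 94%, SPRs 5%.
outer_composition = {"loricrin": 94.0, "SPRs": 5.0}


def cumulative_release(steps):
    """Return (label, running total in percent) after each digestion step."""
    total = 0.0
    running = []
    for label, pct in steps:
        total += pct
        running.append((label, total))
    return running


def share_of_total(fraction_pct, composition):
    """Convert within-fraction percentages into percent of total CE mass."""
    return {name: fraction_pct * pct / 100.0 for name, pct in composition.items()}


totals = cumulative_release(released)
outer_share = share_of_total(released[0][1], outer_composition)
remaining = 100.0 - totals[-1][1]  # material not released by these two steps

for label, running_total in totals:
    print(f"{label}: cumulative {running_total:.0f}% released")
print(f"not released by these steps: {remaining:.0f}%")
for name, pct in outer_share.items():
    print(f"{name} in outer fraction: about {pct:.1f}% of total CE mass")
```

On these assumptions the two intervals together account for about two-thirds of the CE protein mass, which matches the "outer two-thirds" wording used in the text.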
, and
possibly other proteins, corresponding to the innermost third of the
CE (3, 17, 35), and these are (or become)
attached to the lipid envelope (37). Our new data indicate that
a second major step involves addition to this initial scaffold of
elafin, SPRs, and loricrin, corresponding approximately to the middle
third of CE structure. Later, it appears that elafin deposition is
reduced, while that of the SPRs and loricrin continues in a final
stabilization event, corresponding to the outer third (cytoplasmic
surface) of the CE.
, and perhaps other as yet unknown proteins.
Together, this isodipeptide cross-linked proteinaceous component of the
CE is attached to the lipid envelope (37).
or Lys, respectively, were utilized (Table 3), which
are located in a highly conserved window of sequences near the amino
termini of the chains. This specificity would not be the anticipated
result of the dustbin hypothesis. Moreover, it has been shown recently
that a single point mutation affecting this lysine residue in keratin 1
(Lys → Ile) is the cause of one case of the skin
disease non-epidermolytic palmoplantar keratoderma (54).
Indeed, these disease data together with the present finding support
the alternative idea that this Lys residue has been precisely conserved
so that it may be available for transglutaminase cross-linking
reactions. Our data suggest that one such function may be linkage of
the KIF cytoskeleton to the CE by the isodipeptide bond. This may
provide a means for anchorage of the two structures and coordination of
cornified epidermal cell structure. Further experiments are in progress
to test this concept. Likewise, the three cross-links involving
filaggrin utilized Gln residues in a highly conserved window of
filaggrin sequences, suggesting specificity of transglutaminase
reaction. This conclusion is not inconsistent with data suggesting that
the bulk of filaggrin is loosely associated with and may be a
contaminant of the isolated CEs: some filaggrin in the form of the
filament-matrix component of the bulk of the epidermal cell mass is,
like the keratins, also attached to the CE.
/Lys residues and keratin 1 Lys residue (Table 4), it seems unlikely
that this is a dustbin peptide. However, it must be pointed out that
two other important CE proteins, involucrin and cystatin α, were
not seen in the present data set, despite the fact that they are
thought to be quantitatively major CE protein
components (1, 2, 3, 4, 5, 17, 35).
The expectation is that these two proteins occupy the inner portion of
the CE, perhaps attached directly to the lipid
envelope (17, 35, 37). It is possible that
desmoplakin (and other proteins) may also constitute an inner component
of CE structure. Further work is in progress to explore these issues.
Utilization of Lysine and Glutamine Residues in
Cross-links
While a large body of in vitro data has
been accumulated on the sequence specificity adjacent to lysines or
glutamines required for the TGase 2 and factor XIIIa enzymes to form
cross-links (55), little in vivo information is
available, in part because few natural cross-linked structures have
been explored to date, and in part because of the likelihood that the
different TGase family members may have different substrate
specificities. Only limited studies have been performed so far on the
TGase 1 (56) and TGase 3 (57) enzymes that operate in
the epidermis. The present body of cross-link data for several
substrates (loricrin, SPRs, elafin, and keratins) indicates a clear
preference for terminal sequence regions, and/or sequences that are
predicted to be exposed on the protein surface. Similar observations on
the in vitro cross-linking of eye lens -crystallins have
shown a clear preference for exposed terminal sequences (58). In vitro data using model peptide substrates have shown that
the first Gln residue of adjacent Gln-Gln or Gln-rich sequences flanked
by hydrophobic residues offers a favorable amine acceptor for the TGase 2
enzyme (59, 60). This is the case in the present in vivo loricrin cross-link data: Gln
of Gln-Gln-Val was used more often than either Gln of Gln-Gln-Lys, or Gln-rich
regions 3-10, 153-158, and 303-308 (Fig. 2).
As more data accumulate on in vivo cross-links of natural
substrates, better data on the properties and specificities of the
TGases should become possible.
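The sequence preference described above can be illustrated with a minimal Python sketch that locates the first Gln of each Gln-Gln or longer Gln run and reports whether the residue that follows is hydrophobic. The sequence used is an invented toy string, not loricrin. The function names and the choice of hydrophobic residues are assumptions made for illustration, and checking only the residue after the run is a simplification of the "flanked by hydrophobic residues" criterion.

```python
import re

# Assumption: a conventional set of hydrophobic residues (one-letter codes).
HYDROPHOBIC = set("AVLIMFWY")

# Invented toy sequence for illustration only; it is NOT a real protein.
TOY = "GSQQVKGGQQKSGGQQQLT"


def first_gln_of_runs(seq):
    """Yield (position, run_length, next_residue) for each run of two or
    more consecutive Gln (Q). Positions are 1-based and refer to the first
    Gln of the run; next_residue is '' if the run ends the sequence."""
    for match in re.finditer(r"Q{2,}", seq):
        nxt = seq[match.end()] if match.end() < len(seq) else ""
        yield match.start() + 1, match.end() - match.start(), nxt


def candidate_sites(seq):
    """Keep only the runs whose following residue is hydrophobic."""
    return [site for site in first_gln_of_runs(seq) if site[2] in HYDROPHOBIC]


for pos, length, nxt in first_gln_of_runs(TOY):
    flag = "hydrophobic" if nxt in HYDROPHOBIC else "not hydrophobic"
    print(f"Gln run of {length} starting at residue {pos}; next residue {nxt!r} ({flag})")
```

In this toy case the Gln-Gln-Val type run is retained and the Gln-Gln-Lys type run is not, mirroring the contrast drawn in the text. Such a scan says nothing about surface exposure or actual enzyme preference.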
Concluding Remarks
The present data have provided
evidence for the first time of the direct involvement of several
proteins in CE structure. In addition, these data afford rigorous
support for our modified model (35) of the structure of the
outer cytoplasmic two-thirds of the human epidermal CE, which
corresponds to the terminal reinforcement or maturation steps of CE
assembly (1, 2, 3, 5, 9, 17).
It remains for the future to design experiments to explore the earlier
initial stages of CE assembly, presumably involving deposition of
involucrin, cystatin α, and perhaps elafin, desmoplakin, KIF, and
other proteins. Further carefully controlled digestion procedures using
alternative enzymes may be able to "dig" deeper into the CE
and recover suitable peptide material from lower levels. An alternative
approach may be to use "immature" envelopes from cultured
keratinocytes, in which only the earliest stages of CE formation have
occurred, prior to the massive deposition of loricrin (17).
We thank Drs. Eleonora Candi, Soo-Il Chung, Laszlo
Fesus, John Folk, Tonja Kartasova, Ulrike Lichti, Edit Tarcsa, and
Stuart Yuspa for their helpful comments during this work. William
Lanahan (Beckman Instruments) provided valuable technical support and
advice.
©1995 by The American Society for Biochemistry and Molecular Biology, Inc.