Differential Extraction and Protein Sequencing Reveals Major Differences in Patterns of Primary Cell Wall Proteins from Plants*

(Received for publication, February 13, 1997, and in revised form, April 17, 1997)

Duncan Robertson‡, Geoffrey P. Mitchell§, John S. Gilroy, Chris Gerrish§, G. Paul Bolwell§, and Antoni R. Slabas‡∥

From the ‡ Department of Biological Sciences, Durham University, South Road, Durham DH1 3LE and the § Division of Biochemistry, School of Biological Sciences, Royal Holloway and Bedford New College, University of London, Egham, Surrey TW20 0EX, United Kingdom


ABSTRACT

The proteins of the primary cell walls of suspension cultured cells of five plant species, Arabidopsis, carrot, French bean, tomato, and tobacco, have been compared. The approach that has been adopted is differential extraction followed by SDS-polyacrylamide gel electrophoresis (PAGE), rather than two-dimensional gel analysis, to facilitate protein sequencing. Whole cells were washed sequentially with the following aqueous solutions: CaCl2, CDTA (cyclohexane diaminotetraacetic acid), DTT (dithiothreitol), NaCl, and borate. SDS-PAGE analysis showed consistent differences between species. From the 233 proteins that were selected for sequencing, 63% gave N-terminal data. This analysis shows that (i) patterns of proteins revealed by SDS-PAGE are strikingly different for all five species, (ii) a large number of these proteins cannot be identified by data base searches, indicating that a significant proportion of wall proteins have not been previously described, (iii) the major proteins that can be identified belong to very different classes of proteins, (iv) the majority of proteins found in the extracellular growth media are absent from their respective cell wall extracts, and (v) the results of the extraction process are indicative of higher order structure. It appears that aspects of speciation reside in the complement of extracellular wall proteins. The data represent a protein resource for cell wall studies complementary to EST (expressed sequence tag) and DNA sequencing strategies.


INTRODUCTION

The plant cell wall is a dynamic system generally considered to be composed of more than 90% carbohydrate polymers. Proteins, phenolics and possibly lipids make up the remainder of the wall (1-3). To date, most research interest has been in the carbohydrate components because of considerations of their structural role and commercial interest. This has led to a number of models for the integration, interpolymeric association, and assembly of the wall (3, 4). By comparison, our knowledge of the complexity of protein in the plant cell wall is in a less advanced state. Much of the understanding of the range of structural wall proteins has come from cDNA and genomic cloning exercises and has led to the identification of glycine-, cysteine-, proline-, and hydroxyproline-rich subsets of wall proteins. In addition, many extracellular enzymes have been identified that are required for the restructuring and modification of this dynamic extracellular matrix which underpin its role in defense, detoxification, signaling, cell-cell recognition, cell expansion, cell adhesion, cell separation, translocation, differentiation, and morphogenesis (2, 5, 6). However, there is a lack of direct studies on the proteins themselves and the true range of extracellular proteins and their species differences remains to be elucidated. The present work describes the systematic extraction and sequencing of the major primary wall proteins from five species representing four families of plants.

Since whole plant tissue is complicated by the presence of different tissue types including lignified secondary walls, we have chosen to use suspension cultures as a source of experimental material due to their relative uniformity. There are a large number of studies which show that molecular probes for proteins and enzymes derived from tissue cultured cells locate in a predictable way in the intact plant, and thus tissue cultures can be a reliable guide to a substantial number of phenomena in the intact plant. To cover a diverse range of species, we have used suspension cultures of Arabidopsis, carrot, French bean, tomato, and tobacco. The reasons for choosing these species relate to their academic and commercial interest. Arabidopsis has not been as extensively used in tissue culture but is currently the target of an extensive EST sequencing exercise, and an international program aimed at the sequencing of its entire genome is under way (7, 8). Carrot, a member of the Umbelliferae, has been an important species in modeling embryogenesis and elongation growth (9). Tissue cultures of the leguminous species, French bean, have been extensively used as a model system for cell wall biosynthesis and modifications during responses to pathogens (2). The two solanaceous species tobacco and tomato allow a comparison of the extent of conservation of wall proteins within one family. Tomato is important commercially and was the subject of the first successful attempts to modify the expression of wall proteins by transformation (10). Tobacco is an important model plant for attempts to modify cellulose extractability by modifying lignification (11).

The aim is to generate a protein resource for the plant cell wall analogous to current efforts of EST and genomic sequencing, so allowing for the future identification of potential biochemical function arising from these exercises. Homology searches of the derived amino acid sequence data from the present study should provide a firm indication of the number of components of the cell wall to which function is still to be ascribed.


MATERIALS AND METHODS

Plant Material

The derivation and maintenance of cultures of Arabidopsis (12), tobacco (13), tomato (14), and French bean (15) have been described previously. The carrot cultures (previously unpublished) were grown in Murashige and Skoog (16) basal salts supplemented with 3% sucrose and 2 mg/ml 2,4-D. Suspension cultures were grown in 100-ml batches in 250-ml Erlenmeyer flasks under a 16-h photoperiod (Arabidopsis, tomato, and carrot) or in the dark (French bean and tobacco) at 24 °C while rotary shaken at 130 rpm and subcultured every 7-10 days.

Extraction and Preparation of Cell Wall Proteins

Cells growing 4-5 days after subculture were harvested by filtration on Miracloth. The cells were washed three times with dH2O (3 ml/g fresh weight). All subsequent manipulations were conducted at 4 °C. The cells were stirred in three volumes of 0.2 M CaCl2 for 30 min, collected by filtration on Miracloth, and washed three more times with dH2O as before. Subsequent extractions, each with three volumes, were carried out sequentially on the same cells for 30 min each, with 50 mM CDTA in 50 mM sodium acetate, pH 6.5, followed by 2 mM DTT, 1 M NaCl, and finally 0.2 M borate, pH 7.5. The borate extraction was conducted at room temperature. Between extractions cells were washed on the filter three times, each with three volumes of dH2O. Extracts and the culture media were refiltered through GF/A paper before being dialyzed against a 10-fold excess of dH2O with three changes. The samples were lyophilized and reconstituted in SDS-PAGE loading buffer (17) prior to analysis. The lyophilized culture filtrates were reconstituted at 30 µg/µl, whereas all the other extracts were reconstituted at 15 µg/µl in gel loading buffer.

Protein Analysis

SDS-PAGE was carried out on 10% gels, using Bio-Rad Mini-PROTEAN II apparatus and stained with Coomassie Brilliant Blue (17). Since protein recoveries varied between the different extracts within each species and also between species the amount of extract loaded per lane was optimized by SDS-PAGE: typically between 10 and 20 µl of reconstituted extract was used per lane. For sequencing purposes, this process was also necessary due to the complexity and uneven abundance of the components within each extract, and more than one loading was used to maximize the acquisition of sequence data for the minor protein bands. For N-terminal amino acid sequence determination, proteins were transferred onto Problot membrane (Applied Biosystems, Inc. (ABI), Foster City, CA) and visualized by Coomassie Blue staining as directed in the Problot manual. Amino acid sequence analysis was performed in an ABI model 477 Sequencer. Sequence similarities were determined using the electronic mail "BLAST" series of programs (18) against the non-redundant protein, non-redundant DNA, and non-redundant EST data bases maintained by the National Center for Biotechnology Information.


RESULTS

Analytical Strategy

Although there have been a number of systematic protein sequencing projects in plants, usually based on two-dimensional isoelectric focusing/SDS-PAGE systems of whole plant extracts (19-22), these are limited in the amount of information that was obtained due to mass limitations of the resolved material. Whereas the resolving power of two-dimensional gels is much higher than that of SDS-PAGE gels, the loading of each band per gel is higher for the latter generating sufficient mass for routine N-terminal analysis; the main problem is resolution. Accordingly we have chosen to reduce the complexity of the initial mixture of proteins by differentially extracting the cell wall from intact suspension cultured cells successively with CaCl2, CDTA, DTT, NaCl, and borate. The rationale for the initial wash was based on the successful use of CaCl2 to extract wall proteins (23). CDTA would be expected to remove any proteins associated with the pectin fraction since calcium would promote such binding. DTT was chosen to reduce any protein-protein interactions based upon cysteine disulfide bonds. NaCl was used to extract any strongly ionically bound proteins. Finally, borate was used to disrupt any interactions due to glycoprotein side chains and other saccharides in the wall. At all stages, intervening water washes were used. For all species, secreted proteins were also characterized in the culture filtrate. Microscopic examination of the cells after these extractions showed the cells to be plasmolyzed, demonstrating that the plasma membrane of the cells remained intact, indicating that none of the extracted components were cytosolic.

Extraction and Patterns of Cell Wall Proteins

SDS-PAGE analysis reveals that the subsets of wall proteins obtained by successively washing whole cells are distinct for each reagent employed and for each species. Sequential extracts are shown for Arabidopsis (Fig. 1A), carrot (Fig. 1B), French bean (Fig. 1C), tomato (Fig. 1D), and tobacco (Fig. 1E). It can be seen for each species that the proteins found in the culture filtrate exhibit a strikingly different profile from that of their respective wall fractions. Similarly, there are distinct subsets of proteins extracted by CaCl2, CDTA, DTT, NaCl, and borate (Fig. 1, A-E). Individual extractions, hereafter referred to as non-sequential extractions, carried out with the same series of reagents on the Arabidopsis and tomato cells yielded different subsets of proteins to those acquired through sequential extraction (Fig. 2, A and B). Since there is clearly a difference between the pattern of proteins extracted using sequential and non-sequential extraction, both approaches were used to maximize sequence data for these two species. It is obvious that within the complex three-dimensional matrix that represents the cell wall there must be some restriction on the dynamic state of the wall or else one would not observe differential extraction.


Fig. 1. SDS-PAGE analysis of the extracellular and cell wall proteins sequentially extracted from intact suspension cultured cells of Arabidopsis (A), carrot (B), French bean (C), tomato (D), and tobacco (E). For each species the proteins were extracted by stirring whole cells in the requisite solution for 30 min beginning with 0.2 M CaCl2, then sequentially with 50 mM CDTA, 2 mM DTT, 1 M NaCl, and finally 0.2 M sodium borate, pH 7.5. Cross-contamination was prevented by rigorously washing the whole cells with dH2O before and after each extraction. Each extract including the culture filtrate was extensively dialyzed against dH2O and lyophilized before SDS-PAGE was carried out on 10% (w/v) discontinuous gels, which were stained with Coomassie Brilliant Blue. Lane 1, molecular weight markers as indicated; lane 2, culture filtrate proteins; lane 3, proteins extracted with CaCl2; lane 4, CDTA-extracted proteins; lane 5, DTT-extracted proteins; lane 6, NaCl-extracted proteins; lane 7, borate-extracted proteins; lane 8, molecular weight markers as indicated. Bands that are labeled are those that were subjected to protein sequencing.


Fig. 2. SDS-PAGE analysis of the cell wall proteins extracted non-sequentially from intact suspension cultured cells of Arabidopsis (A) and tomato (B). The proteins were extracted by stirring whole cells in the requisite solution for 30 min. The extractants were 0.2 M CaCl2, 50 mM CDTA, 2 mM DTT, 1 M NaCl, and 0.2 M sodium borate, pH 7.5. Contamination from the culture filtrate proteins was prevented by rigorously washing the cells with dH2O beforehand. The extracts were extensively dialyzed against dH2O and lyophilized before SDS-PAGE was carried out on 10% (w/v) discontinuous gels. Visualization was by Coomassie Brilliant Blue staining. Lane 1, molecular weight markers as indicated; lane 2, proteins extracted with CaCl2; lane 3, CDTA-extracted proteins; lane 4, DTT-extracted proteins; lane 5, NaCl-extracted proteins; lane 6, borate-extracted proteins; lane 7, molecular weight markers as indicated. Bands that are labeled are those that were subjected to protein sequencing.

Systematic Sequencing

A summary of the proteins from each species for which N-terminal sequencing was attempted is shown in Table I. The N-terminal sequences that were obtained, along with the amino acid yield in the first sequencing cycle and the corresponding protein molecular weights, are listed in Table II.

Table I. A summary of the number of N-terminal protein sequences attempted and determined from the individual plant species


Species        No. of proteins for which    No. of proteins which    No. of unique         No. of sequences unclassified
               sequencing was attempted     gave sequence data       sequences obtained    by homology or function

Arabidopsis     86                           47 (55%)                 31                    19
Carrot          16                           10 (62%)                 10                     9
French bean     26                           21 (81%)                 17                    13
Tomato          78                           46 (59%)                 30                    18
Tobacco         27                           22 (81%)                 20                    15
Total          233                          146 (63%)                108                    74

Table II. The N-terminal amino acid sequences obtained from cell wall and extracellular proteins from suspension cultured cells of Arabidopsis, tomato, tobacco, French bean, and carrot

Proteins were subjected to SDS-PAGE, blotted onto PVDF membrane, and stained with Coomassie Brilliant Blue. Protein bands, the annotation of which is represented in Figs. 1 and 2, were excised and subjected to sequencing. Only those sequences that yielded more than 4 residues are shown. Sequence similarities are shown with the degree of identity displayed in parentheses.

Band Accession no. Mr Initial level (pmol) Sequence Sequence similarity

Sequentially extracted Arabidopsis proteins
Culture filtrate B [GenBank] 70 28 TTRTPLFLGLDEHTADLXFE Arabidopsis subtilisin-like protease (100%) (100%)a
C [GenBank] 65 30 EDRTY
D [GenBank] 60 5 KGVNDGT
E [GenBank] 54 13 KVPVDDQFRRVNNGGATDTR Carrot glycoprotein (75%) (75%)a
F [GenBank] 52 12 EPFIGVNYGQVADNLP Wheat β-1,3-glucanase (69%) (69%)a
G [GenBank] 42 10 EYFIGVN Wheat β-1,3-glucanase (57%)
H [GenBank] 34 12 EQDRR
I [GenBank] 31 20 IALTV
J [GenBank] 30 15 NFQRDVEITWGDMRR Arabidopsis xyloglucan endotransglycosidase (87%) (87%)a
K [GenBank] 25 43 ASSSSEDFDFFYFVQQGXP Arabidopsis extracellular ribonuclease (84%) (84%)a
L [GenBank] 23 16 ASSSSEDFDFFY Arabidopsis extracellular ribonuclease (100%)
CaCl2 A [GenBank] 96 12 AVREYHWFVE
D [GenBank] 60 30 NPNYKEALSK Arabidopsis cellulase (100%)
E [GenBank] 59 30 KVPVDDQFRR Carrot glycoprotein (100%)
F [GenBank] 54 40 KVPVD Carrot glycoprotein (100%)
[GenBank] 30 NPNYK Arabidopsis cellulase (100%)
G [GenBank] 52 20 ATLTVFFRDN
I [GenBank] 44 5 HLKYKDPEQG
K [GenBank] 36 30 ADRELHRSKA
O [GenBank] 30 39 NPNYKEALSKSLLFFQGQRR Arabidopsis cellulase (95%) (95%)b
Q [GenBank] 25 100 RIPGIYSGGAWQNAHATFYG Arabidopsis expansin (75%) (90%)a,b
R [GenBank] 23 57 IPCRKAIDVPFGXRYVVXTW Arabidopsis xyloglucan endotransglycosidase (65%) (85%)a,b
S [GenBank] 18 13 ADLTR
T [GenBank] 17 30 ADREPNHFVA
CDTA A [GenBank] 55 10 EATVDMPLD
D [GenBank] 36 8 NPNY Arabidopsis cellulase (100%)
DTT B [GenBank] 45 30 ARKFF Arabidopsis triose-phosphate isomerase (100%)
NaCl H [GenBank] 22 24 ARKFFVG Arabidopsis triose-phosphate isomerase (100%)

Non-sequentially extracted Arabidopsis proteins

CDTA A [GenBank] 72 9 KVPV Carrot glycoprotein (100%)
B [GenBank] 69 8 KVPVDDQFR Carrot glycoprotein (100%)
I [GenBank] 39 16 KDLXHRDDKT
J [GenBank] 36 16 KDLXHRDDKT
K [GenBank] 33 18 IPCRKAIDVVF Arabidopsis xyloglucan endotransglycosidase (82%)
DTT A [GenBank] 68 14 AVPPRYGYTRG
B [GenBank] 64 8 MREIEHIPPP
F [GenBank] 36 70 ARKFFVGRNWPEL Arabidopsis triose-phosphate isomerase (62%) (69%)a,b
G [GenBank] 30 20 ARKFFV Arabidopsis triose-phosphate isomerase (100%)
H [GenBank] 25 10 VLTIYA
NaCl A [GenBank] 62 5 YKTIGKGYR Mouse T cell receptor (55%)
B [GenBank] 60 10 DVGKFK
D [GenBank] 48 9 DNPSSTPPLR
E [GenBank] 46 25 NPNYKEALSKSLLFFQGQRR Arabidopsis cellulase (95%)
F 38 43 APXSEGY
  K  TVRF
G [GenBank] 36 10 EDLPEK
H [GenBank] 33 20 APQEPNQFQLLKYH Maize cytochrome P450 (50%) (75%)a
I [GenBank] 31 12 SDRELHRSKAAYFF
J [GenBank] 27 10 VDTSRLFLTVVNNPPTVV Arabidopsis hypothetical protein (67%)
K [GenBank] 25 70 RIPGIYSGGAWQNAHATFYG Arabidopsis expansin (65%)

Sequentially extracted carrot proteins

CaCl2 A [GenBank] 68 6 EPPYRLVDN
B [GenBank] 66 6 GPLNAQHQS
C [GenBank] 58 10 QLAELKYVI
D [GenBank] 46 7 DLSNLLSRVPNERSN
E [GenBank] 43 9 GVREDTYPDVVXTA
F [GenBank] 32 8 AEYPNDVNLTVYWDP
G [GenBank] 30 19 SEVGALVFQPKTRF
H [GenBank] 24 8 AHSDAVTPLPARSKV Human genomic sequence (69%)
CDTA C [GenBank] 56 5 SQEDTPL
E [GenBank] 30 5 ATNPSGQ

Sequentially extracted French bean proteins
Culture filtrate A [GenBank] 41 20 NYDKPOVEKPOVYKPOVEKPOVY Proline-rich protein (contains 3 repeating blocks of 4 amino acids)
B [GenBank] 36 3 NYDKPOVEKP Proline-rich protein
CaCl2 A [GenBank] 230 8 NMYLPOVOOOOVVPTF
B [GenBank] 140 48 NHYSYSSOOOOOVVSS Extensin
C [GenBank] 136 10 NYDKPOVEKPOVYK Proline-rich protein (see bean culture filtrate, A)
D [GenBank] 84 8 NYDKPOVEKPOVYKP Proline-rich protein (see bean culture filtrate, A)
E [GenBank] 65 7 EDPVRFNLG
F [GenBank] 60 8 EDAYKFTTW
G [GenBank] 53 10 VAGRSVVKIAEGYL
H [GenBank] 46 5 KPDPEAVLIV
I [GenBank] 45 7 NSKPPEALILVKXSQ
J [GenBank] 44 10 SHDKPDHIRLFELKKDDLLISVHNA
K [GenBank] 43 24 YDKKVDSIILFGVNG
L [GenBank] 42 30 NYDKPOVEKPOVYKPOVEKPOVYKP Proline-rich protein (see bean culture filtrate, A)
M [GenBank] 35 12 ELPVNFYALNLTADNINIGY
O [GenBank] 33 10 NYDKNFYEDTLP
P [GenBank] 26 5 EYPVVFVKGLFFGKG
Q [GenBank] 22 25 QNQPPDFANOFIIPQNAA
CDTA C [GenBank] 36 100 DVNGGGHTLPQPLYQTTVVL Desulfovibrio desulfuricans periplasmic Fe hydrogenase (75%)
D [GenBank] 33 20 AGVDPAIPAYVKTNG
E [GenBank] 30 10 MGQGAVEGQLFYNVQ Pseudomonas fluorescens protein F (80%)

Sequentially extracted tomato proteins

Culture filtrate A [GenBank] 65 23 EGKAIGLAKPRMDST
D [GenBank] 35 20 EQFDEEFDIT
E [GenBank] 31 200 EQXGSQAGGALRAGL French bean chitinase (73%)
G [GenBank] 25 8 STDFDFNN
[GenBank] 14 AKDFDFFYFVQQWP Tomato extracellular ribonuclease (100%)
H [GenBank] 24 173 AKDFDFFYFVQQWPGGYYDTPKQPKQ Tomato extracellular ribonuclease (69%) (65%)a,b
Iac [GenBank] 23 112 AKDFD Tomato extracellular ribonuclease (100%)
Ib [GenBank] 22 30 KSTDFDYNNKKANYD
Ic [GenBank] 21 15 SNAVAVLNXXEXM
CaCl2 D [GenBank] 64 10 ANAKVPSH
7 KT  PPR  PSH
E [GenBank] 62 6 EVPLDDTGL
F [GenBank] 50 8 EVLYIPVTTDA Human Ig heavy chain DJ region (64%)
G [GenBank] 44 18 VAGKSFVPIAAGRQ Tobacco P7 curled leaf protein (64%)
H [GenBank] 40 13 VAGK Tobacco P7 curled leaf protein (100%)
I [GenBank] 39 5 SPVEGGPXGXL
J [GenBank] 35 8 EQXGRQRQGG French bean chitinase (70%)
K [GenBank] 34 9 EQXGSQA French bean chitinase (86%)
L [GenBank] 33 7 EQXGSQA French bean chitinase (86%)
O [GenBank] 28 15 ADREP
35 ALVED
P [GenBank] 27 18 TGVNYGQLGNNLP Tobacco β-1,3-glucanase (77%) (85%)a
Q [GenBank] 23 100 TNPNFILTL Tomato osmotin (100%)
CDTA E [GenBank] 34 10 EQXGS French bean chitinase (100%)
F [GenBank] 31 18 EQXGS French bean chitinase (100%)
G [GenBank] 23 8 TNPNF Tomato osmotin (100%)
NaCl E [GenBank] 34 38 EQXGSQAGGA French bean chitinase (90%)
F [GenBank] 30 10 SNPNFILTLV Tomato osmotin (90%)
G [GenBank] 23 8 ANPEVRNNLP

Non-sequentially extracted tomato proteins

DTT A [GenBank] 62 3 MEKGYYDLES
C [GenBank] 40 18 ANDPDFPYTVQANRP
E [GenBank] 35 20 EQXGSQ French bean chitinase (100%)
F [GenBank] 22 8 MNIPPGD
NaCl A [GenBank] 80 30 STHTSDFLKL
C [GenBank] 76 20 STRTPEFLGLDNQCGVWA Tomato subtilisin protease (50%)
E [GenBank] 66 21 GYMKYKDPKQPLLGRRXD Barley β-exoglucanase (55%) (61%)b
G [GenBank] 64 65 ANAKVPSHTISNPF
H [GenBank] 49 22 EVLYI Human Ig heavy chain DJ region (60%)
I [GenBank] 47 15 VAGK Tobacco P7 curled leaf protein (100%)
J [GenBank] 44 60 VAGKSFVPIALGRQSKQTPF Tobacco P7 curled leaf protein (55%)
K [GenBank] 40 60 GPVEIYYLQSADAKG
L [GenBank] 38 25 VKIGTYELLKGDFSV
[GenBank] 38 ELQLNYYTKSXXRAE Tomato peroxidase (72%)
M [GenBank] 35 12 ELQLNYYTKSWXRAE Tomato peroxidase (72%)
O [GenBank] 23 12 ALVEDPQMQKYHKH
Borate B [GenBank] 44 10 VAGKSFV Tobacco P7 curled leaf protein (100%)
C [GenBank] 42 4 HPVEI
D [GenBank] 40 8 ELQLNY Tomato peroxidase (67%)
H [GenBank] 23 14 SNPNFILTLVNNVPYTIWPA Tomato osmotin (90%) (70%)b
I [GenBank] 20 5 EYIPFIHEWV

Sequentially extracted tobacco proteins

Culture filtrate A [GenBank] 70 6 GLVPPADKY
B [GenBank] 41 31 AVNGGPATLPEYQI Human 40-kDa urinary tract integral stone protein (64%)
C [GenBank] 34 10 AHVEVPNSLY
CaCl2 A [GenBank] 67 11 STIEVRNNSPYYSVD Tobacco osmotin (60%)
B [GenBank] 66 9 VPPAVWNSXNYNS
C [GenBank] 57 10 GEQPGDQARGARNPXGNN Tobacco chitinase (55%) (62%)b
D [GenBank] 56 7 QDPYVDFLK
E [GenBank] 52 8 AQPPQQADFL
F [GenBank] 50 5 FYAGLILTLVNTFPYNISPASS Tobacco protein P10 (64%) (59%)b
G [GenBank] 48 9 QYVKDPDKQVVARIFLDLQLVQR
H [GenBank] 47 40 ATIEVYNILPYYYYVSKAWSWNGN
I [GenBank] 46 8 GEQPGDQARGARPXGNN Tobacco chitinase (47%)
J [GenBank] 45 8 QDAYRFLXTHTYG
K [GenBank] 40 5 QPEESVFFA
L [GenBank] 34 38 WPXAQIFSAVRGXVN
M [GenBank] 33 7 EQCQDMAGGR French bean chitinase (72%)
N [GenBank] 28 11 IWVGISYKIHSLYFQ
O [GenBank] 27 28 GYPRKXVDVFTFTN
P [GenBank] 15 10 TIEEVLNLPPYVVAA
Q [GenBank] 14 12 AVFVILTNVYT
CDTA B [GenBank] 46 3 GPEEWVK
E [GenBank] 33 5 EQCQDMAGGAR French bean chitinase (72%)

a Sequences which have homologies to Arabidopsis ESTs.
b Sequences which have homologies to ESTs from species other than Arabidopsis.
c This band could be separated into 4 more by increasing the strength of the resolving gel from 10 to 14% (w/v) acrylamide.

Arabidopsis Cell Wall Proteins

Of the 86 proteins selected for sequencing from the Arabidopsis extracts (Fig. 1A), 47 gave sequence data, two of which yielded double sequences: band F in the CaCl2 extract and band F in the non-sequential NaCl extract. In the case of band F from the CaCl2 extract, it was possible to decipher the double sequence into separate sequences. This was performed by a process of subtraction since both the sequences within the CaCl2 band F also appear as single sequences at other molecular weights within the same CaCl2 extract, i.e. CaCl2 band E and CaCl2 band O. Discounting sequences that were present at either more than one molecular weight or were within other fractions left 31 unique sequences, which were found within the Arabidopsis wall extracts. The presumptive glycoprotein sequence starting "KVPV" is present in several extracts at more than one molecular weight (culture filtrate, band E; CaCl2, bands E and F; non-sequential CDTA, bands A and B) and may represent differential glycosylation. Evidence for microheterogeneity in primary sequence can be seen in bands F and G of the Arabidopsis culture filtrate proteins, which contained identical sequences except for the tyrosine or proline at position two. Bands K and L of the culture filtrate were found to contain identical sequences beginning with "ASSS." The sequence beginning "NPNY" was found in the CaCl2 extract, bands D, F, and O; the sequential CDTA extract, band D; and the non-sequential NaCl extract, band E. The sequence beginning with "IPCR" was found in the CaCl2 extract, band R and the non-sequential CDTA extract, band K. The sequence beginning "RIPG" was found within the CaCl2 extract, band Q and the non-sequential NaCl extract, band K. The sequence beginning with "ARKF" was seen in the sequential DTT and NaCl extracts, bands B and H, respectively, and also within bands F and G of the non-sequential DTT extract. Bands I and J from the non-sequential CDTA extract were also found to contain the same sequence beginning with "KDLX."
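The subtraction used to resolve a double sequence can be illustrated with a minimal sketch. It assumes that each sequencing cycle of a mixed band reports exactly two residues and that one component is already known from a band where it occurs alone. The function name `subtract_known` and the per-cycle readout `band_f_cycles` are hypothetical; the readout is assembled here from the two five-residue sequences listed for Arabidopsis CaCl2 band F in Table II and is not raw sequencer output.

```python
def subtract_known(mixed_cycles, known):
    """Return the residual sequence after removing `known` from each cycle.

    mixed_cycles: one string per Edman cycle holding the two residues seen.
    known: the component already observed alone in another band.
    """
    residual = []
    for cycle, known_residue in zip(mixed_cycles, known):
        remaining = list(cycle)
        if known_residue not in remaining:
            raise ValueError(f"{known_residue!r} not found in cycle {cycle!r}")
        remaining.remove(known_residue)  # removes a single copy only
        if len(remaining) != 1:
            raise ValueError(f"cycle {cycle!r} is not a two-component mixture")
        residual.append(remaining[0])
    return "".join(residual)


# Hypothetical readout for Arabidopsis CaCl2 band F (assumption: built from
# the KVPVD and NPNYK entries of Table II, not from instrument data).
band_f_cycles = ["KN", "VP", "PN", "VY", "DK"]

if __name__ == "__main__":
    print(subtract_known(band_f_cycles, "KVPVD"))  # NPNYK
    print(subtract_known(band_f_cycles, "NPNYK"))  # KVPVD
```

Removing only one copy of the known residue per cycle means the sketch still behaves sensibly if both components happen to carry the same residue at a given position.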

Carrot Cell Wall Proteins

All of the 10 carrot sequences were unique to carrot and were different from each other.

French Bean Cell Wall Proteins

From the 26 French bean proteins that were processed for sequencing (Fig. 1C), 21 sequences were obtained, of which six shared the same N-terminal four amino acids: "NYDK." These sequences (culture filtrate, bands A and B; CaCl2, bands C, D, L, and O) were identical apart from the CaCl2 extract's band O, where the sequence diverged substantially after the fourth amino acid. All of the other bean sequences were only seen in this species and were not represented at multiple molecular weights. Not counting sequences that appeared more than once left a total of 17 unique sequences in the bean extracts.
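The count of unique bean sequences can be reproduced with a short sketch. It rests on one assumption that is not stated in the text: two reads are treated as the same sequence when they agree over their shared length, since runs were often terminated early. The names `BEAN_READS`, `compatible`, and `cluster_reads` are illustrative only; the reads are transcribed from the French bean entries of Table II.

```python
# N-terminal reads for French bean, transcribed from Table II
# (culture filtrate A-B, CaCl2 A-Q, CDTA C-E).
BEAN_READS = [
    "NYDKPOVEKPOVYKPOVEKPOVY", "NYDKPOVEKP",
    "NMYLPOVOOOOVVPTF", "NHYSYSSOOOOOVVSS", "NYDKPOVEKPOVYK",
    "NYDKPOVEKPOVYKP", "EDPVRFNLG", "EDAYKFTTW", "VAGRSVVKIAEGYL",
    "KPDPEAVLIV", "NSKPPEALILVKXSQ", "SHDKPDHIRLFELKKDDLLISVHNA",
    "YDKKVDSIILFGVNG", "NYDKPOVEKPOVYKPOVEKPOVYKP",
    "ELPVNFYALNLTADNINIGY", "NYDKNFYEDTLP", "EYPVVFVKGLFFGKG",
    "QNQPPDFANOFIIPQNAA",
    "DVNGGGHTLPQPLYQTTVVL", "AGVDPAIPAYVKTNG", "MGQGAVEGQLFYNVQ",
]


def compatible(a, b):
    """True if two reads agree at every position they both cover."""
    n = min(len(a), len(b))
    return a[:n] == b[:n]


def cluster_reads(reads):
    """Greedily merge compatible reads, keeping the longest as representative."""
    representatives = []
    for read in reads:
        for i, rep in enumerate(representatives):
            if compatible(read, rep):
                if len(read) > len(rep):
                    representatives[i] = read
                break
        else:
            representatives.append(read)
    return representatives


if __name__ == "__main__":
    unique = cluster_reads(BEAN_READS)
    print(len(BEAN_READS), "reads,", len(unique), "unique")
```

Under this assumption the five identical "NYDK" reads collapse into one representative while CaCl2 band O, which diverges after the fourth residue, is kept separate. The greedy merge depends on input order in general, so it is a convenience for small data sets like this one rather than a rigorous clustering method.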

Tomato Cell Wall Proteins

Seventy-eight proteins were selected for sequencing from the tomato extracts (Fig. 1D), of which 46 yielded sequence information. Four of these proteins contained double sequences: culture filtrate band G, CaCl2 bands D and O, and the non-sequential NaCl band L, all of which were deciphered into two individual sequences. This was possible by subtraction, in a similar way to deciphering the Arabidopsis double sequences. From the 46 sequences obtained, 30 were unique. The shortfall is due to a total of nine sequences that were found at more than one position or in more than one extract. Bands containing the sequence beginning "AKDF" were seen in the culture filtrate as bands G, H, and Ia (band I could be resolved into four more components by increasing the polyacrylamide concentration from 10 to 14% in the electrophoresis separating gel). The sequence beginning "KSTD" was also seen in band Ib of the culture filtrate and as the second component of band G of the same extract except that the N-terminal residue Lys was absent in the latter case. The sequence beginning "EQXG" was seen a total of eight times: culture filtrate, band E; CaCl2, bands J, K, and L; the sequential CDTA extract, bands E and F; the sequential NaCl extract, band E; and the non-sequential DTT extract, band E. However, the CaCl2 band J sequence was only the same up to the sixth residue, after which it diverged compared with culture filtrate band E. The sequence beginning "ANAK" appeared twice: once in the non-sequential NaCl extract, band G, and as one of the two short sequences that belong to CaCl2 band D. Bands containing the sequence beginning with "VAGK" were CaCl2, bands G and H; non-sequential NaCl extract, bands I and J; and band B of the non-sequential borate extract. These sequences were identical except residue 11, which was Ala in band G of the CaCl2 extract and Leu in band J of the non-sequential NaCl extract. Sequences beginning with either "SNPN" or "TNPN" appeared four times: twice beginning with TNPN, as band Q in the CaCl2 extract and as band G in the sequential CDTA extract, and twice beginning with SNPN, as band F in the sequential NaCl extract and as band H in the non-sequential borate extract. Apart from this heterogeneity at the N terminus, all four sequences were identical. The sequence beginning "ALVE" appeared as band O in the non-sequential NaCl extract and as one of the two short sequences that were identified from band O of the CaCl2 extract. Incidentally, the second of the short sequences from band O of the tomato CaCl2 extract (beginning "ADRE") was also noted within the Arabidopsis CaCl2 extract, band T. Bands containing the sequence beginning with "EVLY" were seen in the CaCl2 extract, band F and in the non-sequential NaCl extract, band H. Finally, the sequence beginning with "ELQL" was seen in the non-sequential NaCl extract within bands L and M, and in the non-sequential borate extract, band D.

Tobacco Cell Wall Proteins

Twenty-seven of the protein bands from the tobacco extracts were selected for sequencing (Fig. 1E). Only two of the sequences were found to be either in more than one of the tobacco extracts or at more than one molecular weight within the same extract. The first begins with "GEQP" and was in the CaCl2 extract as band C and band I. The sequence from band C was 18 amino acids long and that of band I, 17 amino acids long; both were identical, except in the case of band C's sequence, which had an extra Asn present at amino acid position 13. The second tobacco sequence to appear in two different extracts began with "EQCQ" and was found within the CaCl2 extract, band M and the CDTA extract, band E. All the other tobacco sequences are listed as they appear in Table II. Discounting any sequences that appeared more than once in these extracts left a total of 20 unique tobacco sequences.

Overall Series of Sequencing Cell Wall Proteins

On average 63% of the proteins from all of the plant species proved to be sequenceable, but these were only from the bands selected for analysis that were generated in this study. Generally the culture filtrate proteins yielded a high success rate in terms of the numbers of proteins that were sequenced: 92%, 11 out of 12, for Arabidopsis; 67%, 8 out of 12, for tomato; and 100% for bean and tobacco (two and three proteins, respectively). In contrast, the two culture filtrate proteins selected from the carrot culture filtrate did not sequence. Those proteins extracted with CaCl2 also gave a high success rate when sequenced: 60%, 12 out of 20, for Arabidopsis; 100%, eight out of eight, for carrot; 94%, 16 out of 17, for French bean; 71%, 12 out of 17, for tomato; and 100%, 17 out of 17, for tobacco. In those extracts that were made sequentially after the initial CaCl2 wash, the number of proteins that were successfully sequenced from those selected dropped off significantly. For example, in the sequential CDTA extracts 40%, two out of five, for Arabidopsis; 43%, three out of seven, for tomato; 40%, two out of five, for carrot; 60%, three out of five, for bean; and 33%, two out of six, for tobacco were sequenceable. In the subsequent sequential extracts where protein bands were targeted for sequencing, considerably fewer proteins were amenable to this type of analysis. It was because of this continuing decline in successfully obtaining sequences that paralleled each step in the sequential extraction that none of the sequential borate proteins were selected for sequencing.
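The decline in sequencing success between extraction steps can be made explicit with a small calculation. The sketch below uses only the counts quoted in the paragraph above; the pooled figures it prints are derived here for illustration and are not values reported in the study, and the names `COUNTS`, `pooled`, and `percent` are hypothetical.

```python
# (sequenced, selected) per species for each extract, as quoted in the text.
COUNTS = {
    "culture filtrate": {
        "Arabidopsis": (11, 12), "tomato": (8, 12), "French bean": (2, 2),
        "tobacco": (3, 3), "carrot": (0, 2),
    },
    "CaCl2": {
        "Arabidopsis": (12, 20), "carrot": (8, 8), "French bean": (16, 17),
        "tomato": (12, 17), "tobacco": (17, 17),
    },
    "CDTA": {
        "Arabidopsis": (2, 5), "tomato": (3, 7), "carrot": (2, 5),
        "French bean": (3, 5), "tobacco": (2, 6),
    },
}


def pooled(extract):
    """Sum sequenced and selected bands over all species for one extract."""
    sequenced = sum(s for s, _ in COUNTS[extract].values())
    selected = sum(n for _, n in COUNTS[extract].values())
    return sequenced, selected


def percent(sequenced, selected):
    """Success rate as a percentage of the bands selected."""
    return 100.0 * sequenced / selected


if __name__ == "__main__":
    for extract in COUNTS:
        sequenced, selected = pooled(extract)
        print(f"{extract:16s} {sequenced}/{selected} = {percent(sequenced, selected):.0f}%")
```

Pooling across species is a simplification, since the per-species rates vary widely, but it shows the same drop from the CaCl2 extract to the sequential CDTA extract that the text describes.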

Homology Searches

The identities given for each sequence similarity listed in Table II can be defined as the percentage of amino acid matches for that query sequence against a sequence found in a particular data base. It should be noted that this figure does not take into account any conservative substitution. Throughout this study there have been numerous examples of a single sequence that was either present at more than one molecular weight within the same extract, or was present in more than one extract. It was therefore general practice to terminate sequence runs after the first four or five sequencing reaction cycles if this was observed. However, it is clear that microheterogeneity is occasionally seen between almost identical sequences, especially when longer sequences were obtained.
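The identity figure defined above can be sketched as follows. This is a simplified, ungapped comparison that assumes the query and the data base sequence are already aligned position by position and that the denominator is the query length; it is not the BLAST algorithm itself. The subject string in the example is a stand-in taken from the "TNPN" form of the tomato osmotin-like reads in Table II, not an actual data base entry.

```python
def percent_identity(query, subject):
    """Exact residue matches as a percentage of the query length.

    Positions are compared without gaps, and conservative substitutions
    (for example Ser for Thr) count as mismatches.
    """
    if not query:
        raise ValueError("query must not be empty")
    matches = sum(1 for q, s in zip(query, subject) if q == s)
    return 100.0 * matches / len(query)


if __name__ == "__main__":
    # Tomato NaCl band F read against the hypothetical stand-in subject.
    print(percent_identity("SNPNFILTLV", "TNPNFILTLV"))  # 90.0
```

The single Ser/Thr difference at the N terminus lowers the score to 90% even though the substitution is conservative, which is the limitation the text points out.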

In the case of Arabidopsis, the classic model plant, which is a member of the Brassicaceae (a family that also includes plants such as oil seed rape and cauliflower), the proteins that could be identified were a subtilisin-like protease, β-1,3-glucanase, two xyloglucan endotransglycosidases, an extracellular ribonuclease, cellulase, a carrot-like glycoprotein, an Arabidopsis hypothetical protein, and an expansin. Those that showed lower sequence similarities, with less than 65% identity, were a cytochrome P-450, triose-phosphate isomerase, and a mouse T-cell receptor. Nineteen proteins did not show any similarity to any member of the data bases searched.

The member of the Umbelliferae, carrot, was included because it has been used to mimic elongation growth and embryogenesis. Only one of the 10 sequences obtained from the carrot wall fractions bore any similarity to data base sequences and, unlikely as it may seem, it corresponded to an unknown genomic DNA sequence from human.

The legume, French bean, has been used to model both differentiation and elicitor-induced pathogen stress (2). Identified wall proteins include extensin and a hybrid proline-rich, cysteine-rich chitin-binding protein (2). The bean culture was the only one in this study that led to the identification of hydroxyproline-rich glycoproteins. There were also two proteins that had a degree of similarity to two bacterial proteins, one of which was an iron hydrogenase. Thirteen of the French bean proteins were unlike any members present within the data bases searched.

For the two model solanaceous species, tobacco and tomato, several proteins common to both species could be identified in similarity searches and, on the basis of these results, may therefore have similar functions. For example, both species appeared to have forms of osmotin, chitinase, and pathogen-related proteins: tobacco P7 curled leaf protein in the case of tomato extracts and tobacco protein P10 in the case of the tobacco extracts. The tomato extracts were also found to contain proteins that bore similarity to other known extracellular proteins, namely an extracellular ribonuclease, β-1,3-glucanase, subtilisin-like protease, β-exoglucanase, and a peroxidase. One of the other tomato proteins bore similarity to a human IgG heavy chain fragment. Altogether, 18 of the tomato proteins could not be identified by data base searches. Whereas the tomato fractions only contained a single form of chitinase, similar to one from French bean, the tobacco extracts were found to have two forms, one resembling a French bean chitinase and the other a tobacco chitinase. The tobacco extracts also contained a protein that bore similarity to an integral stone protein associated with the human urinary tract. Altogether, 15 of the tobacco proteins could not be identified by data base searches.


DISCUSSION

The proteins of the plant extracellular matrix comprise a subset that contains both structural proteins and enzymes. Many of these have been identified at the protein level, but a large number have also been characterized from gene sequences on the basis of a repeat motif or a secretory leader sequence. In comparison to present knowledge of the proteins of some metabolic pathways such as the Calvin cycle, shikimic acid, and lignin biosynthetic pathways, the components of which have been completely cloned, the proteins and their cognate genes of the extracellular matrix are poorly characterized. This is probably due in part to the relative inaccessibility of wall proteins in the plant. Additionally, differentiation of the cells, which in part resides in profound changes in wall structure, and the extent to which individual proteins are immobilized within these structures increase complexity. One approach to increase knowledge of the range of wall proteins is to use tissue-cultured cells, which can be grown in bulk to allow characterization of proteins of walls that resemble primary walls. These cultures can also be stimulated to mimic developmental processes such as elongation growth and embryogenesis in carrot cells or xylogenesis and secondary wall formation as in Zinnia or French bean (24). A wide range of cells have also been used to mimic aspects of pathogen stress using fungal elicitors. A distinct advantage of cell suspensions is that they can be subjected to washing regimes that elute non-covalently bound proteins without disrupting the cells. Our studies demonstrate, by comparing the plant species, that (i) patterns of proteins revealed by SDS-PAGE are strikingly different, (ii) a large number of these proteins that were sequenced cannot be identified by data base searches, and (iii) the major proteins that can be identified belong to very different classes of proteins.
It appears that aspects of speciation reside in the complement of extracellular and cell wall proteins. Although it is difficult to discriminate between an extracellular protein and a cell wall protein in planta, this is not the case with suspension-cultured cells, as exemplified by the fact that most of the protein sequences found within the culture filtrates were absent from the subsequent salt-eluted extracts. Moreover, the culture filtrate profile for each individual species was different from any of its respective wall extracts (Fig. 1, A-E).

The use of tissue cultures to model structures and processes in the whole plant has been validated by a large number of studies. These show that purification of proteins and cloning of cDNAs that originate through this route give rise to antibodies and cDNAs that can be used on the whole plant and locate to the predicted tissues and cells. Examples include structural proteins such as glycine-rich proteins (25), hydroxyproline-rich glycoproteins (5), proline-rich proteins (26), and enzymes such as peroxidases (23), chitinases, and laccases (27). Analyses of the sequences generated in this study validate the methodology, since many are known wall proteins and the cells remain intact. Only two potential cytoplasmic proteins have been revealed, a triose-phosphate isomerase and a P-450, both of which showed a relatively low degree of identity. Indeed, the types of extracellular proteins we have come across and identified through homology searches encompass the carbohydrate-modifying enzymes such as xyloglucan endotransglycosidases, cellulases, and glucanases; other examples of wall proteins are the expansin, peroxidase, protease, ribonuclease, chitinases, extensin, and proline-rich protein.

There are a number of reasons why certain known wall proteins may be absent from this study. The most conspicuous example is perhaps members of the hydroxyproline-rich glycoprotein family, only two of which were detected in the bean extracts. Their absence could be because they are present as minor components of the wall and consequently not detected here, or because they are only expressed in response to a particular environmental or developmental cue that is not present within the culture regimes (28). The proteins could be N-terminally blocked, or it may also be that they are not readily extracted by these reagents, or that the more heavily glycosylated types are not stained with the Coomassie dye. It would seem that many of the enzyme activities associated with the wall are known only through function and so may not yet be present within the data bases; they are still waiting to be characterized by purification and subsequent sequencing.

For certain sequences within each species, there also existed a certain degree of heterogeneity in terms of the occasional amino acid substitution and also their appearance at different molecular weights. The former may be explained by the proteins originating from different genes and the latter by post-translational mechanisms such as glycosylation.

Cloning of any of the protein sequences reported here will undoubtedly be accelerated by the ever increasing numbers of ESTs that are being characterized. One surprising outcome of this research is that the sequenced Arabidopsis proteins did not have a larger proportion of "hits" in the EST data base. This could be because the majority of ESTs do not represent full-length cDNAs. Greater exploitation of the EST data base would thus require extensive internal amino acid sequence data to be generated if this were the case. It is envisaged that, within the foreseeable future, most if not all of the genes in several plants will be available in public access data bases. However, it has also been estimated that the function of at least 70% of the Arabidopsis genes alone, at present, cannot be identified by sequence homology to genes or proteins that have already been designated a function (7, 8). Therefore, having prior knowledge relating to a gene, such as its expression pattern or the eventual location of the expressed protein, may help in elucidating function. This information should therefore complement new data from systematic DNA sequencing exercises in that it gives a direct localization of the protein identified in this way.


FOOTNOTES

*   The costs of publication of this article were defrayed in part by the payment of page charges. The article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
‡   Supported by a BBSRC ROPA award.
   Supported in part by a grant from the BBSRC.
‖   To whom correspondence should be addressed. Fax: 44-191-374-2417; E-mail: a.r.slabas@durham.ac.uk.
1   The abbreviations used are: EST, expressed sequence tag; CDTA, cyclohexane diaminotetraacetic acid; DTT, dithiothreitol; PAGE, polyacrylamide gel electrophoresis.

REFERENCES

  1. Fry, S. C. (1995) Annu. Rev. Plant Physiol. Mol. Biol. 46, 497-520
  2. Bolwell, G. P. (1993) Int. Rev. Cytol. 146, 261-324
  3. Carpita, N. C., and Gibeaut, D. M. (1993) Plant J. 3, 1-30
  4. Roberts, K. (1990) Curr. Opin. Cell Biol. 2, 920-928
  5. Showalter, A. M. (1993) Plant Cell 5, 9-23
  6. Kieliszewski, M. J., and Lamport, D. T. A. (1994) Plant J. 5, 157-172
  7. Newman, T., de Bruijn, F. J., Green, P., Keegstra, K., Kende, H., McIntosh, L., Ohlrogge, J., Raikhel, N., Somerville, S., Thomashow, M., Retzel, E., and Somerville, C. (1994) Plant Physiol. 106, 1241-1255
  8. Cook, R., Raynal, M., Laudie, M., Grellet, F., Delseny, M., Morris, P. C., Guerrier, D., Giraudat, J., Quigley, F., Claubalt, G., Li, Y. F., Mache, R., Krivitsky, M., Gy, I., Kreis, M., Lecharny, A., Parmentier, Y., Marbach, J., Fleck, J., Clement, B., Phillips, G., Herve, C., Barded, C., Tremousaygue, D., Lescure, B., Laccome, C., Roby, D., Jourjon, M. F., Chabrier, P., Charpenteau, J. L., Desprez, T., Amselem, J., Chiapello, H., and Hofte, H. (1996) Plant J. 9, 101-124
  9. Kreuger, M., and van Holst, G. J. (1993) Planta 189, 243-248
  10. Smith, C. J. S., Watson, C. F., Ray, J., Bird, C. R., Morris, P. C., Schuch, W., and Grierson, D. (1988) Nature 334, 724-726
  11. Elkin, Y., Edwards, R., Mavendad, M., Hedrick, S. A., Ribak, O., Dixon, R. A., and Lamb, C. J. (1990) Proc. Natl. Acad. Sci. U. S. A. 87, 9057-9061
  12. May, M. J., and Leaver, C. J. (1993) Plant Physiol. 103, 621-627
  13. Memelink, J., Hoge, J. H. C., and Schiperoort, R. A. (1987) EMBO J. 6, 3579-3583
  14. Fry, S. C. (1988) The Growing Plant Cell Wall, Longman, Harlow, United Kingdom
  15. Dixon, R. A., and Lamb, C. J. (1979) Biochim. Biophys. Acta 586, 453-463
  16. Murashige, T., and Skoog, F. (1962) Physiol. Plant. 15, 473-479
  17. Laemmli, U. K. (1970) Nature 227, 680-685
  18. Altschul, S. F., Gish, W., Miller, W., Myers, E. W., and Lipman, D. J. (1990) J. Mol. Biol. 215, 403-410
  19. Kamo, M., Kawakami, T., Miyatake, N., and Tsugita, A. (1995) Electrophoresis 16, 423-430
  20. Tsugita, A., Kawakami, T., Uchiyama, Y., Kamo, M., Miyatake, N., and Nozu, Y. (1994) Electrophoresis 15, 708-720
  21. Bauw, G., De Loose, M., Inze, D., Van Montagu, M., and vandeKerkove, J. (1988) Proc. Natl. Acad. Sci. U. S. A. 84, 4806-4810
  22. Bauw, G., Rasmussen, H. H., Van den Bulcke, M., Van Damme, J., Puype, M., Gesser, B., Celes, J. E., and Vanderkerchove, J. (1990) Electrophoresis 11, 528-536
  23. Smith, J. J., Muldoon, E. P., and Lamport, D. T. A. (1984) Phytochemistry 23, 1233-1239
  24. Bolwell, G. P., and Robertson, D. (1997) in Morphogenesis in Plant Tissue Culture (Soh, W. Y., ed), Kluwer Academic Publishers, Dordrecht, The Netherlands, in press
  25. Keller, B. (1993) Plant Physiol. 101, 1127-1130
  26. Datta, K., Schmidt, A., and Marcus, A. (1989) Plant Cell 1, 945-952
  27. Faye, L., Johnston, K. D., Storm, A., and Chrispeels, M. J. (1989) Physiol. Plant 75, 309-314
  28. Miller, J. G., and Fry, S. C. (1992) Plant Cell Tissue Organ Cult. 31, 61-66

©1997 by The American Society for Biochemistry and Molecular Biology, Inc.