1Division of Biological Science, Graduate School of Science, Nagoya University, Furo-cho, Chikusa-ku, Nagoya 464-8603 and Departments of 2Structural Biology and 3Computational Biology, Biomolecular Engineering Research Institute, 623 Furuedai, Suita, Osaka 565-0874, Japan
4 To whom correspondence should be addressed. E-mail: shirai{at}beri.or.jp
Abstract |
---|
Keywords: evolutionary trace/PCNA/protein complex/protein interface/RFC
Introduction |
---|
The clamploader is a hetero-pentameric protein complex. The bacterial clamploader is a pentameric complex of one δ, one δ′ and three γ subunits and the bacterial clamp molecule is a homo-dimer of β-subunits. Bacteriophages have their own complex: the clamploader is composed of one gp62 subunit and four gp44 subunits and the clamp is a homo-trimer of gp45 subunits (Trakselis and Benkovic, 2001). The eukaryal complex consists of one each of the replication factor C (RFC)1–5 (also called RFC A–E or p140, p40, p36, p37 and p38) subunits (Uhlmann et al., 1996). The archaeal complex has one large (RFCL) and four small (RFCS) subunits (Edgell and Doolittle, 1997). The eukaryal and archaeal clamp molecules are homo-trimers of PCNA subunits. All of the clamploader subunits share AAA+ (or N-terminal and middle) and collar (or C-terminal) domains and are thought to be homologous to each other (Guenther et al., 1997; Neuwald et al., 1999).
The known architectures are conserved among all of the clamploaders from cellular organisms. The three-dimensional structures of bacterial (Escherichia coli) and eukaryal (yeast) clamploaders in complex with the cognate clamp molecule were solved by X-ray crystallography (Jeruzalmi et al., 2001; Bowman et al., 2004) and that of an archaeon (Pyrococcus furiosus) was investigated by the single-particle cryo-EM method (Miyata et al., 2004). The structure of the P.furiosus RFC small subunit (PfuRFCS) alone was also determined by X-ray crystallography (Oyama et al., 2001). The spatial arrangements of the five subunits and their interactions with the clamp molecule largely agree among the three complexes. The five subunits are arranged in a ring, in which the collar domains are related by a pseudo-five-fold rotation axis and the AAA+ domains are arranged in a spiral along the pseudo-rotation axis. The clamp molecule interacts with the spiral of the AAA+ domains (Jeruzalmi et al., 2001; Bowman et al., 2004).
The shared architecture and homology of the clamploaders imply a conserved molecular mechanism. On the other hand, the variations in the subunit compositions suggest functional differentiation among the subunits. The different complex versions for each life domain, based on a common architecture, provide an intriguing case to study the molecular evolution of protein machineries.
In this work, the evolutionary processes of archaeal and eukaryal clamploader complexes were investigated by using the evolutionary trace (ET) method. ET methods use protein phylogeny to identify functionally important residues. This is usually done by partitioning a molecular phylogenetic tree and searching for the residues that are specifically conserved within each partition of sequences; these are called class-specific residues (Lichtarge et al., 1996; Lichtarge and Sowa, 2002). The class-specific residues were shown to be important for the function or structure of proteins in several cases (Lichtarge et al., 1996; Shirai and Go, 1997; Landgraf et al., 1999; Innis et al., 2000; Sowa et al., 2001; Frenal et al., 2004; Shackelford et al., 2004; Zhu et al., 2004). In this study, an ET method that directly uses inferred ancestral sequences (Shirai et al., 1997) has been modified to identify the residues and patterns of replacement responsible for the quaternary structure evolution of the clamploaders and clamps.
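The idea of class-specific residues can be illustrated with a minimal sketch. It assumes a gap-free toy alignment and a user-supplied partition of the sequences into classes; the function name `class_specific_sites` and the sequences are hypothetical and are not taken from this study, which uses inferred ancestral sequences rather than this simple column test.

```python
from collections import defaultdict


def class_specific_sites(alignment, classes):
    """Return 0-based alignment columns that are invariant within every
    class but differ between at least two classes.

    alignment: dict mapping sequence name -> aligned sequence (equal lengths)
    classes:   dict mapping sequence name -> class label (the tree partition)
    """
    groups = defaultdict(list)
    for name, seq in alignment.items():
        groups[classes[name]].append(seq)

    length = len(next(iter(alignment.values())))
    sites = []
    for col in range(length):
        residue_by_class = {}
        for label, seqs in groups.items():
            column = {s[col] for s in seqs}
            # A class must show exactly one residue type, and no gap.
            if len(column) != 1 or "-" in column:
                break
            residue_by_class[label] = column.pop()
        else:
            # Conserved in every class; keep only if the classes disagree.
            if len(set(residue_by_class.values())) > 1:
                sites.append(col)
    return sites


# Hypothetical toy alignment with two classes, A and B.
alignment = {
    "seqA1": "MKVLA",
    "seqA2": "MKVIA",
    "seqB1": "MRVLG",
    "seqB2": "MRVFG",
}
classes = {"seqA1": "A", "seqA2": "A", "seqB1": "B", "seqB2": "B"}
print(class_specific_sites(alignment, classes))  # columns 1 and 4
```

Column 0 and column 2 are conserved across all sequences and therefore carry no class information, whereas column 3 varies within a class and is rejected.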
The results demonstrated that the amino acid replacements during the differentiation process preferentially accumulated at the subunit interfaces. Also, the simpler subunit composition of the archaeal clamploader appeared to be a degenerated version of the eukaryal complex, rather than a preserved ancestral type.
Materials and methods |
---|
The amino acid sequences of 60 clamploader subunits and 66 clamp subunits were obtained from the EMBL/GenBank/DDBJ (Release 57; Miyazaki et al., 2003) and SwissProt (Release 43; Bairoch et al., 2004) databases. The sequences were aligned by using the ClustalW program (Higgins et al., 1994) and the alignment was manually refined on the XCED alignment editor (Katoh et al., 2002). In this study, the residue numbering systems in yeast RFC1–5 and PCNA were based on RFC1_YEAST, RFC2_YEAST, RFC3_YEAST, RFC4_YEAST, RFC5_YEAST and PCNA_YEAST (SwissProt entry codes), respectively. The numbering systems for P.furiosus RFCL, RFCS and PCNA were based on RFCL_PYRFU, RFCS_PYRFU and PCNA_PYRFU, respectively.
A molecular phylogeny was computed by the maximum likelihood method in the PAML application (Yang, 1997) with the JTT score matrix (Jones et al., 1992).
Evolutionary trace (ET)
Ancestral sequences at each node of the phylogenies were inferred by using the PAML application. For each node of the phylogeny, the PAML output gives the most likely amino acid type at each site (the ancestral sequence) together with its likelihood. The ancestral sequences and the likelihood values were used to devise a score function that ranks the inferred amino acid replacements in terms of their significance for the differentiation process. The function was formulated as:
[Equation image not recovered: definition of the score Sijk]
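The first step that such a score depends on, listing the replacements along one branch from the ancestral sequences at its two ends, can be sketched as follows. This is only an illustration under stated assumptions: the function name `replacements_on_branch` and the example sequences are hypothetical, positions are counted along the alignment, and the sketch does not reproduce the score function itself.

```python
def replacements_on_branch(parent_seq, child_seq):
    """List the inferred replacements along one branch of a phylogeny.

    parent_seq, child_seq: aligned sequences at the ancestral and
    descendant nodes of the branch (equal lengths, '-' marks a gap).
    Returns (1-based alignment position, parent residue, child residue).
    """
    if len(parent_seq) != len(child_seq):
        raise ValueError("sequences must be aligned to the same length")
    replacements = []
    for pos, (p, c) in enumerate(zip(parent_seq, child_seq), start=1):
        # Gap positions are skipped; only residue-to-residue changes count.
        if p != c and p != "-" and c != "-":
            replacements.append((pos, p, c))
    return replacements


# Hypothetical ancestral and descendant sequences.
print(replacements_on_branch("MKRVL-A", "MKSVLGA"))
```

Each tuple produced here would then be ranked by a score such as Sijk, using the likelihoods reported for the two inferred residues.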
Homology modeling of the PfuRFC–PCNA complex
A homology model of the P.furiosus clamploader–clamp (RFC–PCNA) complex was constructed in order to locate the amino acid replacements detected by the ET on the 3D structure. The structures of the P.furiosus clamp (PfuPCNA) and the small subunit (PfuRFCS) were previously determined by X-ray crystallography (Matsumiya et al., 2001; Oyama et al., 2001). PfuRFCL was modeled from the crystal structure of yeast RFC1 by using the SwissModel application (Schwede et al., 2003). These subunits were assembled by referring to the yeast complex structure (Bowman et al., 2004). PfuPCNA was superposed on to PCNA of the yeast complex. The homology model of PfuRFCL was superposed on to yeast RFC1 and the four PfuRFCSs were superposed on to yeast RFC2–5, respectively. Domain orientations in the superposed subunits were manually corrected by fitting to the corresponding domains in the yeast complex. Atomic clashes in the assembled models were removed by repeated cycles of manual modeling, molecular dynamics simulation (1 ps, 300 K, in vacuo simulation by using an AMBER parm96 force field on the InsightII–Discover application; Accelrys) and energy minimization.
In the final model, 79, 99 and 100% (cumulatively) of the main chain dihedral angles were found within the favored, allowed and additionally allowed regions of the Ramachandran plot, respectively (plot not shown). The root mean square deviations from the ideal bond geometries were 0.12–0.13 Å for bond distances and 1.9–2.3° for bond angles in the final subunit models.
The amino acid replacements detected by ET were mapped on the crystal structure of the yeast complex and on the homology model of the P.furiosus complex, by referring to the sequence alignment. The residues in the models were categorized into interior, surface, interface or ligand sites. Interface sites had accessibility that was reduced by more than 20% in the complex relative to that in the isolated subunit. Ligand sites were in contact with a nucleotide or magnesium ion. Among the remaining residues, those with accessibilities of <0.5 were categorized as interior sites and the others were assigned to surface sites. An in-house implementation of the Lee and Richards method (Lee and Richards, 1971) was used to calculate the accessible surface area.
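The site categories described above can be written as a small decision rule. The sketch below is an illustration only, with several assumptions that the text does not settle: ligand contact is checked before the interface test, the interior cutoff of 0.5 is applied to the accessibility in the isolated subunit, and both accessibilities are given in the same units. The function name `classify_site` and its parameters are hypothetical.

```python
def classify_site(acc_isolated, acc_complex, contacts_ligand,
                  interface_drop=0.20, interior_cutoff=0.5):
    """Assign a residue to 'ligand', 'interface', 'interior' or 'surface'.

    acc_isolated:    accessibility of the residue in the isolated subunit
    acc_complex:     accessibility of the same residue in the complex
    contacts_ligand: True if the residue contacts a nucleotide or Mg ion
    """
    # Assumption: ligand contact takes precedence over the interface test.
    if contacts_ligand:
        return "ligand"
    # Interface: accessibility reduced by more than 20% on complex formation.
    if acc_isolated > 0:
        reduction = (acc_isolated - acc_complex) / acc_isolated
        if reduction > interface_drop:
            return "interface"
    # Assumption: the <0.5 cutoff refers to the isolated-subunit value.
    if acc_isolated < interior_cutoff:
        return "interior"
    return "surface"


# Hypothetical accessibility values.
print(classify_site(40.0, 10.0, False))  # buried on complex formation
print(classify_site(0.2, 0.2, False))    # buried in the subunit itself
print(classify_site(30.0, 29.0, False))  # exposed in both states
```

Keeping the two thresholds as parameters makes it easy to check how sensitive the interface/surface split is to the 20% criterion.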
The difference in the distributions of high-Sijk sites on the protein models between the subunit differentiation and non-differentiation (speciation) processes was tested with a paired-sample t-test (Campbell, 1974). The two-sided t-value was calculated as:
[Equation image not recovered: two-sided t-value of the paired-sample t-test]
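For orientation, the textbook paired-sample t statistic is the mean of the pairwise differences divided by its standard error, with n - 1 degrees of freedom. The sketch below implements that standard form; it is assumed, not confirmed, that the formula used in this study has the same form. The function name `paired_t` and the sample values are hypothetical.

```python
import math


def paired_t(xs, ys):
    """Return (t, degrees of freedom) for a paired-sample t-test.

    xs, ys: paired observations of equal length (at least two pairs).
    t = mean(d) / (sd(d) / sqrt(n)), where d are the pairwise differences
    and sd is the sample standard deviation (n - 1 in the denominator).
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length, n >= 2")
    diffs = [x - y for x, y in zip(xs, ys)]
    n = len(diffs)
    mean_d = sum(diffs) / n
    var_d = sum((d - mean_d) ** 2 for d in diffs) / (n - 1)
    if var_d == 0:
        raise ValueError("all differences are identical; t is undefined")
    return mean_d / math.sqrt(var_d / n), n - 1


# Hypothetical paired fractions of sites (e.g. two process types).
t, dof = paired_t([5, 7, 9, 6], [3, 4, 8, 5])
print(round(t, 3), dof)
```

The two-sided P-value is then read from the t distribution with the returned degrees of freedom.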
Results and discussion |
---|
One of the most interesting features of the clamploader is that the Bacteria, Archaea and Eukarya domains have employed different subunit compositions, in spite of the strongly conserved complex architecture (O'Donnell et al., 2001). To search for the origin of these subunit composition differences, a molecular phylogeny was constructed from the amino acid sequences (Figure 1; non-omitted version of phylogeny is available as Supplementary data at PEDS Online).
The constructed phylogeny suggested that the archaeal RFCS was not a direct descendant of the ancestral small subunit. If the archaeal clamploaders were a preserved ancestral type (model P1 in Figure 1), then branch 6 (RFCS-cluster root) should be directly attached to branch 2 (small-subunit cluster-root). This could happen only if branches 2, 4 and 5 were discarded at the same time. Although these branches display relatively low bootstrap probabilities, the probability of their simultaneous loss is lower than that of the other possible combinations that do not support a direct descent of RFCS.
The phylogeny suggested the following evolutionary history of the clamploader complex. Since the bacterial and bacteriophage subunits did not intervene in this phylogeny, they (models B and V in Figure 1) should have diverged independently from the archaeal and eukaryal subunits. The archaeal and eukaryal subunits first bifurcated into large and small subunits. Although a pentameric complex of one large and four small subunits was expected at this stage (model P1 in Figure 1), the archaeal complex, which adopted the same subunit composition (model A in Figure 1), was not a direct descendant of this complex.
RFC5 next diverged from the small subunit lineage. The anticipated complex at this stage was composed of three types of subunit (model P2 in Figure 1), which would resemble the bacterial complex (model B), if the position of RFC5 relative to RFC1 was conserved. Then, the small subunit lineage diverged into RFC2, -3 and -4, which established the current eukaryal complex (model E in Figure 1).
The archaeal RFCS subunit diverged from the RFC4 lineage after the eukaryal type complex was established. Since no apparent isoform of the small subunits was found in the P.furiosus genome, the genes encoding RFC2, -3 and -5 must have been abolished. This implies that RFCS had to assume the roles of RFC2–5 in this process. One reason for the preservation of the RFC4/RFCS lineage in Archaea might be that only RFC4 (p40 in human) has clamp-unloading activity by itself (Cai et al., 1997).
The PCNA (clamp) molecules of Eukarya and Archaea are homo-trimers. In the molecular phylogeny of PCNA, archaeal subunits (aPCNA in Figure 1) clustered against eukaryal ones (ePCNA in Figure 1; non-omitted version of phylogeny is available as Supplementary data at PEDS Online). Branch 15 in Figure 1, which separates the two domains, should correspond to the process of adaptation of the clamp molecules to each cognate clamploader complex.
Interfacial evolution in the subunit differentiation processes
The ancestral sequences at each node in the phylogeny were computed and were used to detect the replacements on each branch. In practice, both the number and the reliability of the replacements varied greatly among the branches. Also, certain neutral replacements complicated the analysis. Therefore, the score system (Sijk) was devised to rank the replacements by their significance. The Sijk score became higher when an anticipated replacement was more likely, brought a larger change in the amino acid properties and was more unique. The influence of an amino acid replacement on protein structure or function is expected to be larger when it occurs at a relatively conserved residue and makes a larger change in the chemical properties of the side chain. Therefore, Sijk is thought to contrast important replacements against nearly neutral ones. Figure 2 shows the profile of an average fraction of sites against the score over all branches in the phylogenies in Figure 1 (including omitted branches). The Sijk values were generally smaller for the branches that were more distantly related to the current proteins, because inferred ancestral sequences were less likely for more ancient proteins. Hence the Sijk values are directly comparable between the sites on the same branch, but not between the sites on different branches. Assuming an exponential distribution, the replacements with scores more than one standard deviation higher than the average (over the sites on the same branch) were selected for each branch to normalize the difference in amplitude of Sijk values between branches.
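The per-branch selection rule (score more than one standard deviation above the branch average) can be sketched as below. The function name `select_high_score_sites`, the branch labels and the score values are hypothetical, and the use of the population standard deviation is an assumption, since the text does not say which estimator was used.

```python
from statistics import mean, pstdev


def select_high_score_sites(scores_by_branch):
    """Select, for each branch, the sites whose score exceeds the branch
    mean by more than one standard deviation.

    scores_by_branch: dict mapping branch label -> {site: score}
    Returns a dict mapping branch label -> sorted list of selected sites.
    """
    selected = {}
    for branch, site_scores in scores_by_branch.items():
        values = list(site_scores.values())
        # Assumption: population standard deviation over the branch's sites.
        cutoff = mean(values) + pstdev(values)
        selected[branch] = sorted(
            site for site, score in site_scores.items() if score > cutoff
        )
    return selected


# Hypothetical scores: branch "old" has uniformly smaller values than "recent".
scores = {
    "recent": {1: 0.9, 2: 0.1, 3: 0.1, 4: 0.1, 5: 0.1},
    "old": {1: 0.09, 2: 0.01, 3: 0.01, 4: 0.01, 5: 0.01},
}
print(select_high_score_sites(scores))
```

Because the cutoff is computed separately for each branch, site 1 on the "old" branch is selected even though its raw score is lower than the unselected scores on the "recent" branch, which is the normalization effect described in the text.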
The differentiation of archaeal RFCS is the most intriguing process, because this subunit has been suggested to be a degenerated small subunit. In this process, RFCS had to regain the ability to occupy the positions of RFC2–5, by discarding the specific interactions obtained so far. Considering that combinations of any two of the current five subunits did not show activity (Uhlmann et al., 1996), interface rebuilding must have been necessary to make an active complex out of one large and four small subunits.
Branches 10 and 6 correspond to the archaeal RFCL and RFCS processes, respectively (Figure 1). The detected sites are shown on the homology model of the P.furiosus RFC–PCNA complex (Figure 3d). The homology model suggested that RSs:170K (Sijk = 0.35) and LQs:323L (Sijk = 0.23) of RFCS established remarkable interactions in this process, which are still conserved in the current complex (Figure 3e and f). The positively charged side chain of R170 (RSs:170K) hydrogen bonds with the carbonyl groups at the C-terminal turn of the helix of the neighboring cognate subunit. This interaction should be favored by an electrostatic interaction between the positive charge of R170 (RSs:170K) and the helix dipole (Figure 3e). L323 (LQs:323L) interacts hydrophobically with V291 of the neighboring subunit (Figure 3f). Modifications were also observed in the RFCL subunit (Figure 3d). The sites RLs:282R (Sijk = 0.23) and MAs:306M (Sijk = 0.45) were found between RFCL and RFCS (placed in the position of RFC4).
This differentiation process between the RFC systems should have been concurrent with the modification of PCNA (clamp) molecules (Figure 3g). Among the replacements on the branch separating archaeal and eukaryal PCNAs (branch 15 in Figure 1), FPp:125H (Sijk = 0.37) was found at the interface between PCNA and RFC. The aromatic side chain at this site interacts with the alkyl group of RFC5-K73 in the yeast complex (Figure 3h). CRp:81C (Sijk = 0.78) makes a hydrophobic interaction with IAp:147I (Sijk = 0.64) and LAp:151L (Sijk = 0.72) at the interface between two PCNA subunits (Figure 3i). Note that these replacements are given in the Archaea-to-Eukarya direction and the interactions are shown on the model of the yeast complex. Hence these hydrophobic interactions have been lost in establishing the archaeal complex.
ET as a tool for interface analysis
The above descriptions focused on the remarkable sites at the subunit interfaces. The ability of ET methods to detect protein interfaces has been demonstrated in several cases (Lichtarge and Sowa, 2002). Especially when an evolutionary process involved an alteration in the interaction partners of proteins, the interfaces between domains (Shirai and Go, 1997) or between receptor and ligand (Landgraf et al., 1999; Innis et al., 2000; Sowa et al., 2001; Frenal et al., 2004; Zhu et al., 2004) were detectable by ET methods.
The sites revealed by the method devised in this study appeared to be biased towards the interface or interior regions of the subunits. A total of 173 high-score sites (including those on differentiation processes among large or small subunits, which have not been described in this paper) were detected for the RFC and PCNA subunits; however, most of the sites were not visible on the surface of the complex (Figure 4a and b). This indicates the biased localization of the detected sites. Some of the residue patches at the subunit interfaces were observable when the complex was dissected (Figure 4c).
Acknowledgements |
---|
References |
---|
Bowman,G.D., O'Donnell,M. and Kuriyan,J. (2004) Nature, 429, 724–730.
Cai,J., Gibbs,E., Uhlmann,F., Phillips,B., Yao,N., O'Donnell,M. and Hurwitz,J. (1997) J. Biol. Chem., 272, 18974–18981.
Campbell,R.C. (1974) Statistics for Biologists, 2nd edn. Cambridge University Press, Cambridge.
Cann,I.K.O., Ishino,S., Yuasa,M., Daiyasu,H., Toh,H. and Ishino,Y. (2001) J. Bacteriol., 183, 2614–2623.
Edgell,D.R. and Doolittle,W.F. (1997) Cell, 89, 995–998.
Frenal,K., Xu,C.-Q., Wolff,N., Wecker,K., Gurrola,G.B., Zhu,S.-Y., Chi,C.-W., Possani,L.D., Tytgat,J. and Delepierre,M. (2004) Proteins, 56, 367–375.
Guenther,B., Onrust,R., Sali,A., O'Donnell,M. and Kuriyan,J. (1997) Cell, 94, 335–345.
Higgins,D., Thompson,J., Gibson,T., Thompson,J.D., Higgins,D.G. and Gibson,T.J. (1994) Nucleic Acids Res., 22, 4673–4680.
Innis,C.A., Shi,J. and Blundell,T.L. (2000) Protein Eng., 13, 839–847.
Jeruzalmi,D., Yurieva,O., Zhao,Y., Young,M., Stewart,J., Hingorani,M., O'Donnell,M. and Kuriyan,J. (2001) Cell, 106, 417–428.
Jeruzalmi,D., O'Donnell,M. and Kuriyan,J. (2002) Curr. Opin. Struct. Biol., 12, 217–224.
Jones,D.T., Taylor,W.R. and Thornton,J.M. (1992) CABIOS, 8, 275–282.
Katoh,K., Misawa,K., Kuma,K. and Miyata,T. (2002) Nucleic Acids Res., 30, 3059–3066.
Landgraf,R., Fisher,D. and Eisenberg,D. (1999) Protein Eng., 12, 943–951.
Lee,B. and Richards,F.M. (1971) J. Mol. Biol., 55, 379–400.
Lichtarge,O. and Sowa,M.E. (2002) Curr. Opin. Struct. Biol., 12, 21–27.
Lichtarge,O., Bourne,H.R. and Cohen,F.E. (1996) J. Mol. Biol., 257, 342–358.
Matsumiya,S., Ishino,Y. and Morikawa,K. (2001) Protein Sci., 10, 17–23.
Miyata,T., Oyama,T., Mayanagi,K., Ishino,S., Ishino,Y. and Morikawa,K. (2004) Nat. Struct. Mol. Biol., 11, 632–636.
Miyazaki,S., Sugawara,H., Gojobori,T. and Tateno,Y. (2003) Nucleic Acids Res., 30, 13–16.
Neuwald,A.F., Aravind,L., Spouge,J.L. and Koonin,E.V. (1999) Genome Res., 9, 27–43.
O'Donnell,M., Jeruzalmi,D. and Kuriyan,J. (2001) Curr. Biol., 11, R935–R946.
Oyama,T., Ishino,Y., Cann,I.K.O., Ishino,S. and Morikawa,K. (2001) Mol. Cell, 8, 455–463.
Schwede,T., Kopp,J., Guex,N. and Peitsch,M.C. (2003) Nucleic Acids Res., 31, 3381–3385.
Shackelford,G.S., Regni,C.A. and Beamer,L.J. (2004) Protein Sci., 13, 2130–2138.
Shirai,T. and Go,M. (1997) J. Mol. Evol., 44, S155–S162.
Shirai,T., Suzuki,A., Yamane,T., Ashida,T., Kobayashi,T., Hitomi,J. and Ito,S. (1997) Protein Eng., 10, 627–634.
Sowa,M.E., He,W., Slep,K.C., Kercher,M.A., Lichtarge,O. and Wensel,T.G. (2001) Nat. Struct. Biol., 8, 234–237.
Stillman,B. (1994) Cell, 78, 725–728.
Trakselis,M.A. and Benkovic,S. (2001) Structure, 9, 999–1004.
Tsurimoto,T. and Stillman,B. (1990) Proc. Natl Acad. Sci. USA, 87, 1023–1027.
Uhlmann,F., Cai,J., Flores-Rozas,H., Dean,F.B., Finkelstein,J., O'Donnell,M. and Hurwitz,J. (1996) Proc. Natl Acad. Sci. USA, 93, 6521–6526.
Yang,Z. (1997) CABIOS, 15, 555–556.
Zhu,S., Huys,I., Dyason,K., Verdonck,F. and Tytgat,J. (2004) Proteins, 54, 361–370.
Received November 29, 2004; revised March 5, 2005; accepted March 5, 2005.
Edited by Haruki Nakamura