Similarity between the C-terminal domain of the prion protein and chimpanzee cytomegalovirus glycoprotein UL9

Igor B. Kuznetsov1 and S. Rackovsky

Department of Biomathematical Sciences, Mount Sinai School of Medicine, Box 1023, One Gustave L. Levy Place, New York, NY 10029, USA

1 To whom correspondence should be addressed at: Department of Molecular Biosciences, 2034 Haworth Hall, 1200 Sunnyside Avenue, The University of Kansas, Lawrence, KS 66045, USA. e-mail: igor@ku.edu


    Abstract
Prion diseases are a group of fatal neurodegenerative disorders associated with structural conversion of a normal, mostly α-helical cellular prion protein, PrPC, into a pathogenic β-sheet-rich conformation, PrPSc. The structure of PrPC is well studied, whereas the insolubility of PrPSc makes the characterization of its structure problematic. No proteins similar to PrP, except for its paralog with the same fold, PrP-Doppel, are known. However, PrP-Doppel does not undergo a structural transition into a β-sheet-rich conformation. Structural information from proteins that share a weak but significant sequence similarity with PrP may be used to gain additional insights into the conformation of PrPSc. We construct a sequence profile corresponding to the structured domain of PrP and use this profile to search the SWISS-PROT and TrEMBL databases. We identify a significant sequence similarity between PrP and chimpanzee cytomegalovirus glycoprotein UL9. This glycoprotein scores higher than all PrP-Doppel sequences. Fold recognition methods assign a mainly-β fold to UL9. Owing to the observed sequence similarity with PrP and a putative mainly-β fold, the UL9 glycoprotein may represent a potential target for experimental structure determination aimed at obtaining a structural template for PrPSc modeling.

Keywords: alignment/conformational transition/Doppel/sequence profile


    Introduction
Prion diseases are a class of fatal neurodegenerative disorders in mammals (CJD, BSE, scrapie, etc.). These diseases may be inherited or may arise sporadically, and are believed to be caused by a unique pathogen that contains no nucleic acid, the prion protein. The prion protein is a rare example of a protein that can exist, under physiological conditions, in two different conformations: the normal cellular protein with unknown function, designated PrPC, and the infectious pathogenic form, designated PrPSc. According to the ‘prion only’ hypothesis, the pathogenesis involves the initial formation, caused by a point mutation or some exogenous factors, of PrPSc which subsequently interacts with PrPC and converts it. The conformational transition PrPC → PrPSc involves unfolding of α-helices and formation of β-sheets. This transition is not associated with any covalent modifications (Prusiner et al., 1998).

The cellular form of the prion protein is a GPI-anchored outer-membrane glycoprotein that undergoes rapid endocytosis (Lehmann et al., 1999). A number of NMR and X-ray studies aimed at determining the structure of PrPC have revealed that the C-terminal domain of the protein is structured, whereas the N-terminal domain, which contains Gly- and Pro-rich octarepeats, is highly flexible and cannot be assigned a particular conformation (Riek et al., 1998). Recently, a paralog of the prion protein, PrP-Doppel, was identified (Mo et al., 2001). This protein and the C-terminal domain of PrP share ~25% sequence identity and have very similar structures, which consist of three α-helices (A, B and C) and a short β-sheet. However, despite its structural similarity to PrP, Doppel does not undergo a structural transition into a β-sheet-rich conformation (Nicholson et al., 2002).

Little is known about the pathogenic conformation of the prion protein, PrPSc, except for its approximate secondary structure content, protease resistance and the insolubility of some forms (Prusiner et al., 1998). Owing to the insolubility of PrPSc, characterization of its structure by NMR or X-ray crystallography has been problematic. A number of attempts to model the structure of PrPSc using spectroscopic and electron crystallography data have been undertaken (Huang et al., 1995; Wille et al., 2003). Improvement of the quality of such knowledge-based models, and progress in determining the structure of PrPSc, can be achieved by using information derived from proteins that share a weak but significant sequence similarity with PrP. The structural properties of such proteins, especially if they adopt a mainly-β fold, may be used to gain insight into the conformation of PrPSc. Sequence profiles obtained from a multiple sequence alignment of related proteins represent one of the most sensitive methods used to detect structural similarity between proteins with a low degree of sequence identity (Gribskov and Veretnik, 1996). Our aim here is to use sequence profiles to identify proteins that share a significant sequence similarity with the structured C-terminal domain of the prion protein.


    Methods and results
We used sequence profile search software (Gribskov and Veretnik, 1996) from the GCG Package [Wisconsin Package Version 10; Accelrys (GCG), 9685 Scranton Road, San Diego, CA 92121, USA]. A multiple sequence alignment of PrP sequences corresponding to the structured C-terminal domain of the human PrP (residues 128–231) was used to construct the sequence profile. The disordered N-terminal part of PrP has very low sequence complexity and was excluded in order not to bias the results. We used 57 non-identical PrP sequences for which the complete sequence of the C-terminal domain is available for sequence profile construction (Table I). The multiple alignment was obtained by means of the GCG PILEUP program using the BLOSUM50 similarity matrix (Henikoff and Henikoff, 1992), a gap initiation penalty of –16 and a gap extension penalty of –4. The sequence profile was calculated using the GCG PROFILEMAKE program and the BLOSUM50 similarity matrix with the default logarithmic weighting. The profile was searched against the SP-TrEMBL database (release 22.0) (Bairoch and Apweiler, 2000) using the GCG PROFILESEARCH program with default local alignment options. The profile was able to identify all prion proteins as well as Doppel sequences among the highest scoring hits.
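The idea behind a sequence profile can be illustrated with a minimal Python sketch. It is not the GCG PROFILEMAKE algorithm: it assumes a toy identity-based substitution score (`toy_score`, hypothetical) in place of BLOSUM50, simple frequency averaging in place of the logarithmic weighting, and ungapped scoring in place of local alignment. The alignment rows and all function names are invented for illustration.

```python
from collections import Counter

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"


def toy_score(a: str, b: str) -> int:
    """Hypothetical substitution score: +2 for identity, -1 otherwise.
    (The paper used BLOSUM50; this stand-in keeps the sketch self-contained.)"""
    return 2 if a == b else -1


def build_profile(alignment):
    """For each alignment column, map every residue type to its average
    substitution score against the residues observed in that column.
    Gap characters ('-') are ignored when counting."""
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")
    profile = []
    for col in range(length):
        counts = Counter(row[col] for row in alignment if row[col] != "-")
        total = sum(counts.values())
        column_scores = {}
        for a in ALPHABET:
            if total == 0:
                column_scores[a] = 0.0  # all-gap column carries no information
            else:
                column_scores[a] = (
                    sum(n * toy_score(a, b) for b, n in counts.items()) / total
                )
        profile.append(column_scores)
    return profile


def score_ungapped(profile, segment):
    """Sum the position-specific scores of a segment laid against the
    profile without gaps (a simplification of profile alignment)."""
    return sum(column[res] for column, res in zip(profile, segment))


# Toy three-sequence alignment (invented, not PrP data).
toy_alignment = ["ACD", "ACE", "AC-"]
toy_profile = build_profile(toy_alignment)
toy_total = score_ungapped(toy_profile, "ACD")
```

A conserved column rewards the conserved residue strongly and penalizes everything else, whereas a variable column spreads its score over the residues it contains; this position dependence is what makes profiles more sensitive than a single query sequence.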


Table I. PrP profile alignment scores
 
The highest scoring (z-score = 11.76) non-PrP and non-Doppel sequence is the chimpanzee cytomegalovirus (CMV) glycoprotein UL9 (Table I). UL9 is the only non-PrP sequence that scores higher than all Doppel proteins. It should be noted that the prion protein and Doppel share the same fold and ~25% sequence identity. The next highest scoring non-PrP and non-Doppel sequence (TrEMBL accession No. Q9DFV7) has a z-score of 8.63, which is lower than the z-scores of all but one of the Doppel sequences. The alignment scores produced by the PROFILESEARCH program do not take into account compositional bias, which may result in an artificially high score for sequences with low complexity. Therefore, it is necessary to study the effect of the amino acid composition of UL9 on the z-score obtained from profile analysis. We generated 10^3 random sequences by shuffling the UL9 sequence and aligned these random sequences with the PrP profile using the GCG PROFILEGAP program. This procedure produced a distribution of 10^3 random scores with an average score of 22.266 ± 3.7688. The local alignment scores are known to follow the extreme value distribution (Pearson, 1998), which gives the probability of observing a score greater than or equal to x, P(S ≥ x):

$$P(S \ge x) = 1 - \exp\left[-e^{-(x - a)/b}\right] \qquad (1)$$

The random scores were fitted to the extreme value distribution using the STATISTICA software package (Statistica version 6.0; StatSoft, Inc., 2300 East 14 St., Tulsa, OK 74104, USA), giving a = 20.5557 and b = 2.971 (Equation 1). The profile alignment score for UL9 is 53.17, and the probability of observing a score of this magnitude or larger in the sequences with the same amino acid composition and length as those of UL9 obtained using Equation 1 is very low, P(S ≥ 53.17) = 1.7 × 10^-5. Therefore, we conclude that the observed sequence similarity between UL9 and prion protein is highly significant. The local alignment between UL9 and the prion protein profile comprises residues 67–172 of UL9 and is shown in Figure 1A. Pairwise local alignment of chicken PrP and UL9 shows that the best alignment comprises residues 80–130 of UL9, and helices A and B of PrP (Figure 1B). It should be noted that the loop connecting helices A and B is thought to participate in binding the hypothetical PrP ligand, protein X, which may be involved in conformational transition (Kaneko et al., 1997).
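The significance estimate can be checked with a short Python sketch. It assumes that the extreme value distribution takes the standard Gumbel form, P(S ≥ x) = 1 − exp(−exp(−(x − a)/b)), with location a and scale b; that form is an assumption here, but it reproduces the reported probability from the reported parameters. The function names are illustrative, and `shuffle_sequence` only shows how composition-preserving random sequences could be generated; it is not the GCG procedure.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def shuffle_sequence(seq: str, rng: random.Random) -> str:
    """Return a random permutation of seq, preserving its length and
    amino acid composition."""
    residues = list(seq)
    rng.shuffle(residues)
    return "".join(residues)


def evd_tail(x: float, a: float, b: float) -> float:
    """P(S >= x) = 1 - exp(-exp(-(x - a) / b)) for a Gumbel distribution.
    expm1 keeps precision when the probability is very small."""
    return -math.expm1(-math.exp(-(x - a) / b))


def evd_mean_sd(a: float, b: float):
    """Mean and standard deviation implied by Gumbel parameters a and b."""
    return a + EULER_GAMMA * b, math.pi * b / math.sqrt(6.0)


# Fitted parameters and UL9 score as reported in the text.
A_FIT, B_FIT = 20.5557, 2.971
UL9_SCORE = 53.17

p_value = evd_tail(UL9_SCORE, A_FIT, B_FIT)      # roughly 1.7e-5
implied_mean, implied_sd = evd_mean_sd(A_FIT, B_FIT)

# Example of one composition-preserving shuffle (toy sequence, not UL9).
shuffled = shuffle_sequence("MKVLAAGIVG", random.Random(0))
```

Under this assumed form, the fitted parameters imply a mean of about 22.27 and a standard deviation of about 3.81, close to the reported 22.266 ± 3.7688 for the shuffled-sequence scores, which suggests that the fit and the sample statistics are mutually consistent.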



Fig. 1. (A) The local alignment between chimpanzee CMV glycoprotein UL9 and PrP(128–231) sequence profile (P). For each profile position the most frequent amino acid is shown. (B) The best local alignment between chicken PrP and UL9. Alignment score = 77, alignment length = 61 residues, 28.0% identity. BLOSUM50 similarity matrix, gap initiation penalty of –12, gap extension penalty of –2. PrP helices and strands are underlined. Amino acid matches are indicated by solid lines, conservative replacements are indicated by dots.

 
The CMV is a member of the herpesvirus group. It has been proposed as the most prevalent infectious agent causing neurological dysfunction in the developing brain, and therefore has a high affinity for developing brain cells (van den Pol et al., 2002). Normal cellular prion protein is also expressed in brain cells, and accumulation of abnormal, insoluble PrPSc leads to neurological dysfunctions as well. Since PrP is located on the outer membrane and undergoes endocytosis, it has been hypothesized that PrP may function as a receptor protein participating in signal transduction (Prusiner et al., 1998). The observed sequence similarity between the prion protein and CMV glycoprotein UL9 is likely to reflect structural similarities. However, the structure and function of UL9 are not known. In the absence of an obvious homologous template, potentially matching folds can be identified using fold recognition techniques.

We used the mGenThreader fold recognition server (McGuffin and Jones, 2003), which has been shown to have the lowest rate of false positive predictions among all automated fold recognition servers (Bujnicki et al., 2001), to make predictions for UL9 protein. It should be noted that all highest scoring templates (E-value from 0.03 to 0.06) belong to mainly-β proteins involved in substrate binding: immunoglobulin antigen-binding domains (PDB i.d. 8fab, 12e8, 1a3l, 32c2, 1igt) and T-cell receptors (PDB i.d. 1tcr, 1hxm, 1bec). A different fold recognition method, SAM_T02 (Karplus et al., 2001), also assigns highest scoring hits for UL9 to immunoglobulin antigen-binding domains and T-cell receptors. The same two methods do not find any significant matches for the C-terminal domain of the prion protein, except for the match between PrP and Doppel. The evidence of a putative mainly-β fold of the UL9 protein and its sequence similarity with the prion protein, which undergoes a conformational transition into a mainly-β conformation, identify UL9 as a potential target for experimental structure determination aimed at obtaining a template for modeling the structure of PrPSc. Further progress in structural and functional annotation of UL9 may help understand the function of PrP and what type of substrate it binds.


    Acknowledgements
 
This work was supported by grant number 1R01 LM06789 from the National Library of Medicine of the National Institutes of Health. I.B.K. is supported by NSF EPSCoR.


    References
Bairoch,A. and Apweiler,R. (2000) Nucleic Acids Res., 28, 45–48.

Bujnicki,J.M., Elofsson,A., Fischer,D. and Rychlewski,L. (2001) Proteins, S5, 184–191.

Gribskov,M. and Veretnik,S. (1996) Methods Enzymol., 266, 198–212.

Henikoff,S. and Henikoff,J.G. (1992) Proc. Natl Acad. Sci. USA, 89, 10915–10919.

Huang,Z., Prusiner,S. and Cohen,F.E. (1995) Fold Des., 1, 13–19.

Kaneko,K., Zulianello,L., Scott,M., Cooper,C.M., Wallace,A.C., James,T.L., Cohen,F.E. and Prusiner,S.B. (1997) Proc. Natl Acad. Sci. USA, 94, 10069–10074.

Karplus,K., Karchin,R., Barrett,C., Tu,S., Cline,M., Diekhans,M., Grate,L., Casper,J. and Hughey,R. (2001) Proteins, S5, 86–91.

Lehmann,S., Milhavet,O. and Mange,A. (1999) Biomed. Pharmacother., 53, 39–46.

McGuffin,L.J. and Jones,D.T. (2003) Bioinformatics, 19, 874–881.

Mo,H., Moore,R.C., Cohen,F.E., Westaway,D., Prusiner,S.B., Wright,P.E. and Dyson,H.J. (2001) Proc. Natl Acad. Sci. USA, 98, 2352–2357.

Nicholson,E.M., Mo,H., Prusiner,S.B., Cohen,F.E. and Marqusee,S. (2002) J. Mol. Biol., 316, 807–815.

Pearson,W.R. (1998) J. Mol. Biol., 276, 71–84.

Prusiner,S.B., Scott,M.R., DeArmond,S.J. and Cohen,F.E. (1998) Cell, 93, 337–348.

Riek,R., Wider,G., Billeter,M., Hornemann,S., Glockshuber,R. and Wüthrich,K. (1998) Proc. Natl Acad. Sci. USA, 95, 11667–11672.

van den Pol,A.N., Reuter,J.D. and Santarelli,J.G. (2002) J. Virol., 76, 8842–8854.

Wille,H., Michelitsch,M.D., Guenebaut,V., Supattapone,S., Serban,A., Cohen,F.E., Agard,D.A. and Prusiner,S.B. (2003) Proc. Natl Acad. Sci. USA, 99, 3563–3568.

Received September 2, 2003; accepted September 12, 2003




