Revised Draft Guidelines for Proteomic Data Publication

July 14, 2005

Dear Colleague:

The attached draft is a proposed set of guidelines, drafted by a group of scientists, engineers and bioinformaticians at a recent workshop held in mid May at the Maison de la Chimie in Paris, France, for preparing, reviewing and publishing data, primarily generated from MS/MS experiments, that deals with the identification of proteins and peptides. The purpose of this endeavor was to formulate standards that would give practioneers and editors alike an appreciation of what a diverse group of stakeholders in this activity considered to be the necessary level of information to reasonably insure the integrity of assignments and thus preserve the accuracy of the scientific record. While the number of participants in the workshop was necessarily small, the number of interested individuals potentially affected is not, and it was the unanimous opinion of all present that this draft should be widely circulated for additional comments, criticism and suggestions. Accordingly, it is being posted, circulated and otherwise distributed as broadly as possible and any recipient should feel to pass it on to any and all parties with an interest in this subject. Comments should be sent to me at where they will be collected and returned to the original working group for consideration, discussion and assimilation. The period of comment will be approximately three months and will end October 15th. The final version, when released by the Committee, will then be circulated to Editors and Publishers of all journals with an interest in proteomics and protein identification for their consideration and, it may be hoped, ratification. It is anticipated that this process will be complete by the end of this calendar year.

By way of added information, Molecular and Cellular Proteomics will post these draft guidelines on its web site (www.mcponline.org) along with a list of the Paris participants and the power point presentations made at that meeting that were used as background material for the preparation of the draft. Other individuals and organizations may feel free to do so as well.

We hope that you will take some of your valuable time to look at this draft and to give us the benefit of your wisdom and experience in the form of comments and suggestions. I believe I speak for all those involved when I say that the value of these guidelines will be in improving and strengthening protein identification data and the substantial value that that information will have in future applications. Thus achieving a true community standard is a highly desirable goal.

Sincerely,

Ralph A. Bradshaw For the workshop participants

DRAFT

Proposed Publication Guidelines for the Analysis and Documentation of Peptide and Protein Identifications.

1. The following supporting information should be included with the manuscript:

2. Information for each protein sequence identified should specify the following:

3. Additional potentially valuable information could include the retention time of each peptide, the observation of multiple charge states, multiple observations of the same peptide, flanking residues, start and end positions of peptides in proteins, and any platform-specific information.

4. Manuscripts presenting studies based on quantitative proteomic data obtained through the use of stable isotope labeling methods or other means such as direct measurement of mass spectrometric signal intensity must contain the following information:

In order to be of general or specific interest to the scientific community, biological interpretations or hypothesis generated on the basis of quantitative protein changes, must be validated to an appropriate level in the same manuscript. This may take the form of checking for consistency with the scientific literature or by the presentation of further substantiating data at least on a subset of the presented findings.

Authors are also encouraged to refer to two recent reviews by D.F. Ransohoff relevant to the interpretation of proteomic data (Nat. Rev. Cancer 4, 309–314 (2004) and Nat. Rev. Cancer 5, 142–149 (2005).

Authors are also encouraged to make custom software available to the community.

5. Studies focusing on posttranslational modifications require specialized methodology and documentation to assign the presence and the site(s) of modification. Certain modifications are also nominally isobaric (e.g. acetylation vs. trimethylation, phosphorylation vs. sulfation). If one of these modifications is being reported, then evidence for assigning a specific modification over another needs to be presented. Examples for methods for distinguishing between these include mass spectrometric approaches such as accurate mass determination, observation of signature fragment ions (e.g. m/z 79 vs. m/z 80 in negative ion mode for assignment of phosphorylation over sulfation) or biological or chemical strategies.

In the tabular presentation of the data, authors are required to show 1) the sequence of the peptide used to make each such assignment, together with the amino acids N- and C-terminal to that peptide’s sequence, 2) the precursor mass and charge (not just m/z) observed, and 3) the search scores for each peptide. Frequently more than one possible site of modification exists within a peptide. Assignment of specific site(s) of modification requires observation of fragment ions that distinguish between the possible sites. When ambiguity with regard to the modification site cannot be resolved, then the ambiguity must be explicitly shown in the tables, e.g. ALEG(sss)YLLK where one of the three serine residues in parentheses is phosphorylated, but information in the spectrum does not permit assignment of a specific residue.

Copies of the annotated, mass labeled spectra for each modified peptide must be submitted electronically together with the manuscript for review purposes. Authors are encouraged to present all or representative spectra of posttranslationally modified peptides either in the body of the text or as supplemental material. In addition, authors are encouraged to provide the corresponding peak (m/z and intensity) lists for review.

6. While more reliable results for peptide identification are generally produced by MS/MS data, in selected circumstances, such as analysis of 2D gel spots, peptide mass fingerprinting can be an effective choice of technique for protein identification. For each identification, an annotated mass spectrum must be supplied. We also encourage the submission of the peak lists for review. In the tabular presentation of the results the authors must supply: 1) the number of matched peaks, 2) the number of unmatched peaks, and 3) the sequence coverage. In addition to the score for the top match they must also show the score for the highest ranked hit to a non-homologous protein. They must describe the parameters and thresholds used to analyze the data (see guideline 1, above), including mass accuracy, resolution, means of calibrating each spectrum, and exclusion of known contaminant ions (keratin, etc.). Authors are required to use and provide the results of scoring schemes that provide a measure of identification certainty, or perform some measure of the false-positive rate.

7. Identical peptide sequences can be included in multiple unique protein sequences due to biological variation such as single amino acid variants, alternative splice forms, homologs, orthologs and paralogs. Other reasons for apparent redundancy in protein sequence database entries are the inclusion of sequence fragments and sequences with errors. Apparent redundancy can also occur due to clerical errors arising from the merger of multiple sequence databases or identical protein sequences appearing under different names or accession numbers.

Experimental strategies based on proteolytic digestion of protein mixtures introduce the complication of loss of connectivity between peptides and their protein precursors. Assignment of peptide sequences results in two outcomes; distinct peptides that map to only one protein sequence or shared peptides that map to more than one protein sequence. Detection of shared peptides introduces a paradox between the possibility that a shared peptide can be mapped to more than one protein sequence (bioinformatics redundancy) versus the possibility that more than one precursor is in the original protein mixture (physical redundancy). The apparent ambiguity in peptide assignment requires reporting of a protein group. When assembling peptides into proteins and protein groups, authors should adhere to principles of parsimony, i.e. describe the minimum set of protein sequences that adequately accounts for all observed peptides. While the identification of shared peptides implies that multiple related protein sequences are present, the initial assumption should be that only a single form is being detected. Authors should explain and be able to justify cases where a single protein from a protein group has been singled out or that more than one member of a protein group is present. When reporting a summary list of peptides belonging to each protein group, peptides shared among multiple proteins and those unique to a protein should be clearly indicated. In addition, sometimes proteins are identified from a different species than the one being studied. For example, identification of a mouse or human protein in a hamster study. If such an orthologous protein is included, the circumstances should be mentioned and justified.

8. It is strongly encouraged (but not yet required) that all MS/MS spectra mentioned in the paper be submitted as supplemental material. Journals will vary in their ability to handle this information and authors are encouraged to provide access to raw MS data using other means, including group websites and public repositories, as they emerge, in addition to the journal itself.





This Article
Full Text (PDF)
Submit a response
Purchase Article
View Shopping Cart
Alert me when this article is cited
Alert me when eLetters are posted
Alert me if a correction is posted
Services
Email this article to a friend
Similar articles in this journal
Similar articles in PubMed
Alert me to new issues of the journal
Download to citation manager
Glossary
Copyright Permissions
Articles citing this Article
PubMed
PubMed Citation


HOME HELP FEEDBACK SUBSCRIPTIONS ARCHIVE SEARCH TABLE OF CONTENTS
 All ASBMB Journals   Journal of Biological Chemistry 
 Journal of Lipid Research   Biochemistry and Molecular Biology Education