Physiological Genomics, along with the other APS journals, now requires authors of manuscripts containing microarray data to submit complete information about their data sets and the data gathering process, in keeping with the "minimum information about a microarray experiment" (MIAME) standards instituted by the Microarray Gene Expression Data society (MGED) working group (http://www.mged.org/Workgroups/MIAME/miame.html). This editorial outlines the history of this initiative, details the utility of publication of such information, and provides authors with links and descriptions to assist them with manuscript preparation.
History.
In 2001, members of the MGED (http://www.mged.org) recommended formalized standards for the publication of microarray data (1). They noted that an important principle in peer-reviewed biomedical literature is the requirement for provision of materials associated with an experiment in order to encourage assessments of reproducibility. Late last year, a working group of the MGED contacted editors of several genomics journals, including Physiological Genomics, to encourage them to adopt their recommended guidelines in the instructions to authors of each journal. In response, Nature announced that, effective December 1, 2002, authors of manuscripts containing new microarray data must submit complete supplemental information to the editor and at an online data repository (2).
Rationale.
Within a short time span, microarrays have become an important, commonly used tool in molecular genetics and physiology research. However, there is not yet widespread standardization. Variations between probe arrays, array reader equipment, software, annotation, and laboratories introduce noise into experimental results and make it more difficult for authors to reach similar conclusions independently. For microarray analysis of gene expression to have any long-term impact, it is crucial that the issue of reproducibility be adequately addressed.
One way to do that is to require authors to provide readers with sufficient information either to build upon or reproduce published research. In addition, since microarray analytic standards are certain to change, it is crucial that authors identify the nature of the experimental conditions prevalent at the time of their research. Genomics is a rapidly evolving field, and microarray technology is continually improving to yield more accurate results. Nonetheless, if today's research is to be relevant tomorrow, then the core elements that are impervious to obsolescence must be made clear.
Physiological Genomics has adopted the MIAME standards to ensure that what is cutting-edge today is not out-of-date 5 years hence. We and other genomics journals have previously published papers describing expression analysis that do not provide complete microarray information. These papers have a shorter citation half-life than ones that make supplemental data readily available to the public. For Physiological Genomics to build upon its reputation and enhance the value of microarray studies, then, taking the step of requiring more from our authors will help both them and us in the long run.
Guidelines.
Guidelines are provided in the Information for Authors at the American Physiological Society's Publications web page (http://www.the-aps.org/publications/i4a/prep_manuscript.htm#miame_standard). We have attempted to make this a streamlined process so that authors can readily upload data files that they will have already generated in the process of carrying out their research. We have included a link to the MGED society's MIAME web pages, where this information has already been summarized in usable form. We inform authors that supplemental data should be in the form of tab-delimited tables or Excel spreadsheets. On our site, the summarized published guidelines to the format indicate that the first table or spreadsheet "could contain the raw output of the image analysis software (spot quantitation matrix), the second could contain the processed data following normalization and transformation (gene expression data matrix), and if one is produced, the final table could contain summary data that was ultimately used in the analysis, such as the subset of differentially expressed genes identified or gene clusters."
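As an illustration only, the three-table layout quoted above could be assembled along the following lines. This is a minimal sketch, not a procedure prescribed by the guidelines: the function names (`write_tsv`, `build_tables`), the input structure, the choice of log2 transformation with per-array median centering, and the cutoff used to select the summary subset are all assumptions made for the example. Authors should substitute the normalization and selection methods they actually used.

```python
import csv
import math
import statistics


def write_tsv(handle, header, rows):
    """Write a header and rows to an open text handle as a tab-delimited table."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)


def build_tables(raw, threshold=1.0):
    """Derive the processed and summary tables from a raw spot matrix.

    raw: dict mapping a gene identifier to a list of positive spot
         intensities, one per array (hypothetical input structure).
    threshold: hypothetical cutoff on the absolute normalized log2 value
         used to pick the summary subset.
    """
    genes = sorted(raw)
    n_arrays = len(raw[genes[0]])

    # Transformation (assumed for this sketch): log2 of each intensity.
    logged = {g: [math.log2(v) for v in raw[g]] for g in genes}

    # Normalization (assumed for this sketch): subtract each array's median.
    medians = [
        statistics.median(logged[g][i] for g in genes) for i in range(n_arrays)
    ]
    processed = {
        g: [logged[g][i] - medians[i] for i in range(n_arrays)] for g in genes
    }

    # Summary table: genes whose normalized value reaches the cutoff
    # on at least one array.
    summary = {
        g: vals
        for g, vals in processed.items()
        if max(abs(v) for v in vals) >= threshold
    }
    return processed, summary


def as_rows(table):
    """Flatten a gene -> values mapping into rows for write_tsv."""
    return [[gene] + list(values) for gene, values in sorted(table.items())]
```

A caller would open three text files (for example with `open(path, "w", newline="")`), and pass `as_rows(raw)`, `as_rows(processed)`, and `as_rows(summary)` to `write_tsv` in turn, giving one tab-delimited table each for the spot quantitation matrix, the gene expression data matrix, and the summary data.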
We request that authors provide the supplemental data at the time of manuscript submission, just as they would submit new nucleic acid sequences to GenBank. In addition, we require that data be deposited at the Gene Expression Omnibus (GEO) web site of the National Center for Biotechnology Information (NCBI) (http://www.ncbi.nlm.nih.gov/geo/). This public data repository has the same high standards, ease of user interface, and attention to detail as GenBank, is readily searchable, and will be long-lived. While authors are free to maintain their data sets on their personal or research institution web pages as well, these pages are likely to come and go in the course of an academic career. Last, authors should provide the URL for their GEO-deposited data in the "Materials and Methods" section of their manuscript, so that referees may access it during the peer review process.
We would like to thank authors of future submissions for complying with the MIAME standards because, in addition to augmenting the microarray database, they will enhance the value of their publications and of Physiological Genomics overall. We encourage researchers to deposit data from previously published papers to ensure the scientific longevity of their efforts and to otherwise support this initiative of the functional genomics research community.
FOOTNOTES
Article published online before print. See web site for date of publication (http://physiolgenomics.physiology.org).
Address for reprint requests and other correspondence: S. B. Glueck, Thorn 1324C, 20 Shattuck St., Brigham and Women's Hospital, Boston, MA 02115 (E-mail: sglueck{at}rics.bwh.harvard.edu).
On behalf of the Senior and Associate Editors
REFERENCES