Misuse of standard error of the mean (SEM) when reporting variability of a sample. A critical evaluation of four anaesthesia journals

P. Nagele

Department of Anesthesiology and General Intensive Care, University of Vienna, Austria and Department of Anesthesiology, Washington University School of Medicine, St Louis, MO, USA

*Address for correspondence: Department of Anesthesiology and General Intensive Care, University of Vienna, Währinger Gürtel 18–20, A-1090 Vienna, Austria. E-mail: peter.nagele@univie.ac.at

Accepted for publication: December 3, 2002


    Abstract
 
Background. In biomedical research papers, authors often use descriptive statistics to describe the study sample. The standard deviation (SD) describes the variability between individuals in a sample; the standard error of the mean (SEM) describes the uncertainty in the sample mean as an estimate of the population mean. Authors often, inappropriately, report the SEM when describing the sample. As the SEM is always less than the SD for any sample of more than one observation, it misleads the reader into underestimating the variability between individuals within the study sample.

Methods. The aim of this study was to evaluate the frequency of inappropriate use of the SEM in four leading anaesthesia journals in 2001. The journals were searched manually for descriptive statistics reporting either the mean (SD) or the mean (SEM), and inappropriate use of the SEM was noted.

Results. In 2001, all four anaesthesia journals published articles that used the SEM incorrectly: Anesthesia & Analgesia 27.7%, British Journal of Anaesthesia 22.6%, Anesthesiology 18.7% and European Journal of Anaesthesiology 11.5%. Laboratory reports and clinical studies were equally affected, except in Anesthesiology, where 90% of the affected articles were basic science reports.

Conclusions. Nearly one in four articles (198 of 860, 23%) published in four anaesthesia journals in 2001 inappropriately used the SEM in descriptive statistics to describe the variability of the study sample. Anaesthesia journals are encouraged to provide clearer statistical guidelines on how to report data variability in descriptive statistics.

Br J Anaesth 2003; 90: 514–16

Keywords: statistics


    Introduction
 
When reporting data in biomedical research papers, authors often use descriptive statistical methods to describe their study sample. Descriptive statistics aim to describe a given study sample without regard to the entire population; inferential statistics generalize about a population on the basis of data from a sample of this population.

If normally distributed, the study sample can be described entirely by two parameters: the mean and the standard deviation (SD). The SD represents the variability within the sample; the larger the SD, the higher the variability within the sample [1]. Although it is clear that samples should always be summarized by the mean and SD [2–5], authors often use the standard error of the mean (SEM) to describe the variability of their sample. The SEM is used in inferential statistics to give an estimate of how the mean of the sample is related to the mean of the underlying population. As the SEM is always smaller than the SD for any sample of more than one observation, the unsuspecting reader may think that the variability within the sample is much smaller than it really is. Although the SD and the SEM are related (SEM = SD/√n, where n is the sample size), they give two very different types of information [6]. Whereas the SD estimates the variability in the study sample, the SEM estimates the precision and uncertainty of how the study sample represents the underlying population [1, 7]. In other words, the SD tells us the distribution of individual data points around the mean, and the SEM informs us how precise our estimate of the mean is [3]. It is therefore inappropriate and incorrect to present data only as the mean (SEM).
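The relation between the two quantities can be illustrated with a minimal Python sketch. The heart-rate values and the helper name `sd_and_sem` are hypothetical and were chosen only for illustration; they are not data from this evaluation. The sketch suggests that the SD stays roughly the same when more observations with the same spread are collected, whereas the SEM shrinks with the square root of the sample size.

```python
import math
import statistics


def sd_and_sem(values):
    """Return (sample SD, SEM) for a list of at least two numbers."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    sd = statistics.stdev(values)  # sample SD, with n - 1 in the denominator
    sem = sd / math.sqrt(n)        # SEM = SD / sqrt(n)
    return sd, sem


# Hypothetical heart rates (beats/min), for illustration only
small = [62, 71, 68, 80, 75, 66, 90, 58]
large = small * 8  # same spread of values, eight times as many observations

sd_small, sem_small = sd_and_sem(small)
sd_large, sem_large = sd_and_sem(large)

print(f"n={len(small):3d}  SD={sd_small:.2f}  SEM={sem_small:.2f}")
print(f"n={len(large):3d}  SD={sd_large:.2f}  SEM={sem_large:.2f}")
```

In this sketch the SD of the larger sample differs only slightly from that of the smaller one, while the SEM falls to about one-third. Reporting the mean (SEM) for the larger sample would therefore suggest far less variability between individuals than is actually present.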

This evaluation was designed to identify the frequency of this statistical error in articles published in 2001 in four leading anaesthesia journals: two from the USA (Anesthesiology and Anesthesia & Analgesia), and two from Europe (British Journal of Anaesthesia and European Journal of Anaesthesiology).


    Methods and results
 
All articles published in Anesthesiology, Anesthesia & Analgesia, British Journal of Anaesthesia or European Journal of Anaesthesiology in 2001 were searched manually for descriptive statistics reporting either mean (SD) or mean (SEM). Inappropriate use of the SEM in text, figures and tables was noted when the SEM was used to describe the variability of the study sample (instead of SD). Excluded from this analysis were articles using median and range, and articles that solely used inferential statistics such as confidence intervals (CI). Case reports and review articles were not considered, except for case series comprising several cases and using descriptive statistics to describe the study sample.

A total of 257 articles fulfilled the search criteria in Anesthesiology, 405 articles in Anesthesia & Analgesia, 137 in the British Journal of Anaesthesia, and 61 in the European Journal of Anaesthesiology. Detailed results are given in Table 1, where the four journals are listed in order of decreasing percentage misuse of SEM. Eight articles each in Anesthesiology and Anesthesia & Analgesia even failed to state which parameter was used. It must be noted that in some of the articles that incorrectly used the SEM, both parameters, SEM and SD, were used. In these articles, the SD was mostly found in the text and the SEM in the figures.
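The pooled figure of 198 of 860 articles given in the abstract can be cross-checked against the per-journal totals above and the per-journal percentages reported in the abstract. The following Python sketch does this. The per-journal counts it derives are inferred by rounding and are an assumption, not values reported in the text; the helper name `implied_count` is hypothetical.

```python
# Articles meeting the search criteria per journal (from the text) and the
# percentage misusing the SEM (from the abstract).
journals = {
    "Anesthesia & Analgesia": (405, 27.7),
    "British Journal of Anaesthesia": (137, 22.6),
    "Anesthesiology": (257, 18.7),
    "European Journal of Anaesthesiology": (61, 11.5),
}


def implied_count(total, percent):
    """Article count implied by a rounded percentage (an inference, not reported data)."""
    return round(total * percent / 100)


counts = {name: implied_count(total, pct) for name, (total, pct) in journals.items()}
total_articles = sum(total for total, _ in journals.values())
total_misuse = sum(counts.values())
pooled_percent = 100 * total_misuse / total_articles

for name, count in counts.items():
    print(f"{name}: {count} of {journals[name][0]}")
print(f"Pooled: {total_misuse} of {total_articles} ({pooled_percent:.0f}%)")
```

Under this rounding assumption the implied counts sum to 198 of 860 articles, which agrees with the pooled 23% stated in the abstract.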


Table 1 Frequency of use of standard error of the mean (SEM) and standard deviation (SD) in four anaesthesia journals, listed in order of decreasing percentage misuse. Data are numbers of articles (%). *Some of these articles used both the SD and the SEM to describe the study sample
 

    Discussion
 
This evaluation of four leading anaesthesia journals shows clearly that a substantial proportion of published articles (23%) (mis-)use the SEM in descriptive statistics, where it may be misinterpreted as showing the variability within the study sample. This use is not only statistically inappropriate, it also leads the reader to assume a much smaller variability of the sample. In general, the use of the SEM should be limited to inferential statistics where the author explicitly wants to inform the reader about the precision of the study, and how well the sample truly represents the entire population. Thus, in inferential statistics, the use of SEM is valid but the CI is more valuable. In graphs and figures, use of SD is preferable to the SEM but the SEM can be used to improve the interpretation of the figure if the number of individuals/experiments and the CI are clearly stated.
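How a CI follows from the mean and the SEM can be sketched in Python as follows. The sample values and the function name `describe` are hypothetical. The interval uses the normal approximation (a multiplier of about 1.96 for 95%), which is an assumption made here for simplicity; for small samples a quantile from the t distribution would give a somewhat wider interval.

```python
import math
import statistics


def describe(values, level=0.95):
    """Summarize a sample: n, mean, SD, SEM and an approximate CI for the mean."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # describes variability between individuals
    sem = sd / math.sqrt(n)        # describes precision of the estimated mean
    # Normal-approximation multiplier (assumption; a t quantile suits small n)
    z = statistics.NormalDist().inv_cdf(0.5 + level / 2)
    return {
        "n": n,
        "mean": mean,
        "sd": sd,
        "sem": sem,
        "ci": (mean - z * sem, mean + z * sem),
    }


# Hypothetical measurements, for illustration only
sample = [4.1, 5.3, 3.8, 6.0, 4.7, 5.5, 4.9, 5.1, 4.4, 5.8]
result = describe(sample)
lower, upper = result["ci"]

print(f"n = {result['n']}, mean (SD) = {result['mean']:.2f} ({result['sd']:.2f})")
print(f"approximate 95% CI for the mean: {lower:.2f} to {upper:.2f}")
```

Reporting the sample as mean (SD) together with n, and the estimate of the population mean as a CI, keeps the descriptive and the inferential statements separate.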

In conclusion, in spite of clear recommendations, the SEM is still widely and inappropriately used in the anaesthesia literature. Anaesthesia journals could easily avoid this statistical error by requiring authors to adhere to statistical recommendations, for instance through a more stringent statistical review process. The goal should be to have one standard method to describe the distribution of a study sample, thereby reducing confusion among the readers of biomedical research papers.


    Acknowledgements
 
The author wishes to thank C. Michael Crowder MD PhD for his helpful comments on the manuscript, and Dennis M. Fisher MD and Doug Altman DSc for valuable comments on biomedical statistics.

The author was supported by the Fonds zur Förderung der Wissenschaftlichen Forschung (FWF)—Austrian Science Fund—as the recipient of an Erwin-Schrödinger research fellowship.


    References
 
1 Glantz S. Primer of Biostatistics, 4th Edn. New York: McGraw-Hill, 1997

2 Fisher DM. Research design and statistics in anesthesia. In: Miller RD, ed. Anesthesia, 5th Edn, Vol. 1. Philadelphia: Churchill Livingstone, 2000; 753–92

3 Streiner DL. Maintaining standards: differences between the standard deviation and standard error, and when to use each. Can J Psychiatry 1996; 41: 498–502

4 Altman DG, Gore SM, Gardner MJ, et al. Statistical guidelines for contributors to medical journals. Br Med J 1983; 286: 1489–93

5 Lang TA, Secic M. How to report statistics in medicine: annotated guidelines for authors, editors, and reviewers. Philadelphia: American College of Physicians, 1997

6 Carlin JB, Doyle LW. Basic concepts of statistical reasoning: standard errors and confidence intervals. J Paediatr Child Health 2000; 36: 502–5

7 Webster CS, Merry AF. The standard deviation and the standard error of the mean. Anaesthesia 1997; 52: 183