Columbia University School of Public Health, 600 W168th Street, New York, NY 10032, USA. Email: mws2{at}columbia.edu
Accepted 27 February 2001
Cohorts are closed populations defined and bounded by their timepoints of entry, often but not necessarily at birth. In this issue, a classic article by Kermack, McKendrick and McKinley reprinted from the Lancet of 1934 illustrates an early use of cohort analysis. Such analyses follow successive generations of entrants through the life course. The object is to link the pattern of specified outcomes to the particular previous experience defined by membership of a generation. The outcome of interest to Kermack et al. was overall mortality at successive ages; their significant discovery was to demonstrate the potentially large and continuing contribution of experience in the earliest years to rates of death throughout the lifetime of each generation.
This newly invented approach cast bright light on the longitudinal perspective of health and disease through the life course. It was a latecomer in the history of relevant methods. Halley's life table of 1693 is credited as the first valid numerical approach. A major figure in astronomy and mathematics, he used it to challenge the claim of astrologers that certain years in a man's life were predictably hazardous.1 William Farr, a founder of quantitative epidemiology, made good use of the life table method2 and other approaches to longitudinal analysis. During 40 years from 1839 as Compiler of Medical Statistics for England and Wales, Farr invented the cohort life table, a method widely used in our times to follow the institutional careers of mental patients. He also devised a cohort study, i.e. a longitudinal follow-up of outcomes in a closed population defined by a specific point of entry or exposure (not that he named them as such). In the latter instance, he constructed a retrospective analysis of records of admissions and deaths in a mental institution. With that, he demonstrated positive benefit from John Connolly's moral treatment, a therapeutic revolution of its time in psychiatry. In the institution Connolly directed, mortality was decidedly lower than in others.
The advent of cohort analysis, however, had to wait until the later 1920s in the actuarial world (for two papers presented to the Faculty of Actuaries by Derrick and by Davidson and Reid3), and until 1930 in epidemiology (if one can be allowed to skip a scarcely instructive 1927 paper by Korteweg). In the latter year, Andvord published a paper in which he examined the outcome of tuberculosis in successive generations.4 The next significant epidemiological contribution to cohort analysis of which I am aware was the 1934 paper under review. Before addressing the paper, one should note subsequent early contributions by VH Springett,5 RAM Case,6 and MacMahon and Terry,7 each of which advanced epidemiological understanding of the method. It may also be useful first to define the three elements involved in such analysis.
Cohort or generation effects are ultimately environmental effects unique to successive cohorts, even if they should happen to involve selective hereditary pressures on a population. They are attributable to the singular experience of each cohort antecedent to the observation of the outcome under investigation. In the subsequent life course of an assembly of cohorts, the imprint of such experience may be detectable from regularities or irregularities in the outcome of a set of cohorts. Since interaction of age with experience is present virtually always, among generations the same age groups differ from all others by the imprint of their own unique slice of time. Thus the experience of every entire cohort is unique to itself.
Period effects are environmental effects common to a given time period. The extant population, all those living at that time, is at risk. Generations exposed to the same time period overlap in experience, but at different ages. Each period impinges on the survivors of many generations in their several age groups, so that again the collective period-defined experience of this assembly is unique.
Age effects might best be defined here as intrinsic effects dependent on the differentiation and maturation of the individual organism and independent of period and generational experience. Yet age sets the limits of both period and generational experience and, in interaction with period and generation, age moderates all outcomes.
In summary, for individuals or populations, any time period defined by calendar dates reflects three distinct dimensions: it specifies prevalent environment; it enfolds the unique historical experience of generations up to that time; and, in defining age, it is a measure of the duration of both experience and intrinsic maturation from conception to the given date. The relations among these three types of effect, however, entail a problem at the core of every analysis involving any two. All three effects, although distinct, operate simultaneously. To isolate the contribution of any one of the three they must be disengaged from each other.
Definitions of outcomes within age group, period, or generation inevitably overlap. Age is defined and measured by the interval between two dates, dates of birth and of observation. Cohorts are defined by the dates bounding recruitment (usually years of birth), and periods are defined by dates bounding observations of outcome. Of the three separate effects, only two can be examined simultaneously because one of the variables that defines any one type of these effects or outcomes is structurally linked to the definition of one of the other two effects. This circumstance, permitting the simultaneous inclusion of only two of the three variables (or effects), always leaves an uninvolved third variable uncontrolled. Any one analysis, whether the graphic approach most often used in epidemiology or a statistical one, is thus obliged to make untested assumptions about the third unspecified variable in any analysis of age, period or cohort.
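The structural link described above can be made concrete with a small sketch. It assumes, purely for illustration, that year of birth equals year of observation minus age and that each effect acts as a simple linear slope; the function names and slope values are hypothetical and are not taken from any of the papers discussed. Under those assumptions, two quite different sets of slopes predict exactly the same outcome in every age-period cell, so the data alone cannot choose between them.

```python
# Illustrative sketch only: linear slopes and numbers are assumptions.

def birth_cohort(period, age):
    """Year of birth is fixed once year of observation and age are known."""
    return period - age


def linear_predictor(age, period, slopes):
    """Outcome predicted from hypothetical linear age, period and cohort slopes."""
    age_slope, period_slope, cohort_slope = slopes
    return (age_slope * age
            + period_slope * period
            + cohort_slope * birth_cohort(period, age))


ages = range(0, 90, 10)            # 10-year age groups
periods = range(1850, 1930, 10)    # decennial periods

# First reading: age and period effects only, no cohort effect.
slopes_1 = (0.5, -0.25, 0.0)

# Second reading: shift an arbitrary amount k between the three slopes.
k = 0.25
slopes_2 = (slopes_1[0] + k, slopes_1[1] - k, slopes_1[2] + k)

# Largest disagreement between the two readings over every table cell.
max_gap = max(
    abs(linear_predictor(a, p, slopes_1) - linear_predictor(a, p, slopes_2))
    for a in ages
    for p in periods
)
print(max_gap)
```

The second reading attributes part of the trend to generations and the first attributes none, yet both fit every cell identically. Choosing between them therefore rests on an assumption made outside the table.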
It follows that wherever an outcome measure is seen to change through time in both its frequency and its age distribution, one will confront a disparate view of changes in outcome in both cohort and period analysis. These unavoidable analytical disparities require, for their resolution, some kind of validation outside the data on view.
Such validation enjoins judgement resting on external evidence. First call is on the patterns exhibited in the analysis of the three effects; one or other will emerge as more coherent with biological, epidemiological and social knowledge, and in particular with understanding about disease, about ageing and about the factors that impinge on the measured outcomes. Most cogently, however, with the passing of time the predictions from one or other pattern will ultimately conform better with the direction of future change.
Readied by the preamble, we turn to consider both the paper under discussion and what followed in its wake for epidemiology. Kermack et al., apparently unaware of Andvord's work on tuberculosis, considered a quite different problem (and in a second paper they give a detailed mathematical exposition of their method).8,9 The question they tackled (which one must assume motivated the work) received no more than oblique mention in their discussion section. Karl Pearson, acknowledged founder of biometrics (and of its first journal), was a socialist but also a social Darwinist. In debating several politically loaded issues during the first decades of the 20th century, he espoused the Darwinian theory of survival of the fittest. He did so in a regrettably crude eugenic form then growing popular; the theory deprecated the weak, the disabled, and the lesser races. Twisting this logic to interpret human society, Pearson saw survival of the less sturdy or hardy as necessarily increasing the social burden of the less fit. Pertinent to this review, among other such things as the necessity to support inherent British superiority in the Boer War and to oppose immigration of inferior races, he had deplored the declining infant mortality rate in Britain since the turn of the 19th century. Self-evident for Pearson was that degeneration of the British race must result from such a continuing process.
The historical analysis of Kermack et al. controverted Pearson's assumption at the same time as it uncovered the remarkable implications of steadily improving mortality in infancy. Their analysis showed that this temporal change was linked, not with any generational decline, but with consistent improvement in mortality rates throughout the life course. The data they compiled spanned 1845–1925 for England and Wales, 1860–1930 for Scotland, and 1755–1925 for Sweden. During an unprecedented period of infant mortality decline from the beginning of the 20th century, mortality in the age groups subsequent to infancy, i.e. through the entire life course, continued the improvement generation by generation.
For each country, Kermack et al. tabulated mean relative per cent mortality by age group against time period. Reading across the tables, the rows show the data for successive 10-year age groups from the year of birth on. Reading down, the columns show the data for successive decennial time periods. Then, instead of reading the data in conventional manner (down for each age group and across for each period), they drew freehand lines to show how to follow each successive generation by reading the data diagonally. This is to say, from interval to interval, the data are read like the moves of the knight in chess, both one down for age group and one across for time period. Upon which, in successive generations, a decline in death rates throughout the life course clearly emerges.
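The diagonal reading can be sketched in a few lines. The layout below is an assumption made for illustration: each row is a 10-year age group, each column a decennial period, and every cell holds only its own (age, period) label rather than any mortality figure from the 1934 paper. The function name follow_cohort is hypothetical.

```python
# Illustrative sketch: cells carry labels, not data from Kermack et al.

ages = [0, 10, 20, 30]                 # start of each 10-year age group
periods = [1850, 1860, 1870, 1880]     # start of each decennial period

# table[row][col] labels the cell for age group `row` in period `col`.
table = [[(age, period) for period in periods] for age in ages]


def follow_cohort(table, first_period_index):
    """Read one generation diagonally: one step down in age, one across in period."""
    path = []
    row, col = 0, first_period_index
    while row < len(table) and col < len(table[row]):
        path.append(table[row][col])
        row += 1
        col += 1
    return path


generation = follow_cohort(table, 1)
print(generation)

# Every cell on the diagonal shares the same approximate year of birth.
birth_years = {period - age for age, period in generation}
print(birth_years)
```

Each diagonal holds cells for which period minus age is constant, which is what makes it the record of a single generation.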
With rather few exceptions, given such dramatic illumination of a major biological and political issue, epidemiologists were slow to follow this lead. In 1939, in a posthumously published paper, Wade Hampton Frost presented a cohort analysis of national mortality from tuberculosis.10 Evidently the epidemiological works of Andvord, Kermack et al., and the previous actuarial papers were unknown to him, but the disease was then of greatest importance and had been one of his longstanding interests. Frost's analysis of mortality by age group over time, from 1900 to 1930, was subsequently extended by Doege11 to 1960 (Figures 1a, 1b).
At first sight, the period analysis shows a biologically puzzling incoherence: mortality declines over time, but with age distribution changing from period to period (excepting a sharp drop from infancy to age 10). Beyond age 10, in each period the successively lower age group contours rise as expected across the age groups, but also grow steeper and reach their peak later than preceding contours. Only when cast in cohort form do the age contours become consistent across generations and periods: one then sees, after the sharp fall up to age 10, that the lower contours for each succeeding generation are now parallel with a regular peak at age 20.
This analysis had considerable pathobiological implications. It suggested that early seemingly innocent primary infection carried consequences that manifested in secondary forms of tuberculosis at later ages (supported some years later by the extensive work of Springett5,12). It also shows more generally that, in conventional cross-sectional period analysis, inconsistent distributions both across age groups and across time periods indicate changes over time among the assembled generations. This is also to say that environmentally induced generation effects are present. Thus, with tuberculosis, in later generations mortality declines at younger ages in period age contours, but a residue of high mortality remains in successively older age groups who were the survivors of previous high disease frequencies.
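The contrast between period and cohort contours described above can be reproduced with synthetic numbers. The sketch below is not Frost's data. It assumes a hypothetical fixed age pattern that peaks at age 20, multiplied by a hypothetical generation level that stays constant for births up to 1850 and halves in each later decade. All names and values are illustrative.

```python
# Synthetic illustration only: no value here comes from any cited study.

AGES = [10, 20, 30, 40, 50, 60, 70]

# Hypothetical intrinsic age pattern, with its peak at age 20.
AGE_SHAPE = {10: 1.0, 20: 4.0, 30: 3.5, 40: 3.0, 50: 2.5, 60: 2.0, 70: 1.5}


def cohort_level(birth_year):
    """Hypothetical generation effect: flat to 1850, then halving each decade."""
    if birth_year <= 1850:
        return 8.0
    return 8.0 / 2 ** ((birth_year - 1850) // 10)


def rate(age, birth_year):
    """Synthetic rate: age pattern scaled by the level of the generation."""
    return AGE_SHAPE[age] * cohort_level(birth_year)


def cohort_curve(birth_year):
    """Age contour followed within one generation."""
    return {age: rate(age, birth_year) for age in AGES}


def period_curve(year):
    """Cross-sectional age contour: each age group belongs to a different generation."""
    return {age: rate(age, year - age) for age in AGES}


def peak_age(curve):
    return max(curve, key=curve.get)


cohort_peaks = [peak_age(cohort_curve(b)) for b in (1830, 1860, 1890)]
period_peaks = [peak_age(period_curve(y)) for y in (1880, 1900, 1920)]
print(cohort_peaks)   # same peak age in every generation
print(period_peaks)   # peak age drifts upward from period to period
```

In this sketch the cohort contours keep their peak at age 20, while the period contours peak later in each successive period. The drift arises because the older age groups in any cross-section are survivors of generations with higher levels.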
Lung cancer, everyone now knows, was a rising epidemic over the first three-quarters of the 20th century. Here, the early period analyses of lung cancer mortality by age were again problematic, but reversed the problematic age pattern found for tuberculosis. Cigarette consumption and addiction on a large scale had spread first among young males, especially during World War I, to be reflected among smokers in lung cancer some years later. The residue of unexposed were among the older age groups and women. Thus period analysis by age over time indicated to some critics that this apparently inconsistent degree of immunity at different ages undermined the causal culpability of tobacco. Cohort analyses13,14 revealed the regular age relationship of the sharply rising death rate for the disease, both within each generation and across succeeding generations (Figure 2) and its consistency with the diffusion of smoking habits. For the disinterested, such demonstrations helped end that particular controversy.
The reason emerged from a cohort analysis, although one difficult to decipher.15,16 The disease, first seen in young women with perforating gastric ulcer, in later periods came to be seen mainly as an affliction of men in middle age; these age and sex shifts were in fact unrecognized artifacts of period analysis. Mortality, with some modest variation among its various forms and between the sexes, had been rising sharply among generations born from the early 19th century and into the 1890s. In the generations thereafter, however, overall mortality began a decline inapparent in routine period analysis. This silent change in the direction of trends among generations, first waxing and then waning, created residual low frequencies among the elderly in the early less affected generations together with rising frequencies among the young; in the later most affected generations, the changed direction of the trends created contrasting residual high frequencies among the elderly and lower frequencies among the young. To complicate the picture further, the disease, which had first become manifest among the higher social classes, became downwardly mobile,15,16 although in an age pattern from period to period consistent with the generational shift.
As noted above, notwithstanding the biological and analytical coherence that the generational perspective has often brought with it, an unavoidable assumption underlies any analysis involving time period, generation and age. External validation is therefore required before interpretation of any one of these three forms of analysis can be entirely secure. Thus, with the peptic ulcer epidemic, such scepticism as greeted the unsuspected decline apparent among cohorts was not out of place at the outset.
In the event, the results were shown to be consistent with several expectations from the analysis both about contemporary data and about future trends. Among these were a decline in frequency of surgery for perforated ulcers combined with the rise in mean age of operation expected with a residue of cases in the elderly; rising age and declining morbidity in each of the three branches of the British Armed Forces; declining trends in national sickness absence records and general practice surveys; and finally, the correct predictions, made from the declining national mortality cohort trends observed into the 1950s, of further decline into later decades.16 A plethora of studies, of which I cite only a few, shows these British peptic ulcer trends also to be the bellwether for trends in the US, in Japan, in at least 18 European countries, and in Australia.17–23 The consistent cohort patterns for this disease implicate some sort of generation effect in earlier life experience. Data from more recent years also indicate period changes which probably reflect current exposures in adults.
New findings shed a glimmer of light on these shifts in the life-course pattern of peptic ulcer. Besides possible contributions of smoking to cohort and period effects, and of non-steroidal anti-inflammatory drugs and the like to period effects, the newly apparent role of Helicobacter pylori in gastritis and peptic ulcer provides insight into the waxing and waning of the 20th century epidemic. Notably, the age distribution of the organism in the population relates well to the rising quality of living standards, sanitation and hygiene during the 19th and 20th centuries. Currently, in the less developed world, the incidence of Helicobacter pylori infection can be taken to be high in early childhood, as inferred from prevalence in China.24 In the developed world, prevalence of the organism rises gradually with age25 and, as inferred from age-specific seroprevalence in Yorkshire,26 is coupled with declining prevalence among age groups over 15 years of age in successive cohorts from 1900 to 1979. Thus the course of peptic ulcer through time and among age groups, generations and social classes is reminiscent of the way poliomyelitis impinged on increasingly older age groups as improving socioeconomic conditions, sanitation and hygiene reduced polio virus exposure and inadvertent immunization early in life. The behaviour and distribution of Helicobacter pylori are consistent with this analogy, although its precise mode of transmission and its relation to immunity remain unclear.26–28
In the past three or four decades, cohort analysis has adumbrated the life-course perspective, and enlightened understanding of several conditions, and not least of the process of ageing itself. Multivariable and multivariate summarizing statistical techniques can be applied to the three linked variables with appropriate adjustments,29–35 but the core structural difficulty is not soluble by such means. Intelligent exploration of the tabular and graphic approach is still the readiest analytical method. The heritage of Kermack, McKendrick and McKinley retains its value for present-day epidemiologists.
1 Lazarsfeld PF. Notes on the history of quantification in sociology: trends, sources and problems. In: Woolf H (ed.). Quantification. New York: Bobbs-Merrill, 1961.
2 Susser M, Adelstein A. Introduction. In: Humphreys NA (ed.). Vital Statistics: A Memorial Volume of Selections from the Reports and Writings of William Farr. Published under the auspices of the Library of the NY Acad Med. Metuchen NJ: Scarecrow Press, 1975, pp. 469–76.
3 Kuh D, Smith GD. When is mortality risk determined? Historical insights into a current debate. Soc Hist Med 1993;6:101–23.
4 Andvord KF. What can we learn from studying tuberculosis by generations? Norsk Mag Loegevidensk 1930;91:642–60.
5 Springett VH. Comparative study of tuberculosis mortality rates. J Hyg Camb 1950;48:361–95.
6 Case RAM. Cohort analysis of mortality rates as an historical or narrative technique. Brit J Prev Soc Med 1956;10:159–71.
7 MacMahon B, Terry WD. Application of cohort analysis to the study of time trends in neoplastic disease. J Chron Dis 1958;7:24–35.
8 Kermack WO, McKendrick AG, McKinley PL. Death rates in Great Britain and Sweden: some general regularities and their significance. Lancet 1934;i:69870.
9 Kermack WO, McKendrick AG, McKinley PL. Death rates in Great Britain and Sweden: expression of specific mortality rates as products of two factors, and some consequences thereof. J Hyg 1934;33–34:433–51.
10 Frost WH. The age selection of mortality from tuberculosis in successive decades. Am J Hyg 1939;30:91–96.
11 Doege T. Tuberculosis mortality in the United States, 1900–1960. JAMA 1963;192:1045–48.
12 Springett VH. A comparative study of tuberculosis mortality rates. J Hyg 1950;48:361–95.
13 Haenszel W, Schimkin MB. Smoking patterns and the epidemiology of lung cancer in the United States: are they compatible? JNCI 1956;16:1417–41.
14 Hammond EC, Garfinkel L. Changes in cigarette smoking, 1954–65. Am J Public Health 1968;58:30–45.
15 Susser M, Stein Z. Civilisation and peptic ulcer. Lancet 1962;i:115–19.
16 Susser M. Period effects, generation effects and age effects in peptic ulcer mortality. J Chron Dis 1982;35:29–40.
17 Monson R, MacMahon B. Peptic ulcer in Massachusetts physicians. N Engl J Med 1969;281:11–15.
18 Wylie CM. The complex wane of peptic ulcer II. Trends in duodenal and gastric ulcer admissions to 790 hospitals 1974–1979. J Clin Gastroenterol 1981;3:333–37.
19 Sonnenberg A, Muller H, Pace F. Birth cohort analysis of peptic ulcer mortality in Europe. J Chron Dis 1985;38:309–11.
20 Sonnenberg A. Causative factors in the etiology of peptic ulcer become effective before the age of 15 years. J Chron Dis 1987;40:193–202.
21 La Vecchia C, Lucchini F, Negri E, Reggi V, Levi F. The impact of therapeutic improvements in reducing peptic ulcer mortality in Europe. Int J Epidemiol 1993;22:96–106.
22 Westbrook JI, Rushworth RL. The epidemiology of peptic ulcer mortality 1953–1989: a birth cohort analysis. Int J Epidemiol 1993;22:1085–92.
23 Svanes C, Lie RT, Kvale G, Svanes K, Soreide O. Incidence of perforated ulcer in Western Norway 1935–1990: cohort or period-dependent time trends. Am J Epidemiol 1995;141:836–44.
24 Mitchell HM, Goggin PM, Li YY et al. Epidemiology of Helicobacter pylori in Southern China: identification of early childhood as the critical period for acquisition. J Infect Dis 1992;166:149–53.
25 Megraud F, Brassens Rabbe MP, Denis F, Belbouri A, Hoa DQ. Seroepidemiology of Campylobacter pylori infection in various populations. J Clin Microbiol 1989;27:1870–73.
26 Banatvala N, Mayo K, Megraud F, Jennings R, Deeks JJ, Feldman RA. The cohort effect and Helicobacter pylori. J Infect Dis 1993;168:219–21.
27 Malaty M, Graham DY, Isaksson I, Engstrand L, Pedersen N. Co-twin study of the effect of environment and dietary elements on acquisition of Helicobacter pylori infection. Am J Epidemiol 1998;148:793–97.
28 Goodman KJ, Correa P. Transmission of Helicobacter pylori among siblings. Lancet 2000;355:358–63.
29 Greenberg BG, Wright JJ, Sheps CG. A technique for analyzing some factors affecting the incidence of syphilis. J Am Statist Assoc 1950;45:373–99.
30 Mason KO, Mason WM, Winsborough HH, Poole WK. Some methodological issues in cohort analysis of archival data. Am Sociol Rev 1973;38:242–58.
31 Walter SD, Miller CT, Lee JAH. The use of age-specific mean cohort slopes in the analysis of epidemiological incidence and mortality data. J R Statist Soc A, Part 2 1976;139:227–45.
32 Barrett JC. The redundant factor method and bladder cancer mortality. J Epidemiol Community Health 1978;32:314–16.
33 Osmond C, Gardner MJ. Age, period and cohort models applied to cancer mortality rates. Stat Med 1982;1:245–59.
34 Glenn NK. Cohort analysts' futile quest: statistical attempts to separate age, period and cohort effects. Am Sociol Rev 1976;41:900–04.
35 Clayton D, Schiffler E. Models for temporal variation in cancer rates. II: Age-period-cohort models. Stat Med 1987;6:469–81.