Use of period analysis for providing more up-to-date estimates of long-term survival rates: empirical evaluation among 370 000 cancer patients in Finland

Hermann Brenner (a), Bengt Söderman (b) and Timo Hakulinen (b,c)

a Department of Epidemiology, German Centre for Research on Ageing, Heidelberg, Germany.
b Finnish Cancer Registry, Helsinki, Finland.
c Department of Public Health, University of Helsinki, Finland.

Prof. Dr Hermann Brenner, Department of Epidemiology, German Centre for Research on Ageing, Bergheimer Strasse 20, D-69115 Heidelberg, Germany. E-mail: brenner@dzfa.uni-heidelberg.de

Abstract

Background Providing up-to-date estimates of cancer patient survival rates is an important task of cancer registries. A few years ago, a new method of survival analysis, denoted period analysis, was proposed to enhance the recency of long-term survival estimates. The aim of this paper is to provide a comprehensive empirical evaluation of the use of this method.

Methods Using data from the nationwide Finnish Cancer Registry, we compare 5-year and 10-year relative survival rates of 371 849 patients diagnosed with one of the 16 most common forms of cancer in Finland at various time intervals between 1953 and 1992 with the most up-to-date estimates of 5-year or 10-year relative survival that might have been obtained in those time intervals by traditional methods of survival analysis and by period analysis of survival.

Results Survival rates strongly increased over time for most forms of cancer. For these cancers, traditional estimates of 5- and 10-year survival rates would have severely lagged behind the survival rates later observed for newly diagnosed patients, and period analysis would consistently have provided much more up-to-date estimates of survival rates.

Conclusions We conclude that period analysis should be implemented as a standard tool for providing up-to-date estimates of long-term survival rates by cancer registries.

Keywords Cancer registry, statistical methods, survival

Accepted 24 September 2001

Long-term survival rates, such as 5-year or 10-year survival rates, are essential outcome measures of cancer, and they are now routinely reported by many cancer registries around the world.1–5 However, long-term cumulative survival estimates obtained by traditional survival analysis pertain to cohorts of patients diagnosed many years ago,6–8 which hinders early detection of improvements in cancer prognosis. In the case of improved survival, which has been achieved for many common forms of cancer, available traditional long-term survival estimates lag behind the survival experience to be expected by newly diagnosed patients, and they may thereby unduly discourage newly diagnosed cancer patients and their physicians.

A few years ago, a new method of survival analysis, denoted ‘period analysis’, was proposed to derive more up-to-date long-term survival estimates.9,10 In contrast to traditional ‘cohortwise’ analysis of survival rates, period analysis derives long-term survival estimates exclusively from the survival experience of patients within some recent calendar period. Nevertheless, with few exceptions,11–14 this method has rarely been applied in practice and has not been systematically validated using large datasets.

The aim of this paper is to provide a comprehensive empirical evaluation of the performance of period analysis compared to traditional methods of survival analysis for providing up-to-date estimates of long-term cancer patient survival.

Methods

Database
Our analysis is based on data from the Finnish Cancer Registry, which are among the highest quality data of any population-based cancer registry in the world.15 The Finnish Cancer Registry covers the whole of Finland (population 5.1 million). Notification of new cancer cases to the cancer registry is mandatory by law, and the registry obtains information from many different sources, including hospitals, physicians working outside hospitals, dentists, and pathological and cytological laboratories. The Finnish Cancer Registry also receives copies of all death certificates where cancer is mentioned. Virtually complete population-based cancer registration has been accomplished since 1953.15

Follow-up of cancer patients with respect to vital status is extremely efficient in Finland due to the existence of personal identification numbers. The files of the cancer registry are matched annually with the annual list of deaths using the personal identification number as the key. In addition, the cancer registry data file is actively matched with the central population register (a register of all people currently alive and living in Finland) as an additional check on the vital status of the patients. Such checks against the central population register have shown that, since computerized matching began in 1975, only a small proportion (0.05%) of deaths are missed when the annual matching is performed.16 By the time of this analysis, follow-up with respect to vital status was complete up until the end of 1997.

Our analysis includes patients diagnosed with one of the 16 most common forms of cancer (excluding non-melanoma skin cancer) in Finland between 1953 and 1992. Patients whose cancer was registered by death certificate only (about 2% of registered cases), who were lost to follow-up (0.1%) or whose month of death was unknown (0.1%) were excluded.

Statistical analysis
Throughout this paper, we present relative rather than absolute survival rates. Relative survival rates (RSR) are the preferred measures of survival reported by cancer registries because they are unaffected by deaths from causes other than the primary cancer of interest.8 The RSR, which represents the survival rate in the hypothetical situation where the cancer in question is the only possible cause of death, is defined as the absolute survival rate in the patient group divided by the expected survival of a comparable group from the general population. The expected survival rates were estimated from nationwide population life tables stratified by age, sex and calendar time according to the approach commonly known as the Ederer II method (with minor adaptations for application of period analysis).17 With that method absolute cumulative survival rates and expected cumulative survival rates are calculated as the product of interval-specific (conditional) survival rates of cancer patients and of people from the general population of the same sex and age, respectively. It has been shown empirically that this method works very well for follow-up times of patients up to 10 years.
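
The arithmetic behind this definition can be sketched in a few lines of Python. This is a minimal illustration only: the function names and the annual rates are made up for the example, and the sketch does not reproduce the Ederer II matching of life-table rates to individual patients by age, sex and calendar time.

```python
from math import prod


def cumulative_survival(interval_rates):
    """Multiply interval-specific (conditional) survival rates together."""
    return prod(interval_rates)


def relative_survival(observed_rates, expected_rates):
    """Cumulative relative survival: observed over expected cumulative survival."""
    if len(observed_rates) != len(expected_rates):
        raise ValueError("need one expected rate per observed interval")
    return cumulative_survival(observed_rates) / cumulative_survival(expected_rates)


# Hypothetical annual conditional survival rates for five years of follow-up.
observed = [0.80, 0.85, 0.90, 0.92, 0.94]  # cancer patients (made-up values)
expected = [0.97, 0.97, 0.96, 0.96, 0.95]  # comparable general population (made-up)

rsr_5yr = relative_survival(observed, expected)
print(f"5-year RSR: {rsr_5yr:.3f}")
```

Because both cumulative rates are products of conditional rates, the same function yields a 10-year RSR when given ten annual rates instead of five.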

For each cancer site, the 5-year and 10-year RSR actually observed for patients diagnosed in 5-year time intervals between 1958 and 1992 were calculated. These observed RSR were compared with the most up-to-date estimates of RSR that might have been available at those time intervals from previously registered patients using either traditional methods of analysis or period analysis. Five-year intervals were chosen to ensure adequate precision of all survival estimates. For simplicity, we neglect any delay in cancer registration and analysis of results.

In Figure 1 the procedure is illustrated for patients diagnosed in 1988–1992, whose 5-year survival was completed in the 1993–1997 interval. The observed 5-year RSR of these patients is based on their survival experience in 1988–1997. Which estimates of 5-year RSR might have been available at the time of diagnosis of these patients, i.e. in the 1988–1992 interval?



Figure 1 Illustration of the database used to calculate observed 5-year survival of patients diagnosed in 1988–1992 as well as the most up-to-date survival estimates that might have been obtained in 1988–1992 by cohort analysis, by complete analysis and by period analysis. The figures within the cells indicate the years of follow-up since diagnosis.

 
One traditional option of survival analysis, denoted ‘cohort analysis’ by Brenner and Gefeller,10 would have been to calculate 5-year RSR of the most recent cohort of patients that completed 5-year follow-up in 1988–1992, that is patients diagnosed in 1983–1987, and this cohort estimate would reflect survival experience of these patients in 1983–1992.

Another option in traditional survival analysis, denoted ‘complete analysis’,10 would have been to also include in the analysis the survival experience of more recently diagnosed patients, that is, patients diagnosed in 1988–1992. These patients would not have completed 5-year follow-up in the 1988–1992 interval, but nevertheless could have been included as ‘right censored’ observations in the analysis. Due to these inclusions, complete analysis (encompassing survival experience of patients diagnosed between 1983 and 1992) would not only have provided somewhat more up-to-date, but also more precise, survival estimates than cohort analysis.

Like the complete estimates, the most recent period estimates of 5-year RSR that might have been obtained in the 1988–1992 interval would have been derived from the survival experience of patients diagnosed in 1983–1992. In contrast to complete analysis, however, period analysis would have been restricted to their survival experience during the 1988–1992 period. This would have been achieved by left truncation of observations at the beginning of the 1988–1992 interval in addition to ‘right censoring’ at its end as previously described,9,10 thereby providing more up-to-date estimates of 5-year survival rates.
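
The difference between the three approaches comes down to which part of each patient's follow-up enters the analysis. The following Python sketch makes this concrete for the 1988–1992 example. The function name, the decimal-date convention and the parameter names are assumptions introduced for illustration; the sketch only determines the follow-up window used and does not itself estimate survival.

```python
def followup_window(diagnosis, method, period_start=1988.0, period_end=1993.0,
                    horizon=5.0):
    """Return (entry, exit) in years since diagnosis, or None if nothing is used.

    diagnosis is a decimal calendar date (e.g. 1984.5 for mid-1984).
    period_end is exclusive: 1993.0 means follow-up through the end of 1992.
    Assumes, as in the Figure 1 example, a period as long as the horizon.
    """
    earliest = period_start - horizon  # 1983.0: earliest diagnosis date used
    if not earliest <= diagnosis < period_end:
        return None
    if method == "cohort":
        # Only patients diagnosed early enough to complete the full horizon.
        if diagnosis >= period_end - horizon:
            return None
        entry, exit_ = 0.0, horizon
    elif method == "complete":
        # Recently diagnosed patients are right censored at the period end.
        entry, exit_ = 0.0, min(horizon, period_end - diagnosis)
    elif method == "period":
        # Left truncation at the period start, right censoring at its end.
        entry = max(0.0, period_start - diagnosis)
        exit_ = min(horizon, period_end - diagnosis)
    else:
        raise ValueError(f"unknown method: {method}")
    return (entry, exit_) if entry < exit_ else None


for diagnosis in (1984.5, 1990.5):
    for method in ("cohort", "complete", "period"):
        print(diagnosis, method, followup_window(diagnosis, method))
```

For a patient diagnosed in mid-1984, cohort and complete analysis use the whole of the first five years of follow-up, whereas period analysis uses only the part falling in 1988–1992, that is, from 3.5 to 5 years after diagnosis.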

To quantify recency and accuracy of survival estimates, we calculated the mean difference and the mean squared difference between the latest estimates of 5-year RSR that might have been obtained by the three different methods in any of the 5-year calendar intervals between 1958–1962 and 1988–1992 and the 5-year RSR later observed for patients diagnosed in those intervals. We started with the 1958–1962 interval because this is the first interval at which estimates of 5-year RSR could have been obtained with registry data available since 1953. We ended with the 1988–1992 interval because the cohort of patients diagnosed in this interval was the most recent cohort for whom 5-year RSR was known with follow-up information available by the end of 1997. Along the same lines, we calculated the mean difference and the mean squared difference between the latest estimates of 10-year RSR that might have been obtained in any of the 5-year calendar intervals between 1963–1967 and 1983–1987 with the three different methods and 10-year RSR later observed for patients diagnosed in those intervals.

The mean difference calculated in that way is essentially a measure of mean systematic under- or overestimation of survival rates of newly diagnosed patients by survival estimates available at the time of their diagnosis. The mean squared difference essentially quantifies the degree of deviation (regardless of its direction) of the most up-to-date available survival estimates from survival rates of newly diagnosed patients. It is not only determined by the degree of systematic over- or underestimation of survival rates, but also by the random variation in survival estimates.
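
Both measures are simple to compute once the estimates and the later observed rates are paired by calendar interval. The Python sketch below is illustrative only: the function names and the numbers are made up, and the difference is taken as estimate minus observed so that negative values indicate underestimation.

```python
def mean_difference(estimates, observed):
    """Mean of (estimate - observed); negative values indicate underestimation."""
    if len(estimates) != len(observed) or not estimates:
        raise ValueError("need equally long, non-empty sequences")
    return sum(e - o for e, o in zip(estimates, observed)) / len(estimates)


def mean_squared_difference(estimates, observed):
    """Mean of (estimate - observed)^2; reflects both bias and random variation."""
    if len(estimates) != len(observed) or not estimates:
        raise ValueError("need equally long, non-empty sequences")
    return sum((e - o) ** 2 for e, o in zip(estimates, observed)) / len(estimates)


# Hypothetical 5-year RSR values (%), one pair per 5-year calendar interval.
available_estimates = [40.0, 45.0, 50.0]  # made-up estimates available at diagnosis
later_observed = [45.0, 50.0, 56.0]       # made-up rates later observed

print(mean_difference(available_estimates, later_observed))
print(mean_squared_difference(available_estimates, later_observed))
```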

Results

Table 1 shows the numbers, the proportions of females and median age of cancer patients by cancer site. The total number of patients included in this analysis was 371 849. The most frequent cancer diagnosis in Finland between 1953 and 1992 was lung cancer, followed by breast cancer and stomach cancer. Almost 90% of patients with lung cancer were male, whereas the proportions of male and female patients were similar for most other non-gynaecological cancers. Median age ranged from 50 years for patients with cancer of the nervous system to 73 years for patients with prostate cancer, and it was between 60 and 70 years for most other forms of cancer.


Table 1 Cancer sites, numbers of patients, proportion of females and median age at diagnosis of patients included in this analysis
 
Table 2 shows the 5-year RSR of patients diagnosed in 1953–1957 and in 1988–1992, respectively (the earliest and the latest cohort of patients for whom these rates could be derived), and the mean annual change (expressed in % units) in 5-year RSR by cancer site, ranked according to improvement of survival rates over time.


Table 2 Five-year relative survival rates (RSR) of patients diagnosed in 1953–1957 and 1988–1992, mean annual change in 5-year RSR over time, and mean difference and mean squared difference between 5-year RSR estimates available in 5-year intervals between 1958–1962 and 1988–1992 by different types of analyses and 5-year RSR later observed for patients diagnosed in those intervals. The three right-hand columns provide the most up-to-date cohort, complete and period estimates of 5-year RSR available in the 1993–1997 interval. Cancer sites are ranked according to improvement in 5-year RSR over time
 
Among patients diagnosed in 1953–1957, cancer of the corpus uteri (64.4%), the breast (52.2%) and the cervix uteri (51.0%) were the only types of cancer with a 5-year RSR above 50%. Among patients diagnosed in 1988–1992, 5-year RSR were much higher for most cancer sites and exceeded 50% for melanoma of the skin (81.4%), cancer of the breast (80.9%), corpus uteri (78.1%), urinary bladder (68.3%), prostate (63.3%), nervous system (55.2%), cervix uteri (54.2%), kidneys (53.2%), and colon (52.1%). Mean annual increase in 5-year RSR was most pronounced and exceeded 1% unit for melanoma and cancer of the urinary organs and the prostate, but a very high mean increase close to 1% unit per year was also observed for leukaemia, and for cancers of the colon, nervous system and breast. On the other hand, the absolute increase in 5-year RSR was very modest or even absent for the cancers with the lowest 5-year RSR, including cancer of the oesophagus, pancreas and lung.
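
As a rough check on the order of magnitude of these changes, the Python sketch below divides the difference between the first and last cohort by the 35 years separating the midpoints of the 1953–1957 and 1988–1992 intervals. This simple first-to-last calculation is an assumption made for illustration; the text does not state how the mean annual change in Table 2 was computed, and the function name is hypothetical.

```python
def mean_annual_change(first_rsr, last_rsr, years_apart):
    """Average change in RSR (% units per year) between two cohorts."""
    return (last_rsr - first_rsr) / years_apart


# Assumption: 35 years between the midpoints of 1953-1957 and 1988-1992.
YEARS_APART = 35

# 5-year RSR (%) quoted in the text for the first and last cohort.
rates = {
    "breast": (52.2, 80.9),
    "corpus uteri": (64.4, 78.1),
    "cervix uteri": (51.0, 54.2),
}

for site, (first, last) in rates.items():
    change = mean_annual_change(first, last, YEARS_APART)
    print(f"{site}: {change:.2f} % units per year")
```

Under this assumption the figure for breast cancer comes out at roughly 0.8% units per year, in line with the description of an increase close to 1% unit per year.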

For most cancer sites, the most up-to-date cohort estimates of 5-year RSR available in various time intervals would, on average, have been much lower (as indicated by the negative values of mean differences) than the 5-year RSR later observed among patients whose cancer was newly diagnosed in those intervals (Table 2). The difference would have been most pronounced for those cancer sites with the strongest increase in 5-year RSR over time (e.g. more than 5% units for melanoma and cancer of the bladder, prostate, and kidneys). Only for cancers with no or little improvement in 5-year RSR over time, such as cancers of the oesophagus, lung, cervix uteri and pancreas, would the cohort estimates have come close to the 5-year RSR of newly diagnosed patients. The mean difference from 5-year RSR later observed for newly diagnosed patients would have been smaller for estimates obtained with complete rather than with cohort analysis. However, by far the most up-to-date estimates would have been obtained by period analysis. Although the period estimates have a higher standard error than the complete estimates due to restriction of the follow-up period included in the analysis, their mean squared difference from 5-year RSR later observed for new cancer patients would likewise have been smaller than that of the complete estimates for all cancer sites.

Given that the period estimates obtained in time intervals up to 1988–1992 tended to provide quite accurate (though still slightly too pessimistic) predictions of the 5-year RSR later observed for patients diagnosed in those time intervals, we also calculated the period estimates for the more recent 1993–1997 period, and we compared these estimates, which might be quite accurate (though still slightly too pessimistic) predictors of the (yet unknown) 5-year RSR of patients diagnosed in 1993–1997, with the corresponding cohort and complete estimates available in that interval (last three columns in Table 2). For many forms of cancer, the period estimates are again substantially higher than the traditional estimates, suggesting that there has been further recent improvement in prognosis that would so far remain undisclosed with application of traditional methods of survival analysis.

The 10-year RSR were generally somewhat lower than 5-year RSR for patients diagnosed in 1953–1957 (Table 3), but cancer-specific mean annual increase over time was similar to that observed for 5-year RSR for most cancer sites.


Table 3 Ten-year relative survival rates (RSR) of patients diagnosed in 1953–1957 and 1983–1987, mean annual change in 10-year RSR over time, and mean difference and mean squared difference between 10-year RSR estimates available in 5-year intervals between 1963–1967 and 1983–1987 by different types of analyses and 10-year RSR later observed for patients diagnosed in those intervals. The three right-hand columns provide the most up-to-date cohort, complete and period estimates of 10-year RSR available in the 1993–1997 interval. Cancer sites are ranked according to improvement in 10-year RSR over time
 
The discrepancy between the most up-to-date cohort estimates of survival available in various time intervals and survival rates later observed for patients diagnosed in those time intervals is considerably more severe for 10-year RSR than for 5-year RSR. On average, it exceeds 10% units for melanoma and cancer of the urinary bladder and the nervous system, and it comes close to 10% units for colon and prostate cancer. Again, the discrepancy could have been reduced by application of complete analysis, but by far the most up-to-date estimates could again have been obtained by period analysis (although even the period estimates were quite pessimistic for those cancers with the largest improvement in prognosis over time). Again, the mean squared difference from observed 10-year survival rates was considerably lower for period estimates than for cohort or complete estimates for all cancer sites.

For most forms of cancer, the most recent period estimates of 10-year RSR pertaining to the 1993–1997 interval are substantially higher than the most recent estimates from traditional survival analysis (last three columns of Table 3), again suggesting that there has been further major improvement in recent years. According to experience from the previous years, 10-year RSR of patients diagnosed in 1993–1997 will most likely be even somewhat higher than the most recent period estimates, whereas available traditional 10-year RSR estimates are likely to be exceedingly pessimistic in this situation.

Discussion

This analysis from the nationwide Finnish Cancer Registry illustrates that there has been tremendous improvement in long-term survival rates for many of the common forms of cancer in the second half of the 20th century. Traditional estimates of long-term survival rates would, on average, have severely lagged behind the survival rates later observed for newly diagnosed patients, thereby confronting patients and their physicians with overly pessimistic estimates of long-term survival rates. The problem is worse for 10-year than for 5-year survival rates because analysis of the former includes patients whose cancer has been diagnosed a longer time ago. Our analyses illustrate that the problem could have been strongly reduced (though not entirely eliminated) by period analysis of survival rates.

Both cohort and complete analysis of survival rates (and sometimes mixed forms of them) have been extensively used in the past (e.g. refs 1–5). Compared to cohort analysis, results of complete analysis are not only less affected by random error (due to additional inclusion of more recently diagnosed patients who have not yet completed the entire follow-up time of interest by the time of analysis), but they are also somewhat more up-to-date. This was empirically shown for all sites of cancer in our analysis. We therefore suggest that complete analysis should be preferred over cohort analysis except for situations in which survival of specific cohorts of patients is to be evaluated retrospectively after completion of the entire follow-up time of interest.

To our knowledge, this is the first comprehensive evaluation of the use of period analysis for providing more up-to-date estimates of long-term survival rates of patients with cancer. With very few exceptions (pertaining to cancers with essentially no improvement in survival over time) we found that recency of survival estimates could be further improved by application of period analysis rather than complete analysis. The advantage in recency of period estimates is due to restriction of the analysis to the survival experience of patients during a recent calendar period. Although this restriction also increases random error of period estimates compared to complete estimates, the mean squared difference from the survival rates of newly diagnosed patients was also lower for the former than for the latter in all cases, even for the less common forms of cancer. We therefore recommend adding period analysis as a standard method of survival analysis in population-based cancer registration in order to obtain more up-to-date estimates of long-term survival. This is, of course, also conditional on the size of the population which is relatively large in Finland compared to many other areas of cancer registration.

The rationale for applying period analysis of survival is analogous to the widespread use of period life tables in the field of demography: to derive up-to-date estimates of life expectancy one commonly relies on period life tables which reflect mortality rates at various ages during some recent time period rather than on cohort life tables which reflect mortality rates of birth cohorts born a life span ago. In contrast to demography, the principle of period analysis has hardly been utilized in survival analysis so far.

In our analysis, we did not consider the reasons for the increase of survival rates. They include earlier diagnosis (or even overdiagnosis of subclinical lesions in some cases) as well as advances in therapy, which are likely to be applicable to a varying degree for the cancers included in this analysis.18 Period analysis would be able to disclose increases in survival rates regardless of their origin. More detailed analytical strategies, e.g. analysis by sex, age, stage at diagnosis, mode of detection and therapy, as well as more advanced techniques of analysis19 may often be useful for monitoring and explaining improvement in survival. Period analysis could be used for such strategies in the same way as shown for monitoring of overall survival rates in this paper.

Despite their advantages in recency over traditional estimates of long-term survival, the most up-to-date period estimates may still lag behind the survival rates achieved by newly diagnosed cancer patients in the case of rapid improvement of survival rates. In our examples, this was most evident for 10-year survival rates of patients with melanoma whose 10-year relative survival rates rose from 37.5% to more than 75% within 30 years. Nevertheless, the benefits of period analysis are also most salient in just these situations, in which underestimation by traditional methods of survival analysis is particularly bad.

Our analyses pertained to ‘simple’ descriptive estimates of survival only. We chose to do this because they are by far the most commonly reported survival estimates by cancer registries, and they can be applied and interpreted by a broad audience in a straightforward manner. More complex estimates based on modelling strategies might be considered to further enhance recency of long-term survival estimates, but this is beyond the scope of our paper.


KEY MESSAGES

  • Progress in cancer patient survival is disclosed with considerable delay by traditional methods of survival analysis.
  • A couple of years ago, a new method of survival analysis, denoted period analysis, was proposed to provide more up-to-date long-term survival estimates.
  • This paper reports on a comprehensive empirical evaluation of this method using the database of the nationwide Finnish Cancer Registry.
  • The results confirm that period analysis consistently provides much more up-to-date estimates of long-term survival rates than traditional methods of survival analysis.
  • It is concluded that period analysis should be implemented as a standard tool for providing up-to-date estimates of long-term survival by cancer registries.

 

In our analyses, we made the simplifying assumption that there was no delay in cancer registration, analysis and reporting of results. In practice, a delay of a few years is not uncommon, which further reduces the recency of long-term survival figures. Therefore, efforts to implement methods for more up-to-date survival analysis should go along with efforts to minimize such delay, such as yearly updates of analysis and reporting of survival figures. In the past, rapid publication and distribution of updated survival figures may have been prohibitively labour intensive and costly in many instances, but this should no longer be an issue given modern means of information dissemination.

In conclusion, we hope that our comprehensive empirical evaluation will encourage more widespread use of period analysis by cancer registries to derive more up-to-date long-term survival rates of cancer patients. Along with efforts to speed dissemination of results of survival analysis through the use of modern electronic media, the implementation of period analysis as a standard analytical tool for cancer registry data should help to overcome burdening cancer patients and their physicians with outdated and often overly pessimistic survival figures. It should also disclose, in a more timely way, further progress in cancer survival in the years to come.

Acknowledgments

The work of Timo Hakulinen was supported by the MaDaMe project of the Academy of Finland.

References

1 Berrino F, Sant M, Verdecchia A, Capocaccia R, Hakulinen T, Estève J (eds). Survival of Cancer Patients in Europe: The EUROCARE Study. IARC Scientific Publication 132. Lyon: IARC, 1995.

2 Berrino F, Capocaccia R, Estève J et al. (eds). Survival of Cancer Patients in Europe: The EUROCARE-2 Study. IARC Scientific Publication 151. Lyon: IARC, 1999.

3 Landis SH, Murray T, Bolden S, Wingo PA. Cancer statistics, 1999. CA Cancer J Clin 1999;49:8–31.

4 Greenlee RT, Murray T, Bolden S, Wingo PA. Cancer statistics, 2000. CA Cancer J Clin 2000;50:7–33.

5 Sankaranarayanan R, Black RJ, Parkin DM. Cancer Survival in Developing Countries. IARC Scientific Publication 145. Lyon: IARC, 1999.

6 Cutler SJ, Ederer F. Maximum utilization of the life table method in analyzing survival. J Chron Dis 1958;8:699–712.

7 Kaplan EL, Meier P. Nonparametric estimation from incomplete observations. J Am Stat Assoc 1958;53:457–64.

8 Ederer F, Axtell LM, Cutler SJ. The relative survival rate: a statistical methodology. Natl Cancer Inst Monogr 1961;6:101–21.

9 Brenner H, Gefeller O. An alternative approach to monitoring cancer patient survival. Cancer 1996;78:2004–10.

10 Brenner H, Gefeller O. Deriving more up-to-date estimates of long term patient survival. J Clin Epidemiol 1997;50:211–16.

11 Brenner H, Stegmaier C, Ziegler H. Recent improvement in survival of breast cancer patients in Saarland/Germany. Brit J Cancer 1998;78:694–97.

12 Brenner H, Stegmaier C, Ziegler H. Trends in survival of patients with ovarian cancer in Saarland, Germany, 1976–1995. J Cancer Res Clin Oncol 1999;125:109–13.

13 Brenner H, Hakulinen T. Long-term cancer survival achieved by the end of the 20th century. Most up-to-date estimates from the nationwide Finnish Cancer Registry. Brit J Cancer 2001;85:367–71.

14 Brenner H, Kaatsch P, Burkhardt-Hammer T, Harms DO, Schrappe M, Michaelis J. Long-term survival of children with leukemia achieved by the end of the 2nd millenium. Cancer 2001;92:1977–83.

15 Teppo L, Pukkala E, Lehtonen M. Data quality and quality control of a population-based cancer registry. Experience in Finland. Acta Oncol 1994;33:365–69.

16 Pukkala E. Use of record linkage in small-area studies. In: Elliott P, Cuzick J, English D, Stern R (eds). Geographical and Environmental Epidemiology: Methods for Small-area Studies. Oxford: Oxford University Press, 1992, pp.125–31.

17 Ederer F, Heise H. Instructions to IBM 650 Programmers in Processing Survival Computations. Methodological Note No. 10, End Results Evaluation Section. Bethesda (MD): National Cancer Institute, 1959.

18 Dickman PW, Hakulinen T, Luostarinen T et al. Survival of cancer patients in Finland. Acta Oncol 1999;38(Suppl.12):1–103.

19 Berrino F, Micheli A, Sant M, Capocaccia R. Interpreting survival differences and trends. Tumori 1997;83:9–16.