a Service de Biostatistique, Batiment 1M, Centre Hospitalier Lyon Sud, 165 Chemin du Grand Revoyet, 69495 Pierre-Benite, France.
b Current address: Amgen Ltd, 240 Cambridge Science Park, Milton Road, Cambridge CB4 0WD, UK.
Abstract
Keywords Incidence, mortality, risk, demography, rate, cancer
Accepted 24 March 2000
Introduction
Methods that do not adjust the populations have been used, leading to incorrect results.1,2 Macfarlane et al.2 present the percentage change in the crude rate and the percentage change in the number of deaths for cancers of the upper aerodigestive tract in a variety of countries. Their aim is to relate these changes to national consumption of alcohol and smoking. However, looking at the crude rate alone does not give the change in risk, because structural changes have not been considered. Further, they state, 'Numbers of deaths increased in some of the countries with decreasing rates, due to an ageing of the population', but no supporting analysis is provided. Even if the rate decreases, an increase in the population size (e.g. a doubling or tripling) that leaves the structure unchanged could still produce an increase in the absolute number of deaths, and this would not be a result of population ageing. In this situation we would recommend using the method we describe below to calculate the risk and demographic changes. In order to decide whether the structural changes are due to population ageing, one would need to study the population proportions more closely. Engeland et al.1 essentially use the same method as us but they do not adjust the populations to the same size. If the population size and structure do not change, the lack of adjustment does not result in large errors. Without the adjustment, however, the population size influences the changes attributed to risk and structure.
In this paper we present a method for partitioning the variation in the number of cases or deaths between two groups (or chronological dates) with respect to demographic variation on the one hand and differences in exposure to risk factors on the other. The demographic component of the variation must itself be split into that due to variation in population size, which is trivial, and that due to the change in the population structure (i.e. age distribution), which needs more attention.
Methods
To eliminate the effect of the population size we start by adjusting the populations so that they comprise the same number of people (100 000, say) but keep their specific age distributions, i.e. the proportion in each age group is the same as in the original population. This is equivalent to working on the crude rate instead of the number of cases.
Then we have only to partition the difference in the crude rates into the part due to differences in the population structures and the part due to differences in the risks. This is done by comparing the rates in the two groups to an intermediate rate obtained by applying the baseline age-specific incidence/mortality rates to the age distribution of the comparison group.
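$$\frac{S_2 - S_1}{S_1} = \frac{S_3 - S_1}{S_1} + \frac{S_2 - S_3}{S_1} \qquad (1)$$
Here S1 and S2 are the crude rates in the baseline and comparison groups, and S3 is the intermediate rate.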
The first component on the right-hand side of equation (1) represents the proportional change in the crude rates due to differences in the population structure (i.e. age distribution) and the second component those due to differences in the risk.
The full algebraic formulation is given in the Appendix.
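The partition described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `decompose_crude_rate`, its arguments and the three-age-group data are all hypothetical, and each group is assumed to be supplied as parallel lists of cases and population counts by age group.

```python
def decompose_crude_rate(cases1, pop1, cases2, pop2):
    """Split the relative change in the crude rate (group 1 -> group 2)
    into a population-structure part and a risk part."""
    total1, total2 = sum(pop1), sum(pop2)
    # Age-specific rates and age-distribution proportions for each group.
    rate1 = [c / p for c, p in zip(cases1, pop1)]
    rate2 = [c / p for c, p in zip(cases2, pop2)]
    prop1 = [p / total1 for p in pop1]
    prop2 = [p / total2 for p in pop2]
    s1 = sum(r * w for r, w in zip(rate1, prop1))  # crude rate, baseline group
    s2 = sum(r * w for r, w in zip(rate2, prop2))  # crude rate, comparison group
    s3 = sum(r * w for r, w in zip(rate1, prop2))  # baseline rates, comparison structure
    return {
        "structure": (s3 - s1) / s1,
        "risk": (s2 - s3) / s1,
        "total": (s2 - s1) / s1,
    }


# Hypothetical data for three age groups, for illustration only.
cases_1, pop_1 = [10, 40, 90], [50_000, 30_000, 20_000]
cases_2, pop_2 = [12, 60, 150], [45_000, 33_000, 27_000]
result = decompose_crude_rate(cases_1, pop_1, cases_2, pop_2)
print(result)
```

Working with proportions rather than counts is what removes the effect of population size: rescaling either population by a constant leaves all three returned values unchanged.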
Examples
Differences between time points
Table 1 shows the traditional age-specific data (grouped into 5-year intervals) for the number of observed deaths and the population. The total number of deaths increases from 9876 to 15 258 between 1970 and 1980 (i.e. an increase of 5382 deaths [54.5%]). However, looking at the absolute number of deaths can be misleading as the population increased, but not uniformly. The total population increased by 6.1%: it increased by about 37% in the 80–84 age group but decreased in the 35–49 age group.
R2 is the set of expected age-group-specific numbers of deaths in 1980 for a total population of 100 000. For example, for the 70–74 age group (Table 1) we have 10.806 = (2843/883 300) × 3357. S2 is the sum of the age-specific R2s (i.e. S2 is the crude rate for 1980).
R3 is the set of expected age-group-specific numbers of deaths in 1980 if the risk was the same as in 1970, for a total population of 100 000. For example, for the 65–69 age group (Table 1) we have 7.699 = (2159/1 061 700) × 3786. S3 is the sum of the age-specific R3s (i.e. S3 is the age-standardized rate for the baseline group (1970) using the comparison group (1980) population as standard).
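The two quoted entries can be reproduced directly from the figures in the text. In this sketch the variable names are hypothetical, and 3357 and 3786 are read as the number of people in the relevant age group per 100 000 of the 1980 population, which is an interpretation of the formula rather than something the text states. Because those two figures are presumably rounded, the last decimal may differ slightly from the published values.

```python
# Figures quoted in the text; variable names are illustrative only.
deaths_1980_70_74 = 2843       # observed deaths, age 70-74, 1980
pop_1980_70_74 = 883_300       # population, age 70-74, 1980
per_100k_1980_70_74 = 3357     # assumed: persons aged 70-74 per 100 000 in 1980

deaths_1970_65_69 = 2159       # observed deaths, age 65-69, 1970
pop_1970_65_69 = 1_061_700     # population, age 65-69, 1970
per_100k_1980_65_69 = 3786     # assumed: persons aged 65-69 per 100 000 in 1980

# R2: 1980 rate applied to the 1980 age structure.
r2_70_74 = deaths_1980_70_74 / pop_1980_70_74 * per_100k_1980_70_74
# R3: 1970 rate applied to the 1980 age structure.
r3_65_69 = deaths_1970_65_69 / pop_1970_65_69 * per_100k_1980_65_69

print(f"R2 (70-74): {r2_70_74:.3f}")
print(f"R3 (65-69): {r3_65_69:.3f}")
```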
We are interested in splitting the difference between the total expected number of deaths in 1980 (S2) and 1970 (S1) for two populations of 100 000 (i.e. we want S2 − S1). We can break the simultaneous change in rate and structure into two steps. Our baseline (or starting point) is the total expected number of deaths in 1970 in a population of 100 000.
Step 1: hold the rate constant and change the population to that of 1980. This gives us the total expected number of deaths in 1980 (for a population of 100 000) if the risk had not changed since 1970 (S3). The difference between this and the baseline is due to structural changes in the population (i.e. S3 − S1).
Step 2: change the rate to that of 1980. This gives us the total expected number of deaths in 1980 for a population of 100 000. The difference between this and the first step is due to changes in risk (i.e. S2 − S3).
The main interest therefore focuses on the change in crude rate between 1970 and 1980 (i.e. S2 − S1). It is more informative to look at the proportional (or percentage) increase relative to the 1970 baseline; to get this we simply divide by S1 (i.e. [S2 − S1]/S1).
Table 3 (using the results from Table 2) gives the breakdown in the change for lung cancer mortality in French males between 1970 and 1980. We can see that the crude rate increases by 18.2 per 100 000 population between 1970 and 1980, i.e. an increase of 45.5% ([18.2/39.8] × 100%). Of this, 17.5 (58.0 − 40.5) per 100 000 population was due to the change in risk, i.e. an increase of 43.8% ([17.5/39.8] × 100%), and 0.7 (40.5 − 39.8) was due to structural changes in the population, i.e. an increase of 1.7% ([0.7/39.8] × 100%). In terms of the absolute number of deaths, there was an increase of 5382 deaths, of which 4327 (9876 × 43.8%) were due to the increased risk, 172 (9876 × 1.7%) were due to structural changes and the remaining 882 were due to changes (i.e. an increase) in the size of the population.
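As a quick check of the two-step breakdown, the differences can be recomputed from the rounded crude rates quoted in the text. This is only a sketch: the baseline is taken as 39.8 per 100 000, the value implied by 58.0 − 18.2, and percentages derived from these one-decimal rates will differ a little from the quoted ones, which were presumably computed from unrounded rates.

```python
# Crude rates per 100 000 as quoted in the text (rounded to one decimal).
S1 = 39.8  # 1970 crude rate (baseline), assumed from 58.0 - 18.2
S3 = 40.5  # 1970 risks applied to the 1980 age structure
S2 = 58.0  # 1980 crude rate

structure = S3 - S1  # step 1: change the age structure only
risk = S2 - S3       # step 2: change the risk only
net = S2 - S1        # overall change in the crude rate

for label, diff in [("structure", structure), ("risk", risk), ("net", net)]:
    print(f"{label:9s} {diff:5.1f} per 100 000 ({100 * diff / S1:.1f}% of baseline)")
```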
Differences between geographical areas
Table 4 shows the number of observed deaths due to lung cancer for males in 1990 for Denmark and the United Kingdom (UK). Here it can be seen more clearly that looking at the difference in the absolute number of deaths is not relevant, as the population of the UK is about 11 times the size of that of Denmark.
Graphical presentation
Figure 1 illustrates how we can present the differences under all possible scenarios. There are two situations with a total of six scenarios. The risk and demographic differences could be of the same sign (leading to two possible scenarios, represented by bars 1 and 2 in Figure 1) or they could be of opposite signs (leading to four possible scenarios, represented by bars 3–6 in Figure 1). Below we explain how to interpret the change as presented in Figure 1 (bearing in mind the comments about the demographic component above):
Bar 2: We have a net change of −50% in the crude rate (absolute number) of which −10% is due to demographic factors and −40% due to risk.
Bar 3: We have a net change of +30% in the crude rate (absolute number) of which −10% is due to demographic factors and +40% due to risk. Note that the +40% change due to risk is represented by the total length of the bar.
Bar 4: We have a net change of −10% in the crude rate (absolute number) of which +30% is due to risk and −40% is due to demographic factors. Note that the −40% change due to demographic factors is represented by the total length of the bar.
Bar 5: We have a net change of +30% in the crude rate (absolute number) of which −10% is due to risk and +40% due to demographic factors. Note that the +40% change due to demographic factors is represented by the total length of the bar.
Bar 6: We have a net change of −10% in the crude rate (absolute number) of which +30% is due to demographic factors and −40% due to risk. Note that the −40% change due to risk is represented by the total length of the bar.
Further if the bars did not have the net change indicated, it would not be possible to work out whether Bars 3 and 4, and Bars 5 and 6 represented an overall positive or negative difference from the graphs alone. A marker must be added to indicate the end point of the sum of the two components (i.e. the net change). We could use a thick line for this purpose instead of giving the net change in figures.
Bars 3 to 6 can be difficult to interpret initially, but if one thinks of starting at zero and then going in the opposite direction of the net change this simplifies the interpretation. So, for example, if we look at bar 4 again: we start at zero and firstly there is a 30% increase in risk (i.e. going in the opposite direction to the net change) which is offset by a −40% change in demographic factors (i.e. now starting at +30% and going down by 40%) leading to a net change of −10%.
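The reading rule above can be expressed as a small helper that returns the span a bar would cover and the position of the net-change marker. This is a sketch of one interpretation of the figure, not code from the paper; the function name `bar_layout` and its return convention are hypothetical.

```python
def bar_layout(risk, demographic):
    """Return (low, high, net) for one bar, all in percentage points."""
    net = risk + demographic
    if risk * demographic >= 0:
        # Same sign (or one component is zero): the bar grows out from zero.
        start = 0
    else:
        # Opposite signs: begin at the smaller component, which points away
        # from the net change; the larger component then spans the whole bar.
        start = risk if abs(risk) <= abs(demographic) else demographic
    return min(start, net), max(start, net), net


bar3 = bar_layout(risk=40, demographic=-10)
bar4 = bar_layout(risk=30, demographic=-40)
print(bar3, bar4)
```

Under this reading, bars 3 and 4 cover the same span and differ only in where the net change falls, which is why a marker for the net change is needed.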
Example
We use the results from 'Differences between time points' and 'Differences between geographical areas', described earlier under Examples, to illustrate the use of these graphs (Tables 3 and 5). We will only look at the differences in the crude rate, and hence the difference due to demographic factors will be due to the population structure only. The results are shown in Figure 2.
Discussion
Care must be taken in presenting the differences in disease incidence or mortality as we can present this difference in terms of the crude rate or absolute number. Although we would advocate the use of the former it is very difficult to get away from the latter. From the public health perspective the absolute number is more useful. However, we must be careful not to misrepresent the true situation if there is a large difference between the two comparison populations. For example, it is quite plausible that the comparison population is many times larger than the baseline population (even between two time points, e.g. developing countries) but at the same time the risk is decreasing. Here we could still see an increase in the absolute number but we should be cautious as more importantly the risk is decreasing.
We also proposed a graphical method for the presentation of the two components (i.e. risk and demographic) using bar charts. However, we would advocate the use of tables ahead of any graphical presentation but at the same time concede there are probably situations in which one needs to present the information graphically.
In this paper we used the raw data in quantifying the difference in lung cancer mortality with respect to demographic and risk factors. However, we would recommend that the data are smoothed before doing such an analysis. For example, before comparing two time points we could have done an analysis of time trends using age-period-cohort modelling.4,5 Here the comparison would have been made using the fitted values. Similarly, one could do some form of spatial smoothing before comparing two geographical points.6
Looking at two points may not be useful or it may not paint a clear picture of what is actually happening. We would suggest that multiple comparisons are made to see how the demographic and risk factors change. For example, looking at our example of lung cancer in French males, it would be more informative to look at the evolution of change. Figure 3 and Table 8 show the yearly change between 1970 and 1990 (using 1970 as a baseline). Here we can see that the risk increases faster between 1970 and 1978 compared to between 1979 and 1990. An important feature that would have been missed is that the gap between the net change and the change due to risk is increasing. This means that the changes due to the population structure are increasing (as can be seen from Figure 3 and Table 8). We could have also presented these results as described in Graphical presentation.
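A series of comparisons against a fixed baseline year could be produced along the following lines. This is a minimal sketch under stated assumptions: the helper `components`, the dictionary layout and all rates and proportions are invented for illustration and are not the French data.

```python
def components(base_rates, base_props, rates, props):
    """Structure and risk components relative to the baseline crude rate."""
    s1 = sum(r * w for r, w in zip(base_rates, base_props))
    s2 = sum(r * w for r, w in zip(rates, props))
    s3 = sum(r * w for r, w in zip(base_rates, props))
    return (s3 - s1) / s1, (s2 - s3) / s1


# Hypothetical age-specific rates and age proportions for three age groups.
series = {
    1970: ([0.0002, 0.0010, 0.0040], [0.50, 0.30, 0.20]),
    1975: ([0.0002, 0.0012, 0.0048], [0.49, 0.30, 0.21]),
    1980: ([0.0003, 0.0014, 0.0055], [0.47, 0.31, 0.22]),
}

base_rates, base_props = series[1970]
trend = {}
for year, (rates, props) in sorted(series.items()):
    structure, risk = components(base_rates, base_props, rates, props)
    trend[year] = (structure, risk, structure + risk)
    print(f"{year}: structure {structure:+.1%}, risk {risk:+.1%}, net {structure + risk:+.1%}")
```

Keeping the baseline fixed means each year's components are directly comparable, so a widening gap between the net change and the risk component shows up as a growing structure term.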
Appendix
Assume that we have two comparison groups, 1 and 2, for which we have the incidence (or mortality) and population data by age group. Further, we assume that both comparison groups have the same number of corresponding age groups, say N.
Let us assume that we have C1 and C2 cases/deaths and a total population of P1 and P2 in groups 1 and 2, respectively. The relative difference between C1 and C2 can be expressed as
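$$\frac{C_2 - C_1}{C_1} = \frac{P_2 S_2}{P_1 S_1} - 1,$$
where $S_i = C_i/P_i$ is the crude rate in group $i$. Once the two populations are adjusted to the same size ($P_1 = P_2$), this reduces to
$$\frac{S_2}{S_1} - 1.$$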
This quantity (i.e. [S2/S1] − 1) is split into a component due to the differences in risk and a component due to differences in the population structure (i.e. age distribution).
Let us define
$$\lambda_{ix} = \text{rate in age group } x \text{ for group } i \qquad (2)$$
$$\pi_{ix} = \text{proportion of population in age group } x \text{ for group } i \qquad (3)$$
where i is 1 or 2. We will be using group 1 as the baseline group for comparison.
We are interested in analysing the difference in the crude rate between groups 1 and 2. Let S1 and S2 be the crude rates in comparison groups 1 and 2, respectively. So we want to analyse the relative difference in the crude rate, i.e.
$$\frac{S_2 - S_1}{S_1}$$
in terms of risk and population structure (i.e. age distribution). Using (2) and (3), let $S_3 = \sum_x \lambda_{1x}\pi_{2x}$, that is, the rate in group 1 applied to the population proportion in group 2. We have
$$S_2 - S_1 = (S_3 - S_1) + (S_2 - S_3)$$
$$\frac{S_2 - S_1}{S_1} = \frac{S_3 - S_1}{S_1} + \frac{S_2 - S_3}{S_1}$$
The first component on the right-hand side represents the proportion of the difference in the crude rate between groups 1 and 2 due to the differences in the population structure and the second component represents the proportion due to differences in the risk.
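Writing the two numerators as sums over age groups makes this interpretation explicit. The following is a short sketch in which $\lambda_{ix}$ denotes the rate and $\pi_{ix}$ the population proportion in age group $x$ of group $i$ (symbol names assumed here for illustration), so that $S_1 = \sum_x \lambda_{1x}\pi_{1x}$ and $S_2 = \sum_x \lambda_{2x}\pi_{2x}$:

```latex
\begin{align*}
S_3 - S_1 &= \sum_{x=1}^{N} \lambda_{1x}\,(\pi_{2x} - \pi_{1x}), \\
S_2 - S_3 &= \sum_{x=1}^{N} (\lambda_{2x} - \lambda_{1x})\,\pi_{2x}.
\end{align*}
```

The first sum vanishes when the two age distributions coincide, and the second vanishes when the age-specific rates coincide.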
Acknowledgments
References
2 Macfarlane GJ, Macfarlane TV, Lowenfels AB. The influence of alcohol consumption on worldwide trends in mortality from upper aerodigestive tract cancers in men. J Epidemiol Community Health 1996;50:636–39.
3 Breslow NE, Day NE. Statistical Methods in Cancer Research, Vol. II: The Design and Analysis of Cohort Studies. Lyon: IARC Scientific Publications, 1987.
4 Clayton D, Schifflers E. Models for temporal variation in cancer rates. I: Age-period and age-cohort models. Statistics in Medicine 1987;6:449–67.
5 Clayton D, Schifflers E. Models for temporal variation in cancer rates. II: Age-period-cohort models. Statistics in Medicine 1987;6:469–81.
6 Estève J, Benhamou E, Raymond L. Statistical Methods in Cancer Research Volume IV: Descriptive Epidemiology. Lyon: IARC Scientific Publications, 1994.