a Copenhagen County Centre for Preventive Medicine, Medical Department M, Glostrup University Hospital, DK-2600 Glostrup, Denmark.
b Medical University of South Carolina, Biometry and Epidemiology, Charleston, USA.
Troels F Thomsen, Centre for Preventive Medicine, Medical Department M, Glostrup University Hospital, Building 8, 7th floor, DK-2600 Glostrup, Denmark. E-mail: trth{at}glostruphosp.kbhamt.dk
Abstract
Background Because of marked regional differences in the incidence of coronary heart disease (CHD) in Europe, the recommendation by the European Society of Cardiology to use the Coronary Risk Chart, which is based on data from the Framingham Heart Study, could be questioned.
Methods Data from two population studies (The Glostrup Population Studies, n = 4757, the Framingham Heart Study, n = 2562) were used to examine three different levels of cross-validation. The first level of examination was whether a risk-score developed from one sample adequately ordered the risk of participants in the other sample, using the Area Under a Receiver Operating Characteristic (AUROC) curve. The second level compared the magnitude of coefficients in logistic models in the two studies; while the third level tested whether the level of risk of CHD death in one sample could be estimated based on a risk function from the other sample.
Results Coronary heart disease mortality was 515 per 100 000 person-years in Framingham and 311 per 100 000 person-years in Glostrup. The area under the ROC curve was between 75% and 77% regardless of which risk-score was used. Logistic coefficients did not differ significantly between studies. The Framingham risk-score significantly overestimated the risk in the Glostrup sample and the Glostrup risk-score underestimated the risk in the Framingham sample.
Conclusion Using this Framingham risk-score on a Danish population will lead to a significant overestimation of coronary risk. The validity of risk-scores developed from populations with different incidence of the disease should preferably be tested prior to their application.
Keywords Risk-score, validation, coronary heart disease mortality
Accepted 1 February 2002
The main purpose of coronary risk-scores is to assist the clinician in identifying those patients at highest level of coronary risk, reserving preventive measures for those individuals above a specified coronary risk. The guidelines from The European Society of Cardiology on Primary Prevention of Coronary Heart Disease (CHD) recommend that the Coronary Risk Chart based on a Framingham risk-score is used in Europe for the estimation of individual level of coronary risk.1 If the estimated risk exceeds 20% over a 10-year period, risk reducing treatment should be initiated. However, the applicability of this Framingham risk function to a low-risk population was recently questioned. It was thus shown that the Framingham risk function markedly overestimated the level of coronary risk in an Italian population where the incidence of coronary events is one-third of the incidence in Framingham (220 versus 627 per 100 000 person-years).2,3 Although the guidelines assume that the Framingham function predicts the level of risk reasonably well in high-risk populations,1 there is almost no evidence to support the use of this function in northern European populations. The mortality rates from ischaemic heart disease in Denmark are approximately twice the rates of Italy (for men: 423 versus 224, for women: 145 versus 65 per 100 000 person-years) and it may therefore be questioned whether applying a Framingham risk-score to a Danish population also will lead to an overestimation of individual coronary risk.2
Material
Our purpose was not to validate one particular risk function from Framingham, but rather to ask whether the risk functions derived from one sample would be adequate for use in the other sample, when similar methodologies were applied to samples from Framingham and Glostrup. To do this, we derived risk functions from primary data from the two studies (Framingham and Glostrup) that were available to us at the time the analysis was conducted.
The Framingham Heart Study
The Framingham data for the present analysis stem entirely from the original cohort examined during the lipoprotein phenotyping project that corresponded approximately to the eleventh examination cycle (1971). At this examination 2788 individuals participated and it was the first of the Framingham cohorts that included lipid determinations, other than total cholesterol, for the participants. For some participants without records of smoking status at this examination, status at the next earliest examination was used; and for a small proportion of this group, the examination coincided with the 10th or 12th biennial examination of the cohort.
Since a complete-case analysis was conducted, 107 participants with unknown values for at least one of the characteristics under consideration were excluded prior to analyses. We also excluded from the analysis all participants who previously had experienced a myocardial infarction (n = 119), leaving 2562 participants in our analysis. The analytical sample utilized is half of the sample used for the derivation of the most recently published Framingham risk function. That is, the recently published risk functions from Framingham are derived from a sample that included a pool of the data from the original Framingham cohort used here as well as data from the Framingham Offspring Study.4,5
The Glostrup Population Studies
The Glostrup sample is a pool of five observational cohorts from The Glostrup Population Studies: (1) a cohort of individuals born in 1914 (n = 804, examined in 1984), (2) a cohort of individuals born in 1922, 1932, 1942, and 1952 (DAN-MONICA 1, n = 3785, examined in 1983), (3) a cohort of individuals born in 1926, 1936, 1946, and 1956 (DAN-MONICA 2, n = 1416, examined in 1983), (4) a cohort of individuals born in 1921, 1931, 1941, 1951 and 1961 (DAN-MONICA 3, n = 2026, examined in 1992), and finally (5) a cohort of individuals born in 1918, 1928, 1938, and 1948 (n = 928, examined in 1978). The Glostrup Population Studies have been described previously.6 As every person in the Danish population is identified by a unique registration number, linkage of individual information over time as well as linkage with national health registers is highly accurate. The pooled cohort covers a wide age range (30–70 years), but since we wished to compare Glostrup and Framingham over a similar age range, anyone less than 49 years was not included in these analyses. Eighty-three participants with at least one unknown value were furthermore excluded from the analysis, and all patients who previously had experienced a myocardial infarction were also excluded (n = 178), leaving 4757 individuals from the Glostrup cohort in this study.
Endpoint
The endpoint in this validation was CHD mortality (ICD-8 codes 410–414). Mortality rather than morbidity data were used as the endpoint since the former were assumed to be more comparable between countries than the latter. Cause of death in the Framingham cohort was determined by a panel review of death certificates and other documentation available to study investigators, while in the Glostrup cohort the national death certificate for underlying cause of death was used. The follow-up period was fixed at 10 years; thus only those participants dying of CHD within 10 years were considered to be events.
Risk factors
The following variables were included in the analyses: sex, age, serum total cholesterol and high density lipoprotein (HDL) (mg/dl), smoking (self-reported: non versus current), systolic blood pressure (mmHg), and diabetes. All measurement techniques were comparable, except for diabetes. In Glostrup, diabetes was established by the question 'Has a doctor ever told you that you had diabetes?' In Framingham it was defined as a random glucose >9 mmol/l and/or the use of diabetic treatment.7
Methods
The basic principle in the analysis was to fit a logistic regression model based on one cohort and then apply this model to the second cohort to obtain predicted probabilities for each member of the second cohort. Then, these predicted probabilities of CHD death were compared to whether the person actually died from CHD. All analyses were repeated, reversing the roles of the two cohorts, using the second cohort to fit the model and examining its validity when applied to the first cohort. All analyses were also repeated including people with existing CHD. The length of follow-up was fixed to 10 years. The cross-validation procedure progressed over three levels.
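The fit-on-one-cohort, apply-to-the-other principle can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are synthetic, the two covariates and their coefficients are hypothetical, and the helper names `fit_logistic` and `predict_proba` are not from the study. The model is fitted by plain Newton-Raphson so that the example depends only on NumPy.

```python
import numpy as np

def fit_logistic(X, y, n_iter=25):
    """Fit a logistic regression by Newton-Raphson.

    Returns the coefficient vector [intercept, slope_1, ..., slope_k].
    """
    Xd = np.column_stack([np.ones(len(X)), X])      # add intercept column
    beta = np.zeros(Xd.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Xd @ beta))        # current fitted probabilities
        w = p * (1.0 - p)                           # Newton weights
        hessian = Xd.T @ (Xd * w[:, None])
        beta = beta + np.linalg.solve(hessian, Xd.T @ (y - p))
    return beta

def predict_proba(beta, X):
    """Apply a fitted model to (possibly different) covariate data."""
    Xd = np.column_stack([np.ones(len(X)), X])
    return 1.0 / (1.0 + np.exp(-Xd @ beta))

# Synthetic stand-ins for two cohorts (hypothetical values, not study data):
# same risk-factor effects, but cohort A has a higher baseline risk than B.
rng = np.random.default_rng(0)

def simulate(n, intercept):
    X = rng.normal(size=(n, 2))
    p = 1.0 / (1.0 + np.exp(-(intercept + 0.8 * X[:, 0] + 0.5 * X[:, 1])))
    return X, rng.binomial(1, p)

X_a, y_a = simulate(3000, intercept=-2.0)
X_b, y_b = simulate(3000, intercept=-3.0)

beta_a = fit_logistic(X_a, y_a)            # model derived from cohort A
p_b_from_a = predict_proba(beta_a, X_b)    # predicted probabilities in cohort B

predicted_b = p_b_from_a.sum()             # predicted number of events in B
observed_b = y_b.sum()                     # observed number of events in B
print(f"predicted {predicted_b:.0f} vs observed {observed_b}")
```

Reversing the roles of the cohorts only requires swapping the arguments: fit on `(X_b, y_b)` and predict for `X_a`.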
Level 1
Our first examination used the Area Under the Receiver Operating Characteristics (AUROC) curve. The AUROC curve measures the proportion of case/non-case pairs that are correctly ordered.8 The method takes, however, no account of the actual prevalence of the disease that is tested for.
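As a minimal sketch of this interpretation, the AUROC can be computed directly as the proportion of case/non-case pairs in which the case has the higher predicted risk, counting ties as half. The function name `auroc_pairwise` and the six example values are hypothetical and serve only to illustrate the definition.

```python
import numpy as np

def auroc_pairwise(scores, died):
    """Proportion of case/non-case pairs that are correctly ordered.

    A pair is correctly ordered when the case has the higher score;
    tied pairs count as one half.
    """
    scores = np.asarray(scores, dtype=float)
    died = np.asarray(died, dtype=bool)
    case_scores = scores[died]
    noncase_scores = scores[~died]
    # Difference for every (case, non-case) pair
    diff = case_scores[:, None] - noncase_scores[None, :]
    return (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / diff.size

# Hypothetical predicted risks and outcomes (1 = CHD death within 10 years)
risk = np.array([0.02, 0.05, 0.10, 0.20, 0.30, 0.40])
died = np.array([0, 0, 1, 0, 1, 1])

auc = auroc_pairwise(risk, died)          # 8 of the 9 pairs are correctly ordered
auc_halved = auroc_pairwise(risk / 2, died)
print(auc, auc_halved)
```

Halving every predicted risk leaves the value unchanged, since only the ordering matters. This is why a risk-score can rank participants well and still misjudge the absolute level of risk.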
Level 2
The second examination compared the magnitude of the coefficients in a logistic model predicting CHD death in each of the cohorts. To compare the magnitude of the coefficients estimated in the logistic models, we used a Wald statistic to test whether coefficients differed in the two studies.
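The exact form of the Wald statistic is not given here. A common form for comparing a coefficient estimated in two independent samples, assumed in the sketch below, divides the difference between the estimates by the square root of the sum of their squared standard errors and refers the result to a standard normal distribution. The function name and the numerical inputs are hypothetical.

```python
import math

def wald_difference(b1, se1, b2, se2):
    """Wald z statistic and two-sided p-value for b1 - b2.

    Assumes the two estimates come from independent samples, so the
    variance of the difference is the sum of the two variances.
    """
    se_diff = math.sqrt(se1 ** 2 + se2 ** 2)
    z = (b1 - b2) / se_diff
    p_value = math.erfc(abs(z) / math.sqrt(2.0))   # two-sided normal tail area
    return z, p_value

# Hypothetical coefficient estimates and standard errors from two studies
z, p = wald_difference(b1=0.70, se1=0.20, b2=0.40, se2=0.15)
print(f"z = {z:.2f}, p = {p:.3f}")
```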
Level 3
This was the core analysis in which we examined (1) the observed and predicted number of cases and (2) the refinement (spread) and the calibration (accuracy). To calculate the predicted number of deaths, each participant of the Glostrup cohort was assigned a probability of CHD death based on the Framingham logistic model. The sum of these estimated probabilities is the predicted number of deaths in Glostrup based on the Framingham risk-score. The refinement and the calibration were analysed using the methods described by Miller et al.9 in a two-step procedure. Initially, a logistic regression analysis predicting CHD death in one study was used to estimate the probability of CHD death in the other cohort. These predicted probabilities were then transformed to logits; these estimated logits were used as the only covariates in a logistic regression model predicting CHD in the second study. In the first step we tested whether the coefficient (β) associated with the predicted logit was one. If the coefficient (β) could be assumed to be one, we inferred that the spread of the risk estimates was correct and a second logistic analysis was conducted in which the coefficient of the predicted logit was fixed at 1. In this model we then tested if the constant term (α) was zero, i.e. a test of α = 0 given β = 1. If the constant term was not zero, we inferred that the level of probabilities estimated was not correct (a negative value indicated over-prediction and a positive value indicated under-prediction).
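The two-step procedure can be sketched as follows. This is an illustration under stated assumptions, not the analysis code of the study: the data are synthetic, the helper names are hypothetical, and Wald-type z statistics are used for both tests as a simplification (the tests in Miller et al. may be constructed differently). The slope is fixed at 1 in the second step by entering the predicted logit as an offset.

```python
import numpy as np

def logit(p):
    return np.log(p / (1.0 - p))

def fit_logistic_offset(X, y, offset, n_iter=50):
    """Newton-Raphson fit of logit P(y = 1) = offset + X @ coef.

    Returns the coefficients and their standard errors.
    """
    coef = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(offset + X @ coef)))
        w = p * (1.0 - p)
        hessian = X.T @ (X * w[:, None])
        coef = coef + np.linalg.solve(hessian, X.T @ (y - p))
    # Information matrix at the final estimate gives the standard errors
    p = 1.0 / (1.0 + np.exp(-(offset + X @ coef)))
    w = p * (1.0 - p)
    hessian = X.T @ (X * w[:, None])
    se = np.sqrt(np.diag(np.linalg.inv(hessian)))
    return coef, se

def calibration_check(p_pred, y):
    """Observed versus predicted events, then slope and intercept checks."""
    lp = logit(p_pred)
    ones = np.ones_like(lp)
    # Step 1: intercept and slope both free; test slope (beta) against 1
    coef, se = fit_logistic_offset(np.column_stack([ones, lp]), y, np.zeros_like(lp))
    z_slope = (coef[1] - 1.0) / se[1]
    # Step 2: slope fixed at 1 via the offset; test intercept (alpha) against 0
    coef0, se0 = fit_logistic_offset(ones[:, None], y, lp)
    z_intercept = coef0[0] / se0[0]
    return {
        "predicted_deaths": float(p_pred.sum()),
        "observed_deaths": int(y.sum()),
        "slope": float(coef[1]),
        "z_slope_vs_1": float(z_slope),
        "intercept_given_slope_1": float(coef0[0]),
        "z_intercept_vs_0": float(z_intercept),
    }

# Synthetic example: the external risk-score has the right spread but
# over-predicts, because its logits are shifted upwards by 1.
rng = np.random.default_rng(1)
true_lp = -3.0 + rng.normal(size=5000)
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-true_lp)))
p_pred = 1.0 / (1.0 + np.exp(-(true_lp + 1.0)))

results = calibration_check(p_pred, y)
print(results)
```

In this constructed example the slope is close to one and the intercept is negative, which corresponds to correct spread combined with over-prediction of the level of risk.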
Results
Table 1 presents baseline characteristics of the Framingham and Glostrup cohorts. The Framingham sample is slightly older and has a higher average systolic blood pressure, and a higher prevalence of diabetes than the Glostrup sample. On the other hand, the Glostrup sample contains a higher proportion of males and smokers, and has, on average, higher serum cholesterol and higher HDL than the Framingham sample. All these differences are statistically significant.
Discussion
While there are significant differences in the general risk profile between Glostrup and Framingham, the difference in relative risk between the two studies is not statistically significant. This consistency of relative risk estimates has been found before.10,11 Nonetheless, the magnitude of some of the coefficients differs somewhat. For instance, the risks associated with having diabetes tended to be lower in Framingham than in Glostrup. Although this finding may be accidental, it was unexpected since the Framingham diabetes cases are more likely to be true cases given that they are based on a medical review rather than self-report. However, the coefficient for smoking is also larger in Glostrup although smoking status was based on self-report in both samples.
Using cardiovascular death as the endpoint may raise some methodological questions. In Framingham, cardiovascular death was determined by a panel review of death certificates and other documentation available to the study investigators, while in Denmark determination was solely by death certificate. However, the bias introduced by using death certificates only should be an overestimation of the true incidence of cardiovascular death in Glostrup, since the diagnosis ischaemic heart disease is likely to be given too frequently as the cause of death.12 The true difference between the two studies may therefore be even larger than the observed difference. The problem itself may also be of minor importance since the majority of cardiovascular deaths in Denmark occur within hospital and thus are more likely to be reviewed by one or more doctors. Finally, the mortality rates in Glostrup have been found in general to be similar to the rates in Denmark at large.13
The lack of difference in the ranking of individuals, using either the Framingham or the Glostrup model, implies that those at highest risk would be identified independently of the model that has been chosen. The proportion of individuals in Glostrup that would be estimated to be above the cut-point of 20% risk within 10 years will, however, be larger if a Framingham model is used than if a Glostrup model is used. The Framingham risk-score produced in this study, in general, overestimates the number of cases in the Glostrup sample and the Glostrup risk-score underestimates the number of cases in the Framingham sample. This probably reflects the higher incidence in Framingham, with a CHD mortality rate approximately 60% higher than in the Glostrup population. Using the Coronary Risk Chart in Denmark may thus lead to an overestimation of risk and thereby a possible over-treatment with e.g. statins. This might have significant impact on national healthcare expenditures.
The problem of the validity of risk-scores developed from other population samples than the one they are applied to has been examined in many different settings.
An early paper by Keys et al.14 compared risk factors reported by several studies with data from four of the Pooling Project Studies of the American Heart Association15 (Pool 4) together with samples from the US Railroad Workers Study and the International Cooperative Study on Cardiovascular Epidemiology.16 The investigators concluded that the ordering of participants was similar for the two functions. The magnitudes of the total number of expected cases, however, differed significantly, with the international coefficients under-predicting the American cohorts.
The question of how much relative risk differs between populations has been investigated several times. Gordon et al.17 compared CHD rates for Framingham, Honolulu, and Puerto Rican study populations. The researchers concluded that the relative odds for Framingham, at the average values for risk factors, were about twice those of the other studies. Menotti et al.18 compared coefficients from several cohorts (Seven Countries Study, the Italian RIFLE project and MRFIT), concluding that, with some exceptions, the coefficients of the participating studies were similar.
The application of one risk score model to another population has also been examined to some extent. Kozarevic et al. compared samples from urban and rural areas participating in the Yugoslavia Cardiovascular Diseases Study to a contemporaneous cohort from Framingham. Multivariate logistic function coefficients were estimated and comparisons conducted. The researchers found a threefold increase in risk for Framingham over both urban and rural Yugoslav populations. Brand et al.19 compared earlier findings from the Framingham study to those of the Western Collaborative Group Study (WCGS), using published Framingham logistic coefficients to calculate probabilities of an event in WCGS. The investigators concluded that the risk calculated according to Framingham correlated well with actual risk. McGee et al.20 analysed data from five of the American Heart Association Pooling Project Studies15 to determine whether Framingham results could be used to predict CHD and death among these five studies. Goodness-of-fit statistics were calculated using the Framingham and specific-study models, and the fits for all six models were similar. Finally, Leaverton et al.21 used the first 10-year follow-up of the NHANES I cohort to examine whether the Framingham risk model for CHD mortality could be applied to other studies. They observed an increase in level of risk according to decile of risk, regardless of the study from which the model was derived.
In Denmark the Framingham coefficients have previously been compared with coefficients from one cohort in the Glostrup Population Studies (the cohort of individuals born in 1914).22 That comparison showed that the Framingham coefficients generally tended to be higher than the coefficients from Glostrup, which is in line with the present pooled analysis. The Copenhagen City Heart Study tested the Framingham Stroke risk-score and found that, in spite of similar stroke probabilities based on a point system from the two studies, a prognostic index could not be recommended for individual prediction because of large statistical uncertainty.23
None of the above-mentioned comparisons involved a systematic framework for judging the validity of the risk functions. Since original data were used in this study, it was possible to control for, e.g., differences in age and risk factor distribution, which made it possible to examine the different dimensions of validity, such as the ordering of the risk estimates, calibration and refinement.
Conclusion
The results of this analysis showed that a risk-score developed from a population with high risk ordered the individuals correctly when applied to a population with medium risk and vice versa. The relative risks in the two models did not differ significantly from each other. However, probably due to the differences in the incidence of the disease, the risk-score based on the Framingham Heart Study predicted a significantly higher level of risk when applied to the Danish population, and the Glostrup risk-score underestimated the risk when applied to the Framingham sample. This suggests that using the Coronary Risk Chart on a Danish population, with a definition of high risk as above 20%, may reserve treatment for those at highest level of risk, but the risk among those treated will not necessarily be 20% or greater.
Acknowledgments
This work was partially funded by grants from the NIH (HL 61769) and The Danish Heart Foundation (2-F-22518). Data from the Framingham Heart Study were obtained from the National Heart, Lung, and Blood Institute. The views expressed in this paper are those of the authors and do not necessarily reflect the views of this agency.
References
1 Wood D, De Backer G, Faergeman O, Graham I, Mancia G. Prevention of coronary heart disease in clinical practice. Atherosclerosis 1998;140:199–270.
2 Menotti A, Puddu PE, Lanti M. Comparison of the Framingham risk function-based coronary chart with risk function from an Italian population study. Eur Heart J 2000;21:365–70.
3 Sans S, Kesteloot H, Kromhout D. The burden of cardiovascular diseases mortality in Europe. Task Force of the European Society of Cardiology on Cardiovascular Mortality and Morbidity Statistics in Europe. Eur Heart J 1997;18:1231–48.
4 Anderson KM, Wilson PWF, Odell PM, Kannel WB. An updated coronary risk profile. A statement for health professionals. Circulation 1991;83:356–62.
5 Wilson PW, D'Agostino RB, Levy D, Belanger AM, Silbershatz H, Kannel WB. Prediction of coronary heart disease using risk factor categories. Circulation 1998;97:1837–47.
6 Schroll M, Jorgensen T, Ingerslev J. The Glostrup Population Studies, 1964–1992. Dan Med Bull 1992;39:204–07.
7 Kannel WB, McGee DL. Diabetes and glucose tolerance as risk factors for cardiovascular disease: the Framingham study. Diabetes Care 1979;2:120–26.
8 Hanley JA, McNeil BJ. The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology 1982;143:29–36.
9 Miller ME, Hui SL, Tierney WM. Validation techniques for logistic regression models. Stat Med 1991;10:1213–26.
10 Tunstall-Pedoe H, Kuulasmaa K, Mahonen M, Tolonen H, Ruokokoski E, Amouyel P. Contribution of trends in survival and coronary-event rates to changes in coronary heart disease mortality: 10-year results from 37 WHO MONICA project populations. Monitoring trends and determinants in cardiovascular disease. Lancet 1999;353:1547–57.
11 van den Hoogen PC, Feskens EJ, Nagelkerke NJ, Menotti A, Nissinen A, Kromhout D. The relation between blood pressure and mortality due to coronary heart disease among men in different parts of the world. Seven Countries Study Research Group. N Engl J Med 2000;342:1–8.
12 Juel K, Sjol A. Decline in mortality from heart disease in Denmark: some methodological problems. J Clin Epidemiol 1995;48:467–72.
13 Andersen LB, Vestbo J, Juel K et al. A comparison of mortality rates in three prospective studies from Copenhagen with mortality rates in the central part of the city, and the entire country. Copenhagen Center for Prospective Population Studies. Eur J Epidemiol 1998;14:579–85.
14 Keys A, Aravanis C, Blackburn H et al. Probability of middle-aged men developing coronary heart disease in five years. Circulation 1972;45:815–28.
15 The Pooling Project. The Final Report of The Pooling Project. J Chronic Dis 1977;31:201–306.
16 Keys A, Aravanis C, Blackburn HW et al. Epidemiological studies related to coronary heart disease: characteristics of men aged 40–59 in seven countries. Acta Med Scand Suppl 1966;460:1–392.
17 Gordon T, Garcia-Palmieri MR, Kagan A, Kannel WB, Schiffman J. Differences in coronary heart disease in Framingham, Honolulu and Puerto Rico. J Chronic Dis 1974;27:329–44.
18 Menotti A, Keys A, Blackburn H et al. Comparison of multivariate predictive power of major risk factors for coronary heart diseases in different countries: results from eight nations of the Seven Countries Study, 25-year follow-up. J Cardiovasc Risk 1996;3:69–75.
19 Brand RJ, Rosenman RH, Sholtz RI, Friedman M. Multivariate prediction of coronary heart disease in the Western Collaborative Group Study compared to the findings of the Framingham study. Circulation 1976;53:348–55.
20 McGee D, Gordon T. The results of the Framingham Study applied to four other US based studies of cardiovascular disease. In: Kannel WB, Gordon T (eds). The Framingham Study. An Epidemiological Investigation of Cardiovascular Disease. DHEW Publication No. (NIH) 76–1083, 1976.
21 Leaverton PE, Sorlie PD, Kleinman JC et al. Representativeness of the Framingham risk model for coronary heart disease mortality: a comparison with a national cohort study. J Chronic Dis 1987;40:775–84.
22 Schroll M, Larsen S. A ten-year prospective study, 1964–1974, of cardiovascular risk factors in men and women from the Glostrup population born in 1914. Multivariate analyses. Dan Med Bull 1981;28:236–51.
23 Truelsen T, Lindenstrøm E, Boysen G. Comparison of probability of stroke between the Copenhagen City Heart Study and the Framingham Study. Stroke 1994;25:802–07.