1 Department of Internal Medicine, Institut Jules Bordet, Bruxelles, Belgium; 2 Service of Pneumology and Thoracic Oncology, CHU Calmette, Lille, France
Received 28 August 2002; revised 12 December 2002; accepted 19 December 2002
Abstract |
---|
Citation factors are used to assess scientific work even though they were developed commercially to compare competing journals. The aim of the present study was to determine whether there is a relationship between citation factors and a trial's methodological quality, using published randomised trials in lung cancer clinical research.
Materials and methods:
All of the randomised trials included in nine systematic reviews performed by the European Lung Cancer Working Party (ELCWP) were assessed using two quality scales (Chalmers and ELCWP).
Results:
One hundred and eighty-one articles were eligible. The median overall ELCWP and Chalmers quality scores were 61.8% and 49.0%, respectively, with a correlation coefficient (rs) of 0.74 (P <0.001). A weak association was observed between citation factors and quality scores with the respective correlation coefficients ranging from 0.18 to 0.40 (ELCWP scale) and from 0.21 to 0.38 (Chalmers scale). American authors published trials significantly more often in journals with high citation factors than European or non-American authors (P <0.0001), despite no better methodological quality. Positive trials, which were significantly more likely to be published in journals with higher citation factors, were of no better quality than negative ones.
Conclusion:
Journals with higher citation factors do not appear to publish clinical trials with higher levels of methodological quality, at least for trials in the field of lung cancer research.
Key words: bibliometry, citation factor, eurofactor, impact factor, lung cancer, prestige factor
Introduction |
---|
Editors are sometimes tempted to artificially increase the impact factor (IF) of their journal in several different ways, such as self-citation, incorporation of review articles (which are cited more often than original reports [2, 5]) or the selection of authors on the basis of previous publications, publishing those with a higher potential impact on citation [1]. As the number of references allowed in a journal article may be limited, authors refer more often to review articles, and as a result reviews generally have higher citation rates than original articles. In order to avoid some of the artificial increases in IFs, the prestige factor (PF) was developed, which only takes into consideration original articles and excludes review articles. Also, the Institute for Scientific Information (ISI) includes in its database only about 3200 journals from an estimated total of 126 000 published worldwide, with a bias towards English language journals and English-speaking authors [6]. To reduce this discrimination, Europeans developed their own citation factor, the eurofactor (EF), which compares only European journals.
Citation factors were developed with the initial purpose of comparing competing journals in a commercial way. Later, the assumption was made that the journal is representative of the articles published in it and consequently that citation factors could represent a way of assessing the quality of the scientific work. In certain countries, IFs have been used to select individuals for academic promotions or to determine resource allocation [5]. However, the contribution of individual articles to the global IF of a journal is often disproportionate. The few most cited articles account for the majority of citations [6]. The aim of the present work was to determine whether there is a relationship between methodological quality, as assessed by specific quality scores, and various citation indexes (IF, PF and EF) in lung cancer randomised trials.
Materials and methods |
---|
Articles were retrieved from a Medline search, supplemented by references known to the authors or found in the articles. To be eligible, articles had to relate to the particular topic developed in each systematic review, include only lung cancer patients and be published in English, French or Dutch. Only fully published articles were considered. When a trial was published more than once, each article was considered for this review and separately assessed for quality by the investigators, with the exception of articles reporting preliminary results without survival (toxicity analysis) or presenting updated results in the form of a summary.
A team of medical doctors and a biostatistician carried out the methodological quality evaluation. Consensual agreement on the scores attributed to each item for each trial was obtained during regular meetings requiring the presence of at least 75% of the investigators. The methodological quality of all randomised trials was assessed using two previously published quality scales, ELCWP [7–15] and Chalmers [16] (Appendices A and B). Scores are expressed as a percentage of the maximal theoretical value that can be obtained for the considered category. Higher scores reflect better methodological quality of the trial.
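The conversion of raw item scores into percentages can be sketched as follows. This is only an illustration: the category names, the point values and the choice to pool points across categories for the overall score are assumptions, not details taken from the ELCWP or Chalmers scales.

```python
def percent_score(points_awarded, max_points):
    """Express a raw score as a percentage of the maximal theoretical value."""
    if max_points <= 0:
        raise ValueError("max_points must be positive")
    if not 0 <= points_awarded <= max_points:
        raise ValueError("points_awarded must lie between 0 and max_points")
    return 100.0 * points_awarded / max_points


def overall_percent(categories):
    """Overall score, assuming points are pooled across all categories."""
    awarded = sum(points for points, _ in categories.values())
    maximum = sum(max_points for _, max_points in categories.values())
    return percent_score(awarded, maximum)


# Hypothetical scores for one trial: category -> (points awarded, maximum).
example_trial = {
    "design": (15, 20),
    "analysis": (18, 30),
}

per_category = {
    name: percent_score(points, max_points)
    for name, (points, max_points) in example_trial.items()
}
overall = overall_percent(example_trial)
```

Expressing each category relative to its own maximum keeps categories with different numbers of items comparable.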
Three citation indexes were used: the IF, the PF and the EF. For the IF, two values were considered: the value from the 1999 version, the latest available to the authors (IF99), and the value at the date of publication of the article (IF Pub). IF Pub was obtained by the librarian of the Institut Jules Bordet, by contacting the journals directly or by requesting the information from other scientific libraries. PFs and EFs were obtained from their respective web sites: www.prestigefactor.com and www.vicer.org.
The result of a trial was considered to be conclusive if the statistical test comparing the different study arms, according to the primary objective as defined by the authors, gave a value of P <0.05, whether or not in favour of the experimental arm. If this was the case, the trial was termed positive; otherwise, it was classified as negative. When no primary objective was clearly defined, we considered survival as the main study endpoint.
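The classification rule above can be written as a small function. This is a minimal sketch; the function name and its parameters are hypothetical, and it assumes that the P values for the primary objective and for survival have already been extracted from each article.

```python
from typing import Optional


def classify_trial(p_primary: Optional[float],
                   p_survival: Optional[float],
                   alpha: float = 0.05) -> str:
    """Label a trial 'positive' or 'negative' following the rule in the text.

    p_primary  -- P value for the primary objective, or None if the authors
                  did not clearly define one
    p_survival -- P value for the survival comparison, used as a fallback
    """
    p_value = p_primary if p_primary is not None else p_survival
    if p_value is None:
        raise ValueError("no P value available for this trial")
    # The direction of the effect is ignored: any P < alpha is conclusive.
    return "positive" if p_value < alpha else "negative"


label_with_primary = classify_trial(0.03, None)
label_with_fallback = classify_trial(None, 0.20)
```

Note that "positive" here means a conclusive test result in either direction, not necessarily a benefit for the experimental arm.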
Statistics were performed using the software Statistica (Statsoft, Tulsa, OK). Citation factors were treated as continuous variables. The association between two continuous variables was measured by the Pearson correlation coefficient or, if the assumption of Gaussian distributions was not valid, by the Spearman rank correlation coefficient. To compare the distribution of a continuous variable according to the value of a discrete variable, non-parametric Mann–Whitney (for dichotomous variables) and Kruskal–Wallis (for nominal variables with multiple classes) tests were used in cases of non-Gaussian distributions. When the continuous variable was normally distributed, Student's t-test was used instead. Chi-square statistics were used to assess the relationship between two dichotomous variables. Because of the multiple tests that we performed, we considered a difference to be statistically significant only for values of P <0.001.
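For readers unfamiliar with the Spearman rank correlation coefficient, the sketch below computes it in plain Python as the Pearson correlation of the ranks, with tied values receiving their average rank. It is an illustration only, not the Statistica routine used in the study, and the sample values are hypothetical.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical quality scores (%) and citation factors for five articles.
quality = [45.0, 52.5, 61.8, 70.2, 80.0]
citation = [1.2, 0.8, 3.5, 2.9, 6.1]
rs = spearman(quality, citation)
```

Because only the ranks enter the calculation, the coefficient is insensitive to the skewed distributions typical of citation factors.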
Results |
---|
Quality scores
The overall median ELCWP and Chalmers quality scores were 61.8% (range 20.6–89.4) and 49% (range 9–86), respectively. A good correlation between the scores was observed with an rs of 0.74 (P <0.001). Quality scores were significantly correlated with year of publication: more recent trials had better scores. The respective rs were 0.44 (P = 0.001) for ELCWP and 0.51 (P <0.0001) for Chalmers. A weak correlation between the scores and the number of eligible patients per article was found (rs = 0.19; P = 0.01). The results are summarised in Table 1.
To avoid the possibility of non-representative samples, the same analysis was carried out after the exclusion of journals reporting fewer than four articles, leaving 11 journals for comparison. These journals were ranked in descending order according to their methodological quality scores and their citation factor values for comparison (Table 3). The respective values of the correlation coefficient between the citation factors and the ELCWP and Chalmers scores were as follows: IF99, 0.18 (P = 0.03) and 0.18 (P = 0.03); IF Pub, 0.26 (P = 0.004) and 0.29 (P = 0.001); PF, 0.13 (P = 0.12) and 0.17 (P = 0.04); and EF, 0.53 (P = 0.004) and 0.38 (P = 0.05), respectively. This absence of correlation is shown in Table 3: the position of a journal when ranked by quality score does not match its position when ranked by citation factor.
Discussion |
---|
The use of citation factors as a quality measure implies that the quality of published articles is a direct function of the value of a journal's citation factor. Although in many journals the selection of articles is part of a peer-review process, it depends on two or three individuals whose decision is not only a function of the quality of the work, but may also be influenced by subjective arguments. For example, when the authors know the name of the reviewer, peer review is of a higher quality than for unsigned forms [18]. Also, editors can interfere with this process for non-scientific reasons, so that controversial publications, without presumption of their quality, can be selected because they lead to a substantial increase in the number of citations. Some editorial policies can artificially increase the value of citation factors without having any positive impact on the quality of the publications; for example, by choosing well recognised authors who are more likely to be cited, by asking the authors to select, in their references, articles published in the journal (self-citation), by including more review articles or by modifying the number of published articles each year.
Frequently, citation factors are misused for academic work assessment and determination of resource allocation. When citation factors are used in this way, it is assumed that they are a direct representation of the intrinsic quality of a trial and that articles published in the same journal are of similar quality. Nevertheless, citation factors do not represent the individual citation rate of a paper, given that only a few articles account for the majority of citations [2, 6], and uncited articles have the same IF as frequently cited ones. Moreover, in journals with a large number of articles in each issue, the quality of the articles is more variable and the frequency of citation is reduced, lowering the IF. Also, the value of citation factors depends on the research field, so comparisons based on IFs are only meaningful between journals from the same field. Thus, when scientific work is assessed, not only do the number of articles in a journal and their source have to be taken into account, but also their methodological quality and their scientific relevance.
Journals with higher citation factors do not systematically publish articles with the highest quality scores. There are several possible explanations for this, which reflect a potential publication bias. We found that statistically significant trials are published more often in journals with high IFs, even though their methodological quality is equivalent to that of non-significant studies. It was previously reported that negative trials are published more often in the authors' native language [19]. However, in the ISI database, there is a bias towards journals published in English; for example, only two German social science journals are included in the ISI database in comparison with 542 in a German database [6]. Therefore, non-significant trials can be published in journals that are not included in the citation factor analysis. For different reasons, authors may be attracted by journals with high citation factors, which can be more interested in publishing positive trials with a higher potential for citation than negative studies.
Why do American authors choose to publish their work in journals with statistically higher citation factors than European or non-American authors? Neither significance of the trial results nor better methodological quality were sufficient explanations for such a difference; it is probably due to biases in the system of calculating citation factors. In the ISI database, more than half of all citations concern American scientists (language bias). Furthermore, American authors are particularly prone to cite each other, raising the citation rate of American science 30% above the world average [6]. Also, non-American authors can have difficulties in publishing trials in American journals, partly because English is not their native language. So, the apparent American leadership in this field is partly due to the high citation factors of the journals in which they choose to publish, to the self-citation rates and the national citation bias; however, it does not reflect the intrinsic quality of their scientific work, at least in the case of lung cancer research [5].
Citation factors have been created with a commercial purpose in mind and this needs to be preserved. They can be useful to authors when choosing which journal should publish their work, based on the distribution of the journal, assessed by the citation factors, and its ability to deliver the information. However, the reader must take note that publishing in journals with high citation factors is not a guarantee of scientific quality. When analysing a trial, one needs to take into consideration not only the journal where the study was published, but also its scientific relevance and its intrinsic quality, which are not measured by citation factors.
There are some limitations when generalising the results of our study outside of the lung cancer field. To determine whether there is a correlation between citation factors and the methodological quality of articles, we considered articles reporting studies performed in the same clinical research subfield, avoiding the potential bias of comparisons between different scientific domains. In lung cancer treatment research, mainly small to moderate size randomised trials with equivocal results (level of evidence II) are available. The observed effects are small, needing meta-analyses to determine statistically significant differences between treatment arms. Extrapolation of the conclusions drawn from this study to large prospective randomised trials (level of evidence I) including thousands of patients, as is the case in the field of cardiac research, must be made with caution.
We tried to avoid other potential methodological biases. We considered only randomised studies to keep some homogeneity in trial design. All studies were extracted from previous systematic reviews performed by the ELCWP, for which articles were selected after an extensive review of the literature. All lung cancer randomised trials, published on the specified topics, in English, French or Dutch were eligible. Studies published in other languages are not included in this analysis, but the majority of articles published in non-English language journals are not included in the ISI database [5], so it seems unlikely that including them would change our results. Quality analysis was performed using two different quality scales, Chalmers and ELCWP. The use of two scales, read independently by different investigators with consensual agreement on each item in regular meetings (in which the majority of the authors needed to be present) and the multiplicity of the items allowed consistency in the quality assessment.
The classical P value threshold of 0.05 was not used in this study because of the multiplicity of tests that were performed. We decided to apply a P value threshold of 0.001 (0.05 divided by the number of statistical tests) to reduce the risk of observing by chance a falsely statistically significant result. When interpreting the results of statistical tests, not only the statistical significance but also the clinical relevance needs to be considered. The potential relationship between quality scores and citation factors was evaluated by the Spearman rank correlation coefficient. The closer this coefficient is to 1 (in absolute value), the stronger the association. For example, although we observed a statistically significant correlation between PF and quality score, the correlation coefficient was far from 1. If an association exists, it is weak.
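The adjusted threshold described above corresponds to a Bonferroni-type correction, which can be sketched as follows. The number of tests is not stated in the text; the value of 50 used here is only what dividing 0.05 by 0.001 implies, and the P values in the example are hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold: alpha divided by the number of tests."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def is_significant(p_value, alpha=0.05, n_tests=50):
    """True if p_value is below the corrected threshold.

    n_tests=50 is an assumption inferred from 0.05 / 0.001.
    """
    return p_value < bonferroni_threshold(alpha, n_tests)


threshold = bonferroni_threshold(0.05, 50)

# Hypothetical P values, flagged against the corrected threshold.
p_values = [0.0001, 0.004, 0.03]
flags = [is_significant(p) for p in p_values]
```

With this threshold, a result with P = 0.004 that would count as significant at the classical 0.05 level is no longer considered significant.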
In conclusion, although a statistically significant but weak correlation was found between citation factors and the methodological quality of randomised trials in the field of lung cancer research, there is no convincing evidence that citation factors can be used to assess the quality of published scientific work. The use of citation factors must be restricted to commercial purposes, such as the comparison of different journals. Their application to the assessment of academic work is limited by their inability to evaluate the quality and scientific relevance of a publication. Other quality indicators are needed which take into consideration not only the journal in which the article is published and the nationality of the authors, but also the content of the paper. Quality scales, such as those we have developed, are possible alternatives for objectively assessing the scientific importance and the potential contribution of scientific articles, at least in literature dealing with lung cancer.
Appendix A. The ELCWP quality score for phase III studies |
---|
1. Description of the selection criteria for the study
2. Randomisation method
3. Treatment description
4. Work-ups
5. Evaluation criteria description
6. Statistical methods and trial objectives description
B. Study analysis report
1. Analysis timing
2. Patients' characteristics
3. Survival data and analysis
4. Antitumoural response data and analysis
5. Toxicity data and analysis
6. Prognostic factors for survival
7. Discussion
Appendix B. Chalmers scoring system |
---|
A. Internal validity items
1. Randomised methods
2. Analysis of efficacy of randomisation
3. Blinding in evaluation of response to treatment
4. Blinding in evaluation of interim results
5. Compliance
6. Follow-up schedules
7. A priori estimate of sample size
8. Withdrawal
9. Analysis of withdrawals
10. Response evaluation
11. A posteriori estimate of study power
B. External validity items
1. Patients' characteristics
2. List of eligible but not enrolled patients
3. Description of the therapeutic regimen
4. Timing of events
5. Discussion of side effects
References |
---|
2. Similowski T, Derenne JP. Bibliometry of biomedical periodicals. Rev Mal Respir 1995; 12: 543–550.
3. Cole JR, Eales NB. The history of comparative anatomy. Sci Prog 1917; 11: 578–596.
4. Gross PLK, Gross EM. College libraries and chemical education. Science 1927; 66: 385–389.
5. Seglen PO. Why the impact factor of journals should not be used for evaluating research. BMJ 1997; 314: 498–502.
6. Seglen PO. Citation rates and journal impact factors are not suitable for evaluation of research. Acta Orthop Scand 1998; 69: 224–229.
7. Luce S, Paesmans M, Berghmans T et al. Revue critique des études randomisées évaluant le rôle de la radiothérapie thoracique adjuvante à la chimiothérapie dans le traitement du cancer bronchique à petites cellules au stade limité. Rev Mal Respir 1998; 15: 633–641.
8. Sculier JP, Berghmans T, Castaigne C et al. Maintenance chemotherapy for small cell lung cancer: a critical review of the literature. Lung Cancer 1998; 19: 141–151.
9. Sculier JP, Berghmans T, Castaigne C et al. Best supportive care or chemotherapy for stage IV non-small cell lung cancer. Van Houtte P, Klastersky J, Rocmans P (eds): Progress and Perspectives in Lung Cancer. Berlin, Germany: Springer 1999; 199–207.
10. Mascaux C, Paesmans M, Berghmans T et al. A systematic review of the role of etoposide and cisplatin in the chemotherapy of small cell lung cancer with methodology assessment and meta-analysis. Lung Cancer 2000; 30: 23–36.
11. Meert AP, Paesmans M, Berghmans T et al. Prophylactic cranial irradiation in small cell lung cancer: a systematic review of the literature with meta-analysis. BMC Cancer 2001; 1: 5.
12. Sculier JP, Berghmans T, Paesmans M et al. La place de la chimiothérapie dans le traitement des cancers bronchiques non à petites cellules, non métastatiques. Rev Med Brux 2001; 22: 477–487.
13. Sculier JP, Ghisdal L, Berghmans T et al. The role of mitomycin in the treatment of non-small cell lung cancer: a systematic review with meta-analysis of the literature. Br J Cancer 2001; 84: 1150–1155.
14. Berghmans T, Paesmans M, Lafitte JJ et al. Role of granulocyte and granulocyte–macrophage colony stimulating factors in the treatment of small-cell lung cancer: a systematic review of the literature with methodological assessment and meta-analysis. Lung Cancer 2002; 37: 115–123.
15. Meert AP, Berghmans T, Lafitte JJ et al. Which progress have the new agents brought for chemotherapy of advanced non-small cell lung cancer? Eur Respir Rev 2002; 208–216.
16. Chalmers TC, Smith H Jr, Blackburn B et al. A method for assessing the quality of a randomized control trial. Control Clin Trials 1981; 2: 31–49.
17. Luukkonen T. Bibliometrics and evaluation of research performance. Ann Med 1990; 22: 145–150.
18. Walsh E, Rooney M, Appleby L, Wilkinson G. Open peer review: a randomised controlled trial. Br J Psychiatry 2000; 176: 47–51.
19. Egger M, Zellweger-Zahner T, Schneider M et al. Language bias in randomised controlled trials published in English and German. Lancet 1997; 350: 326–329.