Robert Jones and Agnes Hunt Orthopaedic Hospital, Oswestry,
1 Clinical Operational Research Unit, University College London,
2 St Albans City Hospital, St Albans,
3 Royal Hampshire County Hospital, Winchester,
4 Broomfield Hospital, Chelmsford,
5 Rheumatology Research Unit, University of Leeds, Leeds,
6 Harrogate Hospital, Harrogate,
7 Grimsby Hospital, Grimsby,
8 Basingstoke District Hospital, Basingstoke,
9 Medway Hospital, Gillingham and
10 Royal Hallamshire Hospital, Sheffield, UK
Correspondence to:
J. J. Dixey, Robert Jones and Agnes Hunt Orthopaedic and District Hospital NHS Trust, Oswestry, Shropshire SY10 7AG, UK.
Abstract |
---|
Methods. A hundred sets of radiographs of patients recruited with early rheumatoid arthritis (RA) were assessed using the Larsen scoring system. Digitized copies of these sets were then viewed on a computer screen and scored according to Larsen in a random order. The quality of the digitized image was also recorded. For each set of X-rays, the signed difference between the score from film and the score from the digitized images was calculated.
Results. A total of 95% of the digitized X-ray sets were scored successfully; 5% were not scored due to the images being unreadable. The mean difference between the two sets of scores was -1.2 (95% CI [-2.06, -0.37]). There was no trend in the difference with respect to the mean of the two scores (P > 0.1).
Conclusion. The Larsen scoring of digitized X-ray images has been validated.
KEY WORDS: Early RA, Erosive disease, Digitized images, Assessment of X-rays
Introduction |
---|
In an attempt to facilitate X-ray storage, retrieval and subsequent analysis, the film originals have been digitized and stored electronically using inexpensive scanning technology. The purpose of this study is to ensure that X-rays scored with the Larsen method from an electronic image are equivalent to those scored from the film original.
Patients and methods |
---|
The digitization of the X-ray sets was carried out at the Robert Jones and Agnes Hunt Orthopaedic Hospital in Oswestry, UK. A Hewlett Packard Scanjet 4c flatbed scanner equipped with a Hewlett Packard Transparency Adaptor C2521B was used to create 8-bit (256 grey levels) digitized images at a resolution of 75 pixels per inch. The images were stored in the tagged image file format (TIFF). Checks ensured that the images had the correct orientation.
For the current study, 100 ERAS patients were selected at random from those whose X-rays had been scanned in and not yet returned to the relevant centre. This included X-rays from four different centres. For the selected patients, a single set of hands and feet X-rays was selected at random from the set of films available. The randomization ensured that a suitable range of erosive severity was represented by the sample.
The sets of hands and feet X-rays were scored by a single medically qualified researcher (CS). Scoring was according to the Larsen [1] scoring system with a score (0–5) assigned to each of the proximal interphalangeal (PIP) and metacarpophalangeal (MCP) joints, the first interphalangeal (IP) joints, the second to fifth metatarsophalangeal (MTP) joints and to each wrist, with the wrist scores being multiplied by five when constructing the total Larsen score. This gave a total Larsen score in the range 0–200.
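The weighting described above can be sketched in a few lines of Python. This is an illustration only, not the software used in the study: the function name and the joint counts are assumptions. In particular, 30 non-wrist joints plus two wrists is assumed because it reproduces the stated 0–200 range.

```python
WRIST_WEIGHT = 5  # wrist grades are multiplied by five, as described in the text
MAX_GRADE = 5     # each joint is graded 0-5


def total_larsen_score(joint_grades, wrist_grades):
    """Sum the non-wrist joint grades and add five times each wrist grade."""
    for grade in list(joint_grades) + list(wrist_grades):
        if not 0 <= grade <= MAX_GRADE:
            raise ValueError(f"grade {grade} is outside 0-{MAX_GRADE}")
    return sum(joint_grades) + WRIST_WEIGHT * sum(wrist_grades)


# Assumption: 30 non-wrist joints and 2 wrists, chosen to match the 0-200 range.
N_JOINTS, N_WRISTS = 30, 2
print(total_larsen_score([5] * N_JOINTS, [5] * N_WRISTS))  # 200
print(total_larsen_score([0] * N_JOINTS, [0] * N_WRISTS))  # 0
```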
During the scoring process, the 100 sets of X-ray films were examined and the scores for each joint recorded on a standard proforma. Scoring was restricted to sessions lasting 2 h with a break of at least 15 min between sessions. The sets of films were scored at a rate of 10–15 sets per session.
After a gap of a day, the 100 sets of corresponding digitized images were scored in a different, randomized order according to a pre-specified protocol. For this process, digitized images of the X-rays were viewed on a 21-inch computer screen with the screen resolution set at 1024 × 768. The digitized images were viewed at the same size as the film originals. The quality of each set of images was judged by the scorer to be 'excellent', 'fair', 'poor' or 'unreadable'. If a digitized image was deemed 'unreadable', the study protocol required that the set not be scored. Software had been prepared to display an on-screen score pad superimposed on the X-ray images without obscuring the joints. Using the computer mouse and keypad, the scorer could enter details about the erosion status of each individual joint. As with the sets of films, scoring was restricted to sessions lasting 2 h, separated by a break of at least 15 min. Sets of digitized images were scored at approximately the same rate as film sets, i.e. 10–15 sets per 2 h session.
Computerized manipulation of the digitized image was restricted by the protocol to adjusting the brightness and contrast of the image, and rotation such that the images had the standard orientation (fingers/toes pointing upwards). For some images, any adjustment of brightness or contrast was performed when the X-rays were originally scanned in. For the rest, this adjustment was performed at the time of scoring and a copy of the adjusted image file made. None of the images were adjusted at both the scanning and scoring stages.
Statistical analysis
A simple scatter plot of the Larsen score obtained from the standard films plotted against the Larsen score obtained from the digitized images was produced and the correlation coefficient, R, calculated. Although it gives some impression of the correspondence between the two measurement methods, correlation analysis is recognized as being of limited value for comparing different methods of measuring the same quantity, particularly where the possible values have both lower and upper bounds. High R values are almost inevitable in such circumstances and there is a strong danger of interpreting such values too optimistically.
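The limitation of correlation analysis can be illustrated with a small Python sketch. The scores below are made up for the illustration and are not study data. A constant offset between the two methods leaves R at 1 even though the methods disagree on every set.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical scores: the second method reads 10 points higher on every set.
film = [0, 4, 10, 16, 25, 39, 60, 90]
digitized = [score + 10 for score in film]

r = pearson_r(film, digitized)
mean_diff = sum(f - d for f, d in zip(film, digitized)) / len(film)
print(round(r, 3), mean_diff)  # 1.0 -10.0
```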
More useful information is obtained by using a graphical technique suggested by Bland and Altman [2]. For the cases where both the plain film score and the digitized image score were available, the difference between the scores was plotted against the mean of the two scores (see Fig. 2). This gives what is referred to as a Bland–Altman plot [2]. Displaying information in this way allows several properties of the measurement methods to be examined. Typically, for two unbiased measurement methods, the Bland–Altman plot takes the form of a uniform scatter of points symmetrically distributed around the horizontal axis. The overall vertical dispersion of the scatter of points reflects how closely the two measures agree. It is common to display horizontal lines showing the limits corresponding to ±2 S.D. since these lines bracket ~95% of the scatter of points.
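The quantities behind such a plot can be computed as in the following Python sketch. It is a minimal illustration, not the analysis code used in the study. The function name and example scores are hypothetical, and taking the difference as film minus digitized is an assumption about the sign convention.

```python
from statistics import mean, stdev


def bland_altman(film_scores, digitized_scores):
    """Return the plotted points, the mean difference and the 2 S.D. limits."""
    pairs = list(zip(film_scores, digitized_scores))
    diffs = [f - d for f, d in pairs]        # vertical axis (assumed film - digitized)
    means = [(f + d) / 2 for f, d in pairs]  # horizontal axis
    bias = mean(diffs)                       # systematic offset between methods
    sd = stdev(diffs)                        # sample S.D. of the differences
    return {
        "means": means,
        "diffs": diffs,
        "bias": bias,
        "lower": bias - 2 * sd,
        "upper": bias + 2 * sd,
    }


# Hypothetical scores for four sets of X-rays.
result = bland_altman([10, 20, 30, 40], [12, 19, 33, 40])
print(result["bias"], result["lower"], result["upper"])
```

Plotting `diffs` against `means` and drawing horizontal lines at `bias`, `lower` and `upper` gives the display described above.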
Another form of systematic bias can occur whereby the scatter of points is above the horizontal axis for low values and below for higher values (or vice versa). Linear regression was carried out and confidence limits on the slope of the regression line were calculated to examine any such trend [4].
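A trend check of this kind can be sketched as follows: regress the differences on the means and form a confidence interval for the slope. This is an illustrative Python sketch rather than the study's own procedure. The function name and data are hypothetical, and the interval uses a normal approximation where an exact analysis would use the t distribution with n - 2 degrees of freedom.

```python
from math import sqrt
from statistics import NormalDist


def slope_with_ci(x, y, level=0.95):
    """Least-squares slope of y on x with an approximate confidence interval."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residual_ss = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    std_err = sqrt(residual_ss / (n - 2) / sxx)
    # Assumption: normal quantile in place of the t quantile (close for large n).
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return slope, (slope - z * std_err, slope + z * std_err)


# Hypothetical Bland-Altman points: mean score (x) and score difference (y).
means = [5, 10, 20, 40, 80]
diffs = [1, -2, 0, -3, -1]
slope, (low, high) = slope_with_ci(means, diffs)
print(slope, low, high)
```

An interval that contains zero gives no evidence that the difference between methods changes with the size of the score.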
Results |
---|
The mean of the 100 scores obtained from Larsen scoring of the original X-ray film was 25 (median 16.5, interquartile range 6–39).
Five (5%) of the digitized sets of images were judged to be unreadable and hence were not scored. A further two (2%) of the sets were deemed to be of 'poor' quality, but were scored nonetheless. Seventy-six (76%) sets of images were judged to be 'excellent' and the remaining 17 (17%) were classed as 'fair'.
Of the 100 sets of hands and feet X-rays studied, it was possible to compare 95 pairs of Larsen scores obtained from hard copy film and from the digitized image. For these 95 pairs, Fig. 1 shows a scatter plot of the Larsen score assigned to the original X-ray film plotted against the Larsen score assigned to the digitized image. There is a high degree of correlation with an R of 0.97.
Also shown in Fig. 2 is the line obtained from the linear regression. The slope of the regression line is -0.033 (95% CI [-0.069, 0.004]). This indicates that the distribution of differences between the paired scores is comparable across the range of Larsen scores examined.
Discussion |
---|
This study has shown that the Larsen score taken from an electronic image is equivalent to that taken from the celluloid original. A recent US study [5] concluded that assessing digitized images gave results equivalent to those obtained from film originals, although this conclusion was based on the use of correlation analysis alone. The results presented here are of particular interest because of the relatively inexpensive and accessible technology used (the resolution of the images being poorer than that used for the US study [5] as a result) and the informative comparisons of the two sets of scores using Bland–Altman analysis.
In studies that use X-ray data as a measure of outcome in RA, electronic storage and subsequent scoring of the X-ray images is a valid method, and is likely to overcome the logistical problems of filing and retrieving multiple X-ray packets. Furthermore, it is likely that X-ray images will increasingly be presented in electronic form in routine clinical practice. As shown here, the ability to perform formal analysis of X-ray images will be retained. These advantages have to be balanced against the fact that some digitized sets of X-rays (5% in this study) may be unreadable.
References |
---|