Reproducibility of the Banff schema in reporting protocol biopsies of stable renal allografts

James Gough1, David Rush2, John Jeffery2, Peter Nickerson2, Rachel McKenna2, Kim Solez3 and Kiril Trpkov3

1 Department of Pathology and 2 Department of Nephrology, University of Manitoba Health Sciences Centre, Winnipeg, Manitoba, Canada and 3 Department of Laboratory Medicine and Pathology, University of Alberta, 5B4.02 WC MacKenzie Health Sciences Centre, Edmonton, Alberta, Canada



   Abstract
 Top
 Abstract
 Introduction
 Subjects and methods
 Results
 Discussion
 References
 
Background. There is evidence that biopsy of stable renal allografts may be of value in predicting chronic allograft nephropathy, the main cause of graft loss. However, the reproducibility of such histological evaluation has not been tested in this setting. We tested the reproducibility of the Banff schema for this purpose.

Methods. We rated acute and chronic changes in 184 protocol biopsies. Individual pathologists at two different Canadian transplant centres reported independently.

Results. There was agreement in 73.53, 42.86, and 77.08% of cases in assigning a diagnosis of acute rejection, borderline changes (as defined in the schema), and no acute rejection, respectively. Applying kappa statistics, there was very good agreement in making the diagnosis of acute rejection vs no acute rejection (kappa 0.77). There was good inter-observer agreement in scoring glomerulitis, intimal arteritis, interstitial infiltrates, tubulitis, and arteriolar hyalinosis. Rating chronic changes also gave good inter-observer agreement (kappa=0.53, 0.65, and 0.62, respectively, for mild, moderate, and severe chronic allograft nephropathy). Agreement on transplant glomerulopathy was, however, poor.

Conclusions. We conclude that the Banff classification provides a reproducible method for the histological assessment of protocol renal allograft biopsies in stable grafts. Such biopsies may be valuable in detecting subclinical rejection and early chronic allograft nephropathy and may also be used as surrogate end-points in the evaluation of therapy to prevent the latter.

Keywords: Banff; inter-observer agreement; protocol renal biopsy; rejection



   Introduction
 
Protocol renal allograft biopsy is a potentially valuable diagnostic and research tool. Important applications of the technique include uncovering ‘subclinical rejection’ [1,2], the prediction of chronic allograft nephropathy [3], and the establishment of a surrogate end-point for therapeutic trials for preventing the latter [4]. The reproducibility of the histological interpretation of protocol biopsies is thus of great importance but, to our knowledge, no published study has specifically examined this issue. In the present study we applied the Banff schema to a large number of protocol biopsies interpreted by pathologists at two large Canadian transplant centres.



   Subjects and methods
 
We examined all biopsies performed between April 1992 and April 1995 on 64 consecutive patients at one centre who had consented to take part in a clinical trial of the effects of treating ‘subclinical rejection’. The outcome of that trial is published elsewhere [5]. Biopsies were done according to a protocol at 1, 2, 3, 6, and 12 months after transplantation. Slides were prepared according to standard techniques. One hundred and eighty-four biopsies contained at least seven glomeruli and an artery, which is defined as adequate for interpretation [6]. Biopsies were scored according to the Banff schema [6] for evidence of acute and chronic changes and for arteriolar hyalinosis. The schema recognizes three types of acute cell-mediated rejection: tubulointerstitial, vascular, and severe vascular rejection, based on the extent of inflammation present in the interstitium, tubules (‘tubulitis’), and vessels. A ‘borderline’ category is also used. Acute antibody-mediated rejection is also categorized in the schema. Chronic changes are graded using a semi-quantitative score (0–3, normal to severe) for interstitial fibrosis and tubular atrophy. Sclerosing transplant arteriopathy and transplant glomerulopathy are graded separately as specific indicators of chronic rejection. Finally, there is a category for ‘other diagnoses’ recognized in the allograft.
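The adequacy criterion stated above (at least seven glomeruli and an artery) can be written as a simple check. The following is only an illustrative sketch: the function name, the constants, and the example counts are assumptions introduced here and are not part of the study.

```python
# Thresholds as stated in the text: at least seven glomeruli and one artery.
MIN_GLOMERULI = 7
MIN_ARTERIES = 1


def is_adequate(glomeruli, arteries):
    """Return True if a biopsy meets the adequacy criterion used in this study."""
    return glomeruli >= MIN_GLOMERULI and arteries >= MIN_ARTERIES


# Hypothetical biopsies, NOT study data: (glomeruli, arteries)
examples = [(12, 2), (7, 1), (6, 3), (15, 0)]
adequate_flags = [is_adequate(g, a) for g, a in examples]
```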

All biopsies were interpreted by pathologists at two transplant centres: by either one of two pathologists at centre A and by one pathologist at centre B. All of these individuals were blinded to each other's reports and to clinical information.

We applied kappa statistics to our results. This statistical technique has been widely used to study the reproducibility of histological diagnoses, especially those using a grading or scoring system [7].
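For readers unfamiliar with the statistic, kappa compares the observed proportion of agreement with the proportion expected by chance from each observer's marginal totals. The sketch below shows the standard unweighted (Cohen's) form for two observers; the function name is ours and the counts are hypothetical, chosen only to make the arithmetic easy to follow. They are not the study data.

```python
def cohen_kappa(table):
    """Unweighted Cohen's kappa from a square table of counts.

    table[i][j] is the number of biopsies placed in category i by
    centre A and in category j by centre B.
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    # Observed agreement: proportion of biopsies on the diagonal.
    p_observed = sum(table[i][i] for i in range(k)) / n
    # Chance agreement: product of the two observers' marginal proportions.
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    p_expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / (n * n)
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical 2 x 2 table, NOT study data.
example = [[40, 10],
           [5, 45]]
kappa = cohen_kappa(example)  # observed 0.85, expected 0.50, kappa 0.70
```

A kappa of 1 indicates perfect agreement and a kappa of 0 indicates agreement no better than chance.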



   Results
 
Table 1 shows the level of agreement between the two centres in assigning one of three histological diagnoses to each biopsy, namely ‘acute rejection’, ‘no acute rejection’, and ‘borderline changes’.


 
Table 1.  Inter-observer agreement between two centres in the assignment of three histological diagnoses to protocol renal transplant biopsies

 
The kappa value for the diagnosis of acute rejection vs no acute rejection (the latter including the ‘borderline’ category) was 0.77.
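The two-category comparison above requires merging the ‘borderline’ category with ‘no acute rejection’ before kappa is calculated. A minimal sketch of that collapsing step follows; the category order (0 = no acute rejection, 1 = borderline, 2 = acute rejection), the function name, and the counts are assumptions made for illustration and do not reproduce Table 1.

```python
def collapse_borderline(table3):
    """Collapse a 3 x 3 agreement table into a 2 x 2 table.

    Categories 0 (no acute rejection) and 1 (borderline) are merged;
    category 2 (acute rejection) is kept on its own.
    """
    groups = [[0, 1], [2]]
    return [
        [sum(table3[i][j] for i in rows for j in cols) for cols in groups]
        for rows in groups
    ]


# Hypothetical counts, NOT study data (rows: centre A, columns: centre B).
hypothetical = [[50, 10, 2],
                [8, 20, 5],
                [1, 4, 60]]
two_by_two = collapse_borderline(hypothetical)
```

The resulting 2 x 2 table can then be passed to any two-category kappa calculation.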

Taking these three diagnoses (no acute rejection, borderline, and acute rejection) as a scale from 0 to 2, the value of kappa in assigning them to a biopsy was 0.69 with 95% confidence limits of 0.60–0.79 (see Table 2). As can be deduced from Table 2, the prevalence of the three categories (agreed upon by both centres) ‘no acute rejection’, ‘borderline’, and ‘acute rejection’ in the 184 biopsies was 27, 13 and 40%, respectively.
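When categories form an ordered scale, one common approach is a weighted kappa, which gives partial credit to disagreements of one step. The text does not state which weighting, if any, was applied here, so the sketch below, using linear weights, is an assumption offered only to illustrate the idea. The function name and counts are hypothetical and are not the data of Table 2.

```python
def linear_weighted_kappa(table):
    """Linearly weighted kappa for ordered categories 0..k-1.

    table[i][j] is the number of biopsies scored i by centre A
    and j by centre B.
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]

    def weight(i, j):
        # Full credit on the diagonal, partial credit one step apart,
        # no credit for the largest possible disagreement.
        return 1 - abs(i - j) / (k - 1)

    p_observed = sum(weight(i, j) * table[i][j]
                     for i in range(k) for j in range(k)) / n
    p_expected = sum(weight(i, j) * row_totals[i] * col_totals[j]
                     for i in range(k) for j in range(k)) / (n * n)
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical counts, NOT study data.
hypothetical = [[50, 10, 2],
                [8, 20, 5],
                [1, 4, 60]]
kappa_w = linear_weighted_kappa(hypothetical)
perfect = linear_weighted_kappa([[10, 0, 0], [0, 10, 0], [0, 0, 10]])
```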


 
Table 2.  Inter-observer agreement between two centres in assigning three histological diagnoses to 184 protocol renal allograft biopsies

 
Table 3 shows the kappa values for inter-observer agreement in scoring glomerulitis, intimal arteritis, interstitial infiltrates, and tubulitis from 0 to 3. Glomerulitis and intimal arteritis were uncommon (10 and seven biopsies, respectively).


 
Table 3.  Kappa values obtained from inter-observer agreement data in rating each of the four items used to determine the presence or absence of acute rejection in the Banff schema

 
Although the histological diagnosis alone defined rejection in these cases, in 17 of the 186 biopsies (9.3%), a clinical diagnosis of rejection was made at the time of protocol biopsy. All of these were reported as acute rejection in the biopsy.

Unlike acute rejection, chronic graft injury is not well defined either clinically or histologically. In the Banff schema the histological lesions of chronic graft damage are rated according to the severity of interstitial fibrosis and tubular atrophy. Transplant glomerulopathy and sclerosing transplant vasculopathy are taken as indicators of chronic rejection. Table 4 shows kappa values for these four items as rated by the two centres.


 
Table 4.  Kappa scores obtained for inter-observer agreement in rating each of the four items used to determine the presence and degree of chronic rejection

 
The kappa values for mild, moderate, and severe chronic allograft nephropathy were 0.53, 0.65 and 0.62, respectively (with 95% confidence intervals of 0.50–0.69). Arteriolar hyalinosis is scored from 0 (no arteriolar hyalinosis) to 3 (severe arteriolar hyalinosis) in the Banff system. Inter-observer agreement for hyalinosis yielded a kappa value of 0.63 (with 95% confidence intervals of 0.58–0.73).



   Discussion
 
Our study shows that the Banff system offers a reproducible histological diagnosis of acute rejection in stable renal allograft patients and we have shown elsewhere that this finding is clinically significant for graft function at 2 years [5]. There is a greater discrepancy in assigning the ‘borderline’ category but the practical consequences of this were minimal insofar as centre B tended to call ‘borderline’ what centre A called ‘no acute rejection’. In only one case was the ‘borderline’ diagnosis of centre B called ‘acute rejection’ by centre A. While Saad et al. [8] have shown that the ‘borderline’ category may be important in diagnostic graft biopsies, they have also pointed out that this conclusion cannot be extrapolated to protocol biopsies. Nevertheless, the significance of borderline changes in protocol biopsies must be further evaluated.

A kappa value greater than 0.75 is believed to signal excellent agreement in diagnosis [7]. Concordance for the biopsy diagnosis of invasive carcinoma of the cervix, for example, is reported by one group as giving a kappa value of 0.832 [9].

There have been few studies of inter-observer agreement in the biopsy diagnosis of renal allograft rejection. One such report showed that knowledge of the clinical findings greatly influenced biopsy interpretation [10]. Marcussen et al. [11] examined the reproducibility of the Banff schema in 77 biopsies, mostly clinically indicated, from 46 patients. They found a kappa value of 0.56 (compared with our figure of 0.77) for inter-observer agreement in the diagnosis of acute rejection vs no acute rejection (the latter including the borderline category).

Recently, Colvin et al. [12] reported a kappa value of 0.80 in a study of the reproducibility of histological criteria for acute rejection devised by the Co-operative Clinical Trials in Transplantation (CCTT) group. More than 90% of these biopsies were performed for diagnostic rather than protocol purposes. The CCTT schema does not evaluate chronic changes.

In evaluating chronic allograft nephropathy, our kappa values were 0.53, 0.65, and 0.62 for mild, moderate, and severe chronic allograft nephropathy, respectively. While not as good as those for acute rejection, these figures are comparable with those achieved in assessing inter-observer agreement, for example, in the grading of breast carcinoma [13].

Arteriolar hyalinosis is an important finding in graft biopsies. It may signify donor disease, chronic rejection, or cyclosporin toxicity. We found very good inter-observer agreement in grading this parameter (kappa=0.63).

Our kappa value (0.14) for ‘cg’ (transplant glomerulopathy) is poor. The presence and extent of splitting of the glomerular basement membrane (GBM) is the principal marker for cg in the Banff schema [6]. Centre B tended to over-diagnose GBM splitting especially in the presence of glomerulosclerosis. Both centres agreed that no transplant glomerulopathy was present in 86.4% of cases. In a further 11.4% of cases the scores for cg differed by only 1. The inappropriate rating of cg was acknowledged by centre B.

Although glomerulosclerosis is not a specific indicator of transplant glomerulopathy, it may be the end-result of that condition. Whatever its aetiology, it may be an important prognostic finding in a transplant biopsy and for this reason we have incorporated glomerulosclerosis as a measure of chronic injury in other studies [5] and the Banff schema recommends that it be independently recorded in biopsy reports [6].

The overall results of this exercise in inter-observer comparison in rating protocol renal allograft biopsies are encouraging and show good agreement in assigning a diagnosis of acute rejection (kappa=0.77). While the kappa values for rating the individual parameters evaluated in arriving at this diagnosis are lower than this (ranging from 0.42 to 0.50), there is no ‘right’ kappa established for such parameters and our kappa values must await comparison with those of other studies.

In summary, we believe that the Banff schema provides a reproducible method of assessing protocol renal transplant biopsies, and that it facilitates reliable studies of the significance and long-term consequences of ‘subclinical rejection’ and the establishment of a surrogate end-point for chronic rejection.



   Notes
 
Correspondence and offprint requests to: James Gough, MD, Department of Pathology, Foothills Medical Centre, 1403 29th Street North West, Calgary, Alberta T2N 2T9, Canada. Email: gough@ucalgary.ca.



   References
 

  1. Rush DN, Jeffery JR, Gough J. Sequential protocol biopsies in renal transplant patients. Transplantation 1995; 59: 511
  2. Shapiro R, Randhawa P, Jordan ML et al. An analysis of early renal transplant protocol biopsies - the high incidence of subclinical tubulitis. Am J Transplant 2001; 1: 47–50
  3. Kirk AD, Jacobson LM, Heisey DM, Radke NF, Pirsch JD, Sollinger HW. Clinically stable human renal allografts contain histological and RNA-based findings that correlate with deteriorating graft function. Transplantation 1999; 68: 1578–1582
  4. Seron D, Moreso F, Ramon JM et al. Protocol renal allograft biopsies and the design of clinical trials aimed to prevent or treat chronic allograft nephropathy. Transplantation 2000; 69: 1849–1855
  5. Nickerson P, Jeffery J, Gough J et al. Identification of clinical and histopathologic risk factors for diminished renal function two years post-transplant. J Am Soc Nephrol 1998; 9: 482
  6. Racusen LC, Solez K, Colvin RB et al. The Banff 97 working classification of renal allograft pathology. Kidney Int 1999; 55: 713–723
  7. Silcocks PBS. Measuring repeatability and validity of histologic diagnosis. J Clin Pathol 1983; 36: 1269
  8. Saad R, Gritsch HA, Shapiro R et al. Clinical significance of renal allograft biopsies with ‘borderline changes’ as defined by the Banff Schema. Transplantation 1997; 64: 992–995
  9. Ismail SM, Colclough AB, Dinnen JS et al. Observer variation in histopathological diagnosis and grading of cervical intraepithelial neoplasia. Br Med J 1989; 298: 707
  10. Sorof JM, Vartanian RK, Olson SJ et al. Two cores are better than one in the diagnosis and management of renal allograft rejection. J Am Soc Nephrol 1995; 6: 1116 (Abstract)
  11. Marcussen N, Olsen TS, Benediktsson, Racusen L, Solez K. Reproducibility of the Banff classification of renal allograft pathology. Transplantation 1995; 60: 1083
  12. Colvin RB, Cohen AH, Salontz C et al. Evaluation of pathologic criteria for acute renal allograft rejection. J Am Soc Nephrol 1997; 8: 1930
  13. Reed W, Hannisdal E, Boehler PJ, Gundersen S, Host H, Marthin J. The prognostic value of p53 and c-erb B-2 immunostaining is overrated for patients with lymph node negative breast carcinoma: a multivariate analysis of prognostic factors in 613 patients with a follow-up of 14–30 years. Cancer 2000; 88: 804–813
Received for publication: 16.1.01
Accepted in revised form: 8.1.02