Eur J Cardiothorac Surg 2006;29:431-433
© 2006 Elsevier Science NL
a Clinical Operational Research Unit, University College London, UK
b University Hospital Birmingham, UK
c Guy's and St. Thomas's Hospitals Medical School, London, UK
Received 7 October 2005; received in revised form 19 December 2005; accepted 21 December 2005.
* Corresponding author. Fax: +44 207 813 2814. (Email: s.gallivan{at}ucl.ac.uk).
| Abstract |
|---|
Key Words: Risk models; Audit; Cardiac surgery; Mortality
| 1. Introduction |
|---|
In assessing the merits of a scoring system, it is important that it should be seen to give reasonably good predictions across the whole spectrum of cases that are typically encountered. If there is systematic under- or overestimation of risk for any part of the risk spectrum, this potentially degrades the audit process and indeed may indirectly promote poor practice whereby cases are avoided if the notional risk score falls short of the truth.
Here we describe a simple graphical method that can be used in the assessment of risk scoring systems. We apply it to the additive version of the EuroSCORE, highlighting systematic biases.
| 2. Methods |
|---|
There is an irksome complication. In additive EuroSCORE, more than 95% of cases in a typical case mix are scored in integers from 0 to 10, so tied scores are frequent. This poses the problem of how to represent the cumulative plot of actual deaths. The ordering chosen for large clusters of cases of equal risk could give a chart with upward steps in different places, and the visual impression of goodness of fit depends crucially on where these steps occur. This raises the possibility of accidental (or deliberate) bias whereby cases might be ordered in such a way as to give a misleading impression of good or bad fit.
We resolve this problem by assuming that all possible orderings of cases with equal predicted risk are equally valid. Under this assumption, an unbiased representation of the outcome data is given by simply taking the average of all possible orderings. To describe this process, we have coined the acronym MADCAP (Mean Adjusted Deaths Compared Against Predictions). Even with moderately sized data sets, there may be numerous patients with tied risk scores and thus an alarmingly high number of different ways of ordering the cases. Fortunately, a simple method is sufficient to resolve this difficulty. Using straight lines in the cumulative graph of actual mortality gives an unbiased representation of deaths within groups of patients with tied risk scores (see Appendix A).
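The averaging described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function name `madcap_points`, the choice of plotting against case rank, and the seven-case data set are all assumptions made for the example. Predicted risks are taken to be probabilities (for the additive score, the score divided by 100).

```python
from itertools import groupby


def madcap_points(risks, died):
    """Cumulative predicted and mean observed deaths, cases ordered by risk.

    risks: predicted probability of death for each case (assumed input form)
    died:  1 if the patient died, 0 otherwise
    Returns three lists of equal length: case rank, cumulative predicted
    deaths, and cumulative observed deaths averaged over all orderings
    of tied cases.
    """
    cases = sorted(zip(risks, died), key=lambda c: c[0])
    xs, predicted, observed = [0], [0.0], [0.0]
    for risk, group in groupby(cases, key=lambda c: c[0]):
        outcomes = [d for _, d in group]
        n, deaths = len(outcomes), sum(outcomes)
        base_x, base_pred, base_obs = xs[-1], predicted[-1], observed[-1]
        for i in range(1, n + 1):
            xs.append(base_x + i)
            predicted.append(base_pred + i * risk)
            # Straight line across the tied group: D/N deaths per case.
            observed.append(base_obs + i * deaths / n)
    return xs, predicted, observed


# Hypothetical data: seven cases with tied scores, two deaths.
risks = [0.00, 0.01, 0.01, 0.02, 0.02, 0.02, 0.02]
died = [0, 0, 1, 0, 1, 0, 0]
xs, predicted, observed = madcap_points(risks, died)
```

Plotting `predicted` and `observed` against `xs` would give the two curves to be compared. Within each group of tied scores the observed curve is a straight segment, so the chart does not depend on the arbitrary order in which tied cases happen to be listed.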
To illustrate the use of the MADCAP method, we use data prospectively collected in two large units in England, comprising a case-by-case record of EuroSCOREs and outcome (in-hospital mortality). The data sets were anonymised for both patients and surgeons and were merged for analysis.
| 3. Results |
|---|
| 4. Discussion |
|---|
The technique is intended to be part of the research methodology of those engaged in the development or evaluation of risk models rather than part of day-to-day clinical practice. However, as such risk models are becoming an increasingly common part of audit and communication with patients, it is important that all clinicians have an appreciation of how accurate such risk models are and of their strengths and weaknesses.
With regard to the data presented in Figs. 1 and 2, it would seem that there are systematic discrepancies between the actual mortality amongst the cases studied and that expected according to the additive EuroSCORE risk model. Such discrepancies are not exposed by use of the ROC curve (see for example [7]).
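One way to see why ROC analysis cannot expose such discrepancies is that the area under the ROC curve depends only on how cases are ranked, not on the size of the predicted risks. The sketch below is illustrative only: the function `auc` and the eight-case data set are hypothetical, and the scores are read as percentage risks purely for the sake of the example.

```python
def auc(scores, died):
    """Area under the ROC curve, computed as the probability that a randomly
    chosen death has a higher score than a randomly chosen survivor
    (ties count one half)."""
    pos = [s for s, d in zip(scores, died) if d == 1]
    neg = [s for s, d in zip(scores, died) if d == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical scores (read as % predicted mortality) and outcomes.
scores = [0, 1, 2, 3, 5, 8, 8, 11]
died = [0, 0, 0, 1, 0, 1, 0, 1]

# Halving every prediction halves the expected number of deaths...
halved = [s / 2 for s in scores]
expected_deaths = sum(scores) / 100
expected_deaths_halved = sum(halved) / 100

# ...but leaves the ranking, and therefore the AUC, unchanged.
auc_original = auc(scores, died)
auc_halved = auc(halved, died)
```

Here `auc_original` and `auc_halved` are identical even though the two sets of predictions imply very different numbers of expected deaths, which is the kind of discrepancy a cumulative chart of predicted against actual deaths is meant to reveal.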
Fig. 1 shows that mortality was lower than predicted overall. Fig. 2 shows that mortality was greater than predicted amongst low-risk cases (0% and 1%) and perhaps also amongst high-risk patients (≥7%).
It seems self-evident that prediction of 0% mortality provides an unrealistic quality target. In this range, average mortality is always likely to be higher than the prediction since it cannot be below it. The same argument could be made for all predictions ≤1% and yet 20% of the cases are scored at 0% or 1% risk. Loading the practice of beginners, for example, with these cases and then comparing their risk-adjusted results with those of surgeons with a wider case mix could suggest apparently less good results.
Apart from the very low-risk cases, the score systematically favours risk-avoiding behaviour, as the risk model overestimates mortality for predictions of 2–6% but not at 7% and above (Fig. 2). This has been noted previously [8] but in an analysis depending on already grouped data sets. This failure to reflect increasing risk accurately is unfair on surgeons taking on the very patients whose lives are most at risk from the natural history of the disease, since most elements in the perioperative risk score are also markers for risk of death without surgery. These are the patients who have the most to gain from heart surgery because it is where surgery makes the biggest difference, and a scoring system that rewards risk-averse case selection is not in the patients' interests. If we are to use risk-adjusted death rates as a comparative index of performance, it is essential that we continually review the performance of the risk adjustment method. The MADCAP charts presented in this paper provide a useful tool to this end.
The technique described in this paper is designed purely to detect systematic flaws in a risk model. For instance, had the method we suggest been used when the additive EuroSCORE was developed, the systematic bias that is a feature of this scoring system would have been apparent. To make recommendations regarding how to improve a risk model, if indeed it is judged that improvements are required, is beyond the scope of our current work. That said, one option would be to use the discrepancies highlighted by the use of MADCAP to adjust the risk score. This would have the advantage of making the model better, according to the criteria of the visual appearance of the MADCAP chart, but may well have disadvantages in terms of other criteria. A MADCAP chart on its own is not sufficient to judge whether a risk model is good or bad. The acid test is to consider what uses the risk model will have and whether it is fit for these purposes.
| Appendix A |
|---|
Suppose one is concerned with a data set with clusters of cases, the operations within each cluster having a tied risk score. For different permutations of the order of cases, a different cumulative mortality graph might result. It will be shown that taking the mean of all such cumulative mortalities, over all possible permutations of cases in the cluster, gives a linearly increasing function. In view of this, amalgamating such cumulative plots over all the clusters gives a piecewise linear graph expressing the mean cumulative mortality for the whole data set. This is the basis of the MADCAP charting method.
The proof of this is an example where the intuitively obvious (linear interpolation) is troublesome to justify mathematically.
Consider a single cluster of N cases with tied risk score for which there have been D deaths.
There are N! possible permutations of the cases, each of which could be used to construct a cumulative chart of mortality. Although requiring some thought, it can be seen that for 1 ≤ i ≤ N, the proportion of permutations within the cluster that have a death assigned to the ith case is D/N and for these, the cumulative mortality increases by 1, while there is no increase for other permutations. Thus at the ith step, the mean increase in cumulative mortality is D/N, 1 ≤ i ≤ N.
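For a small cluster, the argument above can be checked by brute force. The following sketch is not part of the original proof: the function name `mean_cumulative_deaths` and the five-case cluster are hypothetical. It enumerates all N! orderings and confirms that the mean cumulative mortality after i cases is i·D/N.

```python
from fractions import Fraction
from itertools import permutations


def mean_cumulative_deaths(outcomes):
    """Mean cumulative deaths after each case, averaged over every ordering
    of the cluster (positions are treated as distinct, giving N! orderings)."""
    n = len(outcomes)
    totals = [0] * n
    count = 0
    for order in permutations(outcomes):
        running = 0
        for i, d in enumerate(order):
            running += d
            totals[i] += running
        count += 1
    return [Fraction(t, count) for t in totals]


# Hypothetical cluster: N = 5 cases with a tied risk score, D = 2 deaths.
cluster = [1, 0, 0, 1, 0]
means = mean_cumulative_deaths(cluster)

# Linear interpolation predicts i * D / N after the ith case.
n_cases, n_deaths = len(cluster), sum(cluster)
linear = [Fraction(i * n_deaths, n_cases) for i in range(1, n_cases + 1)]
assert means == linear
```

Exact fractions are used so that the comparison with the straight line is not affected by rounding. Full enumeration is only practical for very small clusters, which is why the linear shortcut matters for real data sets.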
| Acknowledgments |
|---|
| References |
|---|