Eur J Cardiothorac Surg 2001;20:1176-1182
© 2001 Elsevier Science NL
Cardiac Surgery Department, Gasthuisberg University Hospital, 3000 Leuven, Belgium
Received 25 September 2000; received in revised form 11 June 2001; accepted 24 September 2001.
Corresponding author. Tel.: +32-16-344339; fax: +32-16-344616
e-mail: paul.sergeant{at}uz.kuleuven.ac.be
| Abstract |
|---|
Key Words: Coronary artery bypass grafting; EuroSCORE; Scoring system; Validation; Quality control
| 1. Introduction |
|---|
Patients also demand the highest quality of care, and in addition they have received legal guarantees to be informed of their individual risk. Knowledge of this individual risk is also essential for the attending physician in correctly evaluating the appropriateness of the therapy. These partners in health care trust the scientific community to provide simple prediction systems that are stable for larger samples of patients but also for the individual patient.
We have previously studied the overall behaviour [1] and limitations of complex domain-specific prediction models in coronary surgery in early and late follow-up. The purpose of this manuscript is to study the overall and spectral behaviour of a simple but recent non-domain-specific prediction model of in-hospital mortality, the European system for cardiac operative risk evaluation (EuroSCORE) [2,3], in a large independent single-centre population of coronary surgery patients with a wide spectrum of risk.
| 2. Materials and methods |
|---|
2.4. Statistical analysis
The % difference between the predicted and observed hospital mortality was calculated using the formula: ((predicted deaths) − (observed deaths)) × 100 / (predicted deaths).
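As a minimal sketch, the formula can be written as a small Python function. The function name `percent_difference` is an illustrative choice, not part of the original analysis; the example inputs are the predicted and observed deaths reported in the Results for the extra-corporeal and off-pump procedures. A positive value means that more deaths were predicted than observed (an over score).

```python
def percent_difference(predicted_deaths: float, observed_deaths: float) -> float:
    """Return ((predicted) - (observed)) * 100 / (predicted)."""
    if predicted_deaths <= 0:
        # The formula is undefined when no deaths are predicted (e.g. a 0% risk category).
        raise ValueError("predicted_deaths must be positive")
    return (predicted_deaths - observed_deaths) * 100.0 / predicted_deaths


# Counts taken from the Results section of this article.
on_pump = percent_difference(81.6, 69)   # extra-corporeal procedures
off_pump = percent_difference(20, 12)    # off-pump procedures

print(f"extra-corporeal: {on_pump:.1f}%")
print(f"off-pump: {off_pump:.1f}%")
```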
The Fisher exact test was used for the contingency tables and logistic regression for the evaluation of the EuroSCORE value as an incremental predictor. The contingency tables were formed by comparing the number of predicted events and (N − predicted events) with the number of observed events and (N − observed events), N being the number of patients at risk.
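The contingency table described above can be illustrated with a short, self-contained Python sketch of a two-sided Fisher exact test. This is not the authors' software: the helper names, the rounding of non-integer predicted deaths to whole events, and the example counts (`N_HYPOTHETICAL`, 50.4 predicted, 38 observed) are all assumptions made purely for illustration.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell with fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    total = sum(p for p in map(prob, range(lowest, highest + 1))
                if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def prediction_table(n_patients: int, predicted_deaths: float, observed_deaths: int):
    """Build (predicted, N - predicted, observed, N - observed).

    Assumption: predicted deaths are rounded to whole events, because the
    exact test needs integer counts.
    """
    predicted = round(predicted_deaths)
    return (predicted, n_patients - predicted,
            observed_deaths, n_patients - observed_deaths)


# Hypothetical counts, not taken from the study dataset.
N_HYPOTHETICAL = 1000
p_value = fisher_exact_two_sided(*prediction_table(N_HYPOTHETICAL, 50.4, 38))
print(f"P = {p_value:.3f}")
```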
A receiver operating characteristic curve (ROC) [4,5] evaluated the predictive performance (discriminatory power) of the EuroSCORE. This was created by using each EuroSCORE (0–22) as a theoretical cut-off point to predict in-hospital mortality. The sensitivity and the specificity of the prediction were calculated for each EuroSCORE. The sensitivity was then plotted (Fig. 1) versus 100 − specificity and the points were interconnected. The area under this curve is a measure of the discriminatory power of the test.
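The construction just described can be sketched in a few lines of Python. The toy scores and outcomes below are invented for illustration, and two details are assumptions rather than statements from the text: a patient is counted as "predicted to die" when the score is at or above the cut-off, and the area is obtained with the trapezoidal rule after adding the (0, 0) and (100, 100) end points.

```python
def roc_points(scores, died, cutoffs):
    """Return (100 - specificity, sensitivity) pairs in percent, one per cut-off."""
    n_dead = sum(1 for d in died if d)
    n_alive = len(died) - n_dead
    points = []
    for cut in cutoffs:
        true_pos = sum(1 for s, d in zip(scores, died) if d and s >= cut)
        false_pos = sum(1 for s, d in zip(scores, died) if not d and s >= cut)
        sensitivity = 100.0 * true_pos / n_dead
        one_minus_specificity = 100.0 * false_pos / n_alive
        points.append((one_minus_specificity, sensitivity))
    return points


def area_under_curve(points):
    """Trapezoidal area under the interconnected points, scaled to the range 0-1."""
    pts = sorted(set(points) | {(0.0, 0.0), (100.0, 100.0)})
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area / 10000.0


# Hypothetical toy data: additive score and in-hospital death (1) or survival (0).
toy_scores = [0, 1, 2, 2, 3, 5, 6, 8, 9, 12]
toy_died = [0, 0, 0, 0, 0, 0, 1, 0, 1, 1]

curve = roc_points(toy_scores, toy_died, cutoffs=range(0, 23))
auc = area_under_curve(curve)
print(f"area under the ROC curve: {auc:.3f}")
```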
After the first step of the spectral analysis, the evaluation of every single risk category (Table 2 and Fig. 2), the patients were grouped (Table 3) in a second step of data exploration. In the third step a cumulative approach was tested. A gradually increasing sample was created, starting with the 0% predicted risk category and adding one additional risk category at each step. Each of these samples was considered a separate sample, and for each sample the difference between predicted and observed mortality was calculated (Fig. 3).
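The cumulative step can be sketched as follows. The per-category counts are hypothetical and the function name is illustrative; the difference uses the same % difference formula as the overall comparison. Because the 0% risk category predicts no deaths, the sketch returns `None` while the cumulative number of predicted deaths is still zero.

```python
def cumulative_differences(categories):
    """Stepwise cumulative % difference between predicted and observed deaths.

    categories: iterable of (score, predicted_deaths, observed_deaths).
    Returns a list of (highest score included, % difference or None).
    """
    results = []
    cum_predicted = 0.0
    cum_observed = 0.0
    for score, predicted, observed in sorted(categories):
        # Each step adds one more risk category to the growing sample.
        cum_predicted += predicted
        cum_observed += observed
        if cum_predicted > 0:
            diff = (cum_predicted - cum_observed) * 100.0 / cum_predicted
        else:
            diff = None  # undefined while no deaths are predicted
        results.append((score, diff))
    return results


# Hypothetical counts per risk category, not taken from Table 2.
toy_categories = [(0, 0.0, 0), (1, 2.0, 1), (2, 4.0, 1), (3, 6.0, 4)]

steps = cumulative_differences(toy_categories)
for score, diff in steps:
    label = "n/a" if diff is None else f"{diff:.1f}%"
    print(f"scores 0 to {score}: {label}")
```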
| 3. Results |
|---|
The EuroSCORE predicted 101.8 deaths (5.0%). The Fisher test P value of the total number of observed versus expected hospital deaths was 0.14. With the extra-corporeal procedures 69 deaths were observed versus 81.6 predicted (P=0.32) and with the off-pump procedures 12 deaths were observed versus 20 predicted (P=0.21).
Logistic regression identified the EuroSCORE risk prediction as an incremental predictor (P<0.0001), but it explained only 20% of the variability.
The EuroSCORE created an area under the ROC curve of 0.83±0.03 for the complete dataset (0.81±0.03 for the extra-corporeal procedures). The highest discriminative accuracy in predicting in-hospital death for the complete dataset was obtained with 8% EuroSCORE risk (64% sensitivity and 87% specificity).
The spectral analysis is performed in a stepwise manner: individual, grouped and then cumulative.
Table 2 and Fig. 2 are the first steps in the spectral analysis and present the number of patients at risk, the number of predicted deaths, the number of observed deaths and the % observed mortality for every EuroSCORE risk value.
The small number of events in the low-risk groups and the small sample sizes in the high-risk groups impose statistical limitations. These are avoided by grouping the patients into three larger samples in Table 3. This exploration identified an over score in the EuroSCORE range 0–8 (57%, P<0.0001). The prediction is correct in the pivotal range 9–11 (-2%, P=1) and turns into an under score in the range 12–22 (-133%, P=0.003).
The cumulative approach is the third step of the spectral analysis. It simulates the predictive accuracy of the EuroSCORE in samples increasing in risk and sample size in a stepwise manner (Fig. 3). The plot demonstrates that the considerable over score in the lower risk domains is gradually reduced by bringing additional risk into the analysis.
| 4. Discussion |
|---|
4.2. Strengths and weaknesses of the study sample
A validation dataset can have limitations in several domains: sample size, origins (single- or multi-institutional), sample population, sample timeframe and finally sample spectrum.
The size of the study dataset is adequate for overall validation and for ROC analysis (50 events needed). The study sample size exceeds the original validation sample (n=1497) of the EuroSCORE system. The size of the study set is, however, inadequate for the analysis of every single calculated value, owing to the limited number of events in the low-risk patients and an insufficient number of patients at risk in the high-risk domain. This becomes apparent in Fig. 2 with the 95% confidence limits of the observations. This limitation has been partly solved by grouping the risk domains and by the final stepwise approach. The total sample certainly exceeds the annual coronary artery bypass grafting (CABG) volume of nearly any cardiothoracic centre, and thereby simulates the number of patients a centre would submit to a regional quality control centre using the EuroSCORE as a tool for correction of patient variability.
This single institution dataset carries the risk that any inference can be reduced to a centre effect but avoids inter-institutional strata in the analysis, thereby possibly limiting the applicability of the inferences.
The dataset starts in January 1997 and includes all patients operated on in a timeframe similar to that of the EuroSCORE dataset (September to November 1995).
One of the weaknesses of the study sample is the inclusion of only CABG patients in the analysis, but CABG patients represented 63% of the EuroSCORE dataset [7]. An unstable prediction in this domain would exclude stable predictions in aggregated adult cardiac surgery datasets, including patients with rare diseases and large risk at surgery. This dataset is a consecutive departmental series and therefore does not carry the bias possibly associated with patient selection.
The current study dataset with an average value of 5.0% risk identifies this dataset as a higher risk dataset of consecutive patients. The national average [8] of the EuroSCORE patients varied from 3.7% risk for the 4277 German patients to 4.7% risk for the 2422 Spanish patients. The high risk within the study dataset is confirmed when compared with the CABG subset within the original EuroSCORE dataset, where the average EuroSCORE value was 3.3% risk. The presence of these high-risk patients, including those in cardiac arrest at the start of surgery, is an additional challenge for the scoring system. The EuroSCORE publications are unclear about the numbers of patients at risk in these highest risk categories.
4.3. Strengths of the EuroSCORE
The EuroSCORE dataset is a cross-section of contemporary European cardiac surgery and the applicability of the system in most European countries should be one of its strengths. The definitions are well-described and the selected variables are not influenced by surgical technique. The simplicity of the calculation, on paper or via an information technology tool, is one of the biggest assets of this system.
The EuroSCORE is an open system; its coefficients are in the public domain. A risk-adjustment system with undisclosed coefficients (e.g. the Society of Thoracic Surgeons system) is unacceptable.
4.4. Weaknesses of the EuroSCORE
The first limitation that becomes obvious is the limitation of the spectrum. A 0% risk prediction is unrealistic in any medical intervention. The upper limitation of the prediction to 22% risk prediction for the highest risk patients is similarly unrealistic. An average 30 day mortality of 35% [9] was observed in a previous study for all patients in cardiopulmonary resuscitation undergoing CABG. The number of patients with high operative risk in the EuroSCORE dataset is unavailable. The information about the distribution of risk in the original EuroSCORE dataset is limited to the statement that 29% of the developmental dataset had a score above or equal to 6. The low risk of the original EuroSCORE dataset expressed by the low average EuroSCORE value but also by the standard deviation of only 2.5, versus our own standard deviation of 4, is an additional indication.
The overall comparison gives an impression of good overall prediction, but the cumulative approach (Fig. 3) indicates that the overall prediction would have been statistically incorrect if the study dataset had been limited to patients with a risk under 11%. Additional proof can be found in the observation that only 20% of the total variability in hospital mortality is explained by the EuroSCORE.
The ROC test used gives information about the discriminatory power, thereby indicating the relation between the scale and the identification of the survivors from the non-survivors, but gives no information about the actual calculated individual values. The obtained ROC value of 0.83 is higher than the 0.76 obtained in the original validation of the EuroSCORE system for the validation dataset. The ROC analysis was also performed with the variable age instead of the EuroSCORE. The ROC value was 0.64 for this single variable. The obtained ROC value of 0.83 is only borderline adequate since it is only similar to the discriminatory power of the prediction of rain [10,11] in major weather datasets. This seems a low level of discriminatory power but it might not be. It could be an observation that predicting death after cardiac surgery is as uncertain as predicting rain. We have intentionally not compared the EuroSCORE with any of the other existing (outdated?) scoring systems, and wanted to evaluate this as a stand-alone project.
The maximum obtainable sensitivity and specificity (64% sensitivity and 87% specificity) imply that the EuroSCORE will misclassify many patients, even at peak performance. This is essential information for any quality control audit.
The most important limitation of the EuroSCORE lies in the over score in the low-risk and the under score in the very high-risk domains, with the pivot at the 12% margin. Such over scoring in low-risk patients and under scoring in high-risk patients is frequently observed in scores based on logistic regression methodologies, in order to prevent distortion of the model in the medium risk groups.
4.5. Use of the EuroSCORE in quality control and informed consent
Quality of care is the optimal balance between the costs and benefits of a procedure. The early peri-procedural timeframe extends for several months, covers only one aspect of the medical cost and is therefore an incomplete interval for the evaluation of quality of care. The EuroSCORE scores only the mortality during the hospital stay [12], which is a short part of the early peri-procedural timeframe and therefore an even less appropriate interval.
The possibility of using the EuroSCORE as a risk-severity score of datasets for CUSUM [13] (cumulative sum) plotting is obvious. The CRAM [14] (cumulative risk-adjusted mortality) plotting technique credits or debits a surgeon or centre according to the predicted risk of the operation. Most of the scoring systems can be used, but the system chosen needs to be accurate for the complete spectrum of patients. If the EuroSCORE is used, the observed under scoring of the hospital mortality for the high-risk patients will penalize centres or surgeons more than is appropriate when a high-risk patient dies.
The neutralization of all scores above a certain value (e.g. 11 or 12) in quality control systems such as the CRAM might be advisable. The higher risk patients would thereby not influence the direction of these plots. Fig. 4 shows a CRAM plot of the study population with (CCRAM or corrected cumulative risk plot) and without (CRAM) a neutralization of the EuroSCORE. The CCRAM plot does not penalize the analyzed institution for the very high-risk patients and shows from the first patients a positive performance versus the scoring system. A certain prerequisite of CRAM analysis, whether or not using the EuroSCORE, is clearly an analysis of the spectrum and distribution of risk in the studied dataset.
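A minimal sketch of the two plots, under stated assumptions, is given below. It assumes that the additive EuroSCORE value is read directly as the predicted % mortality, that each patient contributes (predicted risk minus outcome) to the running sum, and that "neutralization" means a patient at or above the chosen score contributes nothing. The article does not specify the exact mechanics, so the function name `cram_curve`, the `neutralize_from` parameter and the toy data are all hypothetical.

```python
def cram_curve(scores, died, neutralize_from=None):
    """Running sum of (predicted risk - outcome) over consecutive patients.

    scores: additive score per patient, read here as predicted % mortality.
    died: 1 if the patient died in hospital, 0 otherwise.
    neutralize_from: scores at or above this value leave the curve unchanged.
    """
    total = 0.0
    curve = []
    for score, dead in zip(scores, died):
        neutralized = neutralize_from is not None and score >= neutralize_from
        if not neutralized:
            risk = score / 100.0
            # A survivor credits the centre with the predicted risk;
            # a death debits it with (1 - predicted risk).
            total += risk - (1.0 if dead else 0.0)
        curve.append(total)
    return curve


# Hypothetical consecutive patients, not taken from the study population.
toy_scores = [2, 5, 14, 3]
toy_died = [0, 0, 1, 0]

cram = cram_curve(toy_scores, toy_died)
ccram = cram_curve(toy_scores, toy_died, neutralize_from=12)
print("CRAM: ", [round(v, 2) for v in cram])
print("CCRAM:", [round(v, 2) for v in ccram])
```

In this toy series the single high-risk death pulls the uncorrected curve below zero, whereas the corrected curve ignores that patient and keeps rising.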
4.6. Conclusion
The EuroSCORE is an easy-to-use scoring system including most of the usual risk factors. The EuroSCORE allows a common European language in cardiac surgical risk-adjustment, but any use of a language should be accompanied by knowledge of its descriptive limitations. The overall acceptable prediction of the hospital mortality using the EuroSCORE is the consequence of an overestimation of the low risk and an underestimation of the high risk. A centre effect can be the cause of the considerable overestimation of the lower risk. This centre effect is part of the variability in outcome describing the quality of delivered care. In this particular analysis it is difficult to reduce the variability between observed and predicted mortality to a centre effect alone, since this would indicate an over performance of the centre in the low-risk and an under performance of that same centre in the high-risk populations.
Some quality control systems such as the CUSUM and CRAM plots using this scoring system might require adaptations for the higher risk patients. Similar studies are needed before a final evaluation of the EuroSCORE can be made, but a critical warning in the high-risk domain is appropriate. At this time, we have adopted the EuroSCORE for most of our quality control systems.
| References |
|---|