Eur J Cardiothorac Surg 2000;18:380-387
© 2000 Elsevier Science NL
a Health Services Research Unit, Carlos III Health Institute, Madrid, Spain
b RAND Europe, Leiden, The Netherlands
c Department of Medicine, University of Michigan, Ann Arbor, MI, USA
d Department of Health Management and Policy, University of Michigan, Ann Arbor, MI, USA
Received 18 February 2000; received in revised form 31 May 2000; accepted 28 June 2000.
Corresponding author. Unidad de Investigación en Servicios de Salud (UISS), Subdirección General de Epidemiología e Información Sanitaria, Instituto de Salud Carlos III, Calle Sinesio Delgado 6, 28029 Madrid, Spain. Tel.: +34-91-387-7803 extn. 2010; fax: +34-91-387-7896
e-mail: kfitch{at}isciii.es
| Abstract |
|---|
Key Words: Coronary revascularization; Coronary artery bypass grafting; Percutaneous transluminal coronary angioplasty; Appropriateness; Necessity
| 1. Introduction |
|---|
One approach to answering these kinds of questions is the RAND appropriateness method, which has been applied in a number of countries since the mid 1980s to obtain ratings of the appropriateness and necessity of various medical and surgical procedures. In Europe, appropriateness panels using this method to rate coronary revascularization procedures have been carried out in Spain [3], Sweden [4], Switzerland [5], The Netherlands [6] and the United Kingdom [7], while in North America, both the United States [8] and Canada [9] have held such panels. The RAND method is based on a review of the scientific literature and the work of an expert panel which rates the appropriateness, and sometimes the necessity, of a comprehensive list of indications for the procedure in question. Appropriateness criteria have most often been used in retrospective audits of patients who have undergone the procedure, to determine the proportion of those who received inappropriate procedures, that is, to measure the overuse of procedures. Necessity criteria, on the other hand, can be used to measure the underuse of procedures by applying them to patients who were potential candidates for the procedure, to identify those meeting necessity criteria who did not receive the procedure.
Until now, the criteria produced using this method have all come from single-country panels, on the theory that differences in values or clinical practice style make it advisable for each country to produce its own appropriateness criteria. In recent years, however, European countries have been moving toward ever greater political, economic and social integration, a trend that is likely to extend to the area of medical care as well. Just as the introduction of a common currency may lead to reduced economic disparities among member countries, so the development of common tools to measure the quality of medical care has the potential to help reduce clinical practice variations that are unrelated to the clinical characteristics of patients. Such considerations led us to examine the feasibility of holding an appropriateness panel made up of specialists from a number of different countries, which would have the added benefit of economies of scale in comparison to carrying out multiple national panels on the same subject. Thus, as part of a European Commission BIOMED Concerted Action on the appropriateness of medical and surgical procedures, we conducted a multinational European panel to develop criteria for the appropriateness and necessity of PTCA and CABG.
| 2. Methods |
|---|
Three working documents were produced for the panel process. First, the Swedish Council on Technology Assessment in Health Care (SBU) carried out a comprehensive review and synthesis of the findings of selected English language studies on the efficacy and risks of PTCA and CABG published between April 1993 and December 1997 [10]. This document supplemented earlier reviews of the literature by RAND and SBU, which were also available to the panel [11–13]. Second, the panel coordinators prepared a list of 400 detailed clinical scenarios, which described hypothetical patients who might be considered for coronary revascularization. Most of these scenarios were rated separately for the appropriateness of PTCA and for the appropriateness of CABG, giving a total of 740 ratings or indications. The 60 clinical scenarios describing patients with acute myocardial infarction (first 12 h) were rated only for the appropriateness of PTCA, because CABG typically cannot be performed during that time. The list of indications was based on previous lists used in different national-level appropriateness panels, but was reduced to focus on those scenarios that had been shown to represent substantial numbers of real patients when they were applied to patient populations in those countries. The indications were grouped into four chapters representing the primary clinical conditions presented by patients referred for revascularization: chronic stable angina, hospital admission for unstable angina, acute myocardial infarction (first 12 h) and post-myocardial infarction (>12 h–28 days). Each chapter was further subdivided by variables describing the extent of vessel disease, ejection fraction, stress test results, surgical risk and other factors.
An example of a specific clinical scenario is a patient with severe angina (class III/IV), who has 1- or 2-vessel disease with proximal left anterior descending (PLAD) involvement, a very positive stress test, a left ventricular ejection fraction (EF) between 30% and <50%, who is at high surgical risk. The scenarios specifically excluded patients who had received previous bypass surgery or who had intracoronary stents in place. The third panel document contained a precise definition of each term used in the list of indications, to assure that panelists had the same understanding of what constituted, for example, a very positive stress test or high surgical risk.
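The relationship between the 400 scenarios and the 740 indications reported above can be checked with a short calculation. The sketch below uses only the counts stated in the text; the variable names are illustrative and do not come from the study.

```python
# Counts as reported in the text; variable names are illustrative only.
total_scenarios = 400
ptca_only_scenarios = 60  # acute MI (first 12 h): rated for PTCA only

# The remaining scenarios were rated twice, once for PTCA and once for CABG.
dual_rated_scenarios = total_scenarios - ptca_only_scenarios
total_indications = 2 * dual_rated_scenarios + ptca_only_scenarios

print(total_indications)  # 740
```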
The synthesis of the evidence, list of indications and definitions were mailed to each panelist, with the request that they rate each clinical scenario for the appropriateness of PTCA and the appropriateness of CABG on a scale of 1–9, where 1 meant the procedure was highly inappropriate and 9 meant it was highly appropriate. An appropriate procedure was defined as one in which the expected health benefit (e.g. increased life expectancy, relief of pain, reduction in anxiety, improved functional capacity) exceeds the expected negative consequences (e.g. mortality, morbidity, anxiety, pain, time lost from work) by a sufficiently wide margin that the procedure is worth doing, exclusive of cost [12]. Panel members completed these first-round ratings independently, with no knowledge of the identity of their fellow panel members. The rating sheets were then returned to the project coordinators for data entry and analysis.
Following this first round of ratings, panelists were invited to a meeting in Madrid in December 1998 where they were able to see the results of the ratings (the frequency of panel responses together with their own rating for each indication) and discuss areas of confusion or disagreement. Thirteen of the 15 panelists attended the meeting: six interventional cardiologists (ICs), two non-interventional cardiologists (NICs) and five cardiovascular surgeons (CVSs). The meeting was conducted in English and led by a moderator experienced in applying the RAND appropriateness method. Based on an analysis of the distribution of ratings for each indication in the first round, the moderator focused the discussion on those areas where panelists seemed to be widely polarized in their appropriateness judgements. As with all panels using the appropriateness method, there was no attempt to force the panel to consensus, although the panelists were encouraged to support their judgements by citing the relevant scientific evidence. During the panel meeting, minor changes were made to the list of indications, with the result that the revised list consisted of 430 clinical scenarios and 842 indications. For example, indications that were originally grouped together for mild/moderate angina (class I/II) were split into two categories, one for angina class I and the other for angina class II. After the discussion of each chapter in the list of indications, panelists rated all the indications in that chapter a second time.
The final appropriateness criteria were based on the median panel rating and level of disagreement for each indication in the second round, using the following definitions: all indications with a median rating of 7–9, rated without disagreement, were classified as appropriate; those with a median rating of 1–3, rated without disagreement, were classified as inappropriate; and those with a median rating of 4–6, as well as all indications rated with disagreement, regardless of the median, were classified as uncertain. An indication was considered to be rated with disagreement when at least four panelists rated it in the 1–3 range, and at least four panelists rated it in the 7–9 range.
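The classification rule just described can be expressed as a small function. This is a minimal sketch, not the software used by the panel coordinators: the function names and sample ratings are hypothetical, and treating a half-point median (possible only with an even number of raters) as uncertain is an assumption of this sketch rather than something stated in the text.

```python
from statistics import median


def has_disagreement(ratings, threshold=4):
    """True when at least `threshold` raters score 1 to 3 AND at least
    `threshold` raters score 7 to 9 (the definition given in the text)."""
    low = sum(1 for r in ratings if 1 <= r <= 3)
    high = sum(1 for r in ratings if 7 <= r <= 9)
    return low >= threshold and high >= threshold


def classify_appropriateness(ratings):
    """Classify one indication from the panel's second-round ratings."""
    if has_disagreement(ratings):
        return "uncertain"  # disagreement overrides the median
    m = median(ratings)
    if m >= 7:
        return "appropriate"
    if m <= 3:
        return "inappropriate"
    # Medians of 4 to 6, plus half-point medians such as 3.5 or 6.5
    # (assumption: treated as uncertain in this sketch).
    return "uncertain"


# Hypothetical ratings from 13 panelists, for illustration only.
print(classify_appropriateness([7, 8, 8, 9, 7, 8, 6, 7, 9, 8, 7, 5, 8]))
print(classify_appropriateness([1, 2, 2, 3, 8, 8, 9, 9, 5, 5, 5, 5, 5]))
```

The first example has a median of 8 with no low ratings, so it is classified as appropriate; the second has four low and four high ratings, so it is classified as uncertain regardless of its median.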
To produce necessity criteria, a third round of panel ratings was carried out by mail, in which panelists were asked to rate the necessity of performing coronary revascularization for the 288 indications that had previously been classified as appropriate for either PTCA or CABG. A procedure was defined as necessary if it met all four of the following criteria: (1) the procedure is appropriate, i.e. the health benefits exceed the risks by a sufficient margin to make it worth doing; (2) it would be improper care not to offer the procedure to a patient; (3) there is a reasonable chance that the procedure will benefit the patient; and (4) the magnitude of the expected benefit is not small [12]. These indications were rated on a similar 1–9 scale, in which 1 meant that coronary revascularization was appropriate but not necessary for the particular indication, and 9 meant that it was appropriate and necessary. All indications with a median rating of 7–9, without disagreement, were classified as necessary for coronary revascularization.
In accordance with the preceding definitions, each clinical scenario in the list of indications was classified as necessary (and therefore appropriate), appropriate (but not necessary), uncertain, or inappropriate. For all indications in which coronary revascularization was classified as necessary, then whichever of the two procedures had previously been classified as appropriate was reclassified as necessary. If both PTCA and CABG had previously been classified as appropriate, then both ratings were changed to necessary. Thus, if both PTCA and CABG are rated necessary for a particular indication, this means that coronary revascularization is necessary for this patient, and the panel considered that there were no clinical grounds for strongly preferring one procedure over the other.
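The reclassification step described above can be sketched as follows. The function name, the dictionary layout and the example values are hypothetical and serve only to illustrate the rule: when revascularization is rated necessary for a scenario, every procedure previously classified as appropriate becomes necessary, and all other classifications are left unchanged.

```python
def apply_necessity(appropriateness, revascularization_necessary):
    """Return final per-procedure classes for one clinical scenario.

    appropriateness: dict mapping procedure name to its appropriateness
        class ("appropriate", "uncertain" or "inappropriate").
    revascularization_necessary: True if the third-round necessity rating
        classified revascularization as necessary for this scenario.
    """
    if not revascularization_necessary:
        return dict(appropriateness)
    return {
        procedure: "necessary" if cls == "appropriate" else cls
        for procedure, cls in appropriateness.items()
    }


# Hypothetical scenarios, for illustration only.
print(apply_necessity({"PTCA": "appropriate", "CABG": "uncertain"}, True))
print(apply_necessity({"PTCA": "appropriate", "CABG": "appropriate"}, True))
```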
After classifying each indication, a detailed review was made of the entire list to check the internal consistency of the ratings. The purpose of this review was to determine if there were conflicting patterns of recommendations for either procedure. For example, if PTCA was normally rated more appropriate in patients with stenosis of the PLAD artery than in similar patients without PLAD involvement, then any reversal of this pattern was highlighted as a possible inconsistency. Fifteen potential inconsistencies were detected out of the 842 indications rated in the second round. The panelists received a worksheet describing each inconsistency and the clinical question on which it was based, and were asked to consider whether the appropriateness classification should be revised to make the criteria more internally consistent. If a majority of the panelists voted in favor of the revision, then the classification was changed.
| 3. Results |
|---|
Table 1 shows the percentage of indications classified in each appropriateness category, by chapter (i.e. clinical presentation). PTCA was considered necessary for over half of all indications in the AMI chapter and for more than one-third of all indications in the unstable angina chapter. Less than one-fifth of the chronic stable angina and post-AMI indications were judged necessary. In the case of CABG, all indications that were rated appropriate for unstable angina were also considered necessary (42%). The largest proportion of inappropriate indications was in the post-AMI chapter for both PTCA (23%) and CABG (23%).
The 15 potential clinical inconsistencies (eight for PTCA and seven for CABG) were primarily for cases where the median panel rating was on the borderline between appropriate and uncertain, so that a 1-point shift in rating by one panelist would have changed the appropriateness classification. For each of the inconsistencies that the panelists were asked to consider, at least eight panelists voted in favor of revising the appropriateness classification to make it more internally consistent with the panel's recommendations for similar patients. As a result, nine indications were changed from appropriate to uncertain, five indications from uncertain to appropriate, and one indication from inappropriate to uncertain. The values in Table 1 were calculated after making these changes.
Table 2 shows a subset of the appropriateness and necessity criteria from the chronic stable angina chapter (for patients with class I angina). The row variables describe patient clinical characteristics, such as extent of vessel disease and stress test results, while the columns show the level of surgical risk. The complete appropriateness and necessity criteria, as well as the definitions for each term used in the clinical scenarios, can be obtained from the authors.
| 4. Discussion |
|---|
It might be expected that panelists from a number of different countries would find it harder to agree on their appropriateness ratings than panelists from a single-country panel; however, we did not find this to be the case. We defined disagreement to mean that at least one-third of the panel members rated an indication a 1, 2 or 3, and at least one-third rated it a 7, 8 or 9. The total amount of disagreement measured in this way in the second-round appropriateness ratings was 5%. A comparable all-Spanish panel that rated the appropriateness of PTCA and CABG in 1997 disagreed on substantially more indications (13%) [3], while an all-Dutch panel composed of six interventional cardiologists and six cardiovascular surgeons disagreed on 3.2% of the indications rated [6].
The results of two other multinational appropriateness panels, also sponsored by the BIOMED Concerted Action on the appropriateness of medical and surgical procedures, confirm our positive experience. In Switzerland, a multispeciality panel of 14 experts from nine European countries rated the appropriateness and necessity of upper and lower gastrointestinal endoscopy procedures [14], while in The Netherlands, a panel of 15 urologists from five European countries rated the appropriateness of treatment of benign prostatic hyperplasia (BPH) [15]. Concerns about how differently a multinational panel might rate appropriateness in comparison to a single-country panel led the investigators of the BPH study to conduct an all-Dutch panel concurrently with the multinational one. Interpanel agreement in classifying appropriateness was found to be high, with 84% of the indications classified identically (kappa=0.76). In both of these multinational panels, disagreement within panels was also quite low: about 6% in the endoscopy panel [16] and 1% in the BPH panel [15].
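For readers unfamiliar with the kappa statistic quoted above, the sketch below shows how chance-corrected agreement between two panels' classifications can be computed. It implements the standard unweighted Cohen's kappa; whether the cited study used exactly this variant is an assumption, and the function name and sample classifications are hypothetical.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Unweighted Cohen's kappa for two equal-length lists of class labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)
    # Observed proportion of indications classified identically.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, from each panel's marginal frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in set(freq_a) | set(freq_b)) / n**2
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


# Hypothetical classifications of four indications by two panels.
panel_1 = ["appropriate", "appropriate", "uncertain", "inappropriate"]
panel_2 = ["appropriate", "uncertain", "uncertain", "inappropriate"]
print(round(cohens_kappa(panel_1, panel_2), 3))  # 0.636
```

In this toy example the panels agree on 3 of 4 indications (75%), but kappa is lower (about 0.64) because some of that agreement would be expected by chance alone.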
Physicians and policy makers interested in how appropriateness and necessity criteria can be used to improve medical care may have concerns about the reliability and validity of the RAND method. Although panelists are carefully selected and provided with an extensive literature review of the procedure to be evaluated, their ratings will in some sense be subjective and dependent on each expert's knowledge and experience. Thus, the selection of a different group of experts would undoubtedly lead to at least some of the recommendations being classified differently. In the most extensive test to date of the reproducibility of the appropriateness method, experts were randomly assigned to three different panels to rate coronary revascularization. The resulting three-way kappa for the classification of appropriateness was moderately high (0.52), which is about the same as for many diagnostic tests, while the three-way kappa for the classification of necessity was very high (0.83) [17]. The validity of necessity criteria is supported by a study showing that patients who met necessity criteria and did not undergo revascularization had worse outcomes than similar patients who underwent revascularization [18].
There may also be concerns that the inclusion of panelists from different countries could reduce panel reliability. However, we have not found systematic differences by nationality in the appropriateness ratings of the multinational panel (Bernstein, unpublished data). As noted above, a comparison between a Dutch and a multinational panel found very high levels of agreement for BPH indications [15]. Similar comparisons could be made between criteria developed by physicians from one country and those of a multinational panel.
Some clinicians may consider that additional clinical variables should be included in the rating structure, such as the morphological characteristics of the lesions. Several previous panels that rated the appropriateness of PTCA and CABG used this approach by including classification of lesion type (A, B or C) in their lists of indications [4,6]. Our study did not incorporate this variable, however, in line with the American College of Cardiology/American Heart Association (ACC/AHA) guidelines for PTCA [19]. Although lesion type was included in the original 1988 ACC/AHA guidelines for PTCA, patients were subsequently classified as low or high risk candidates for PTCA based on a combination of their clinical characteristics and lesion type. In addition, risk estimates are extremely unstable for specific lesion characteristics [20]. This exemplifies the difficulties that arise in developing criteria when only limited data are available.
The RAND appropriateness method is only one of several techniques that have been used to develop recommendations for treatment decisions. Alternative methods include decision analysis, meta-analysis and cost-effectiveness analysis. These quantitative techniques usually result in a recommended treatment or an estimated probability of an outcome for different treatment choices. In contrast to an expert panel, decision analysts try to incorporate only data that have been validated in the literature. However, expert panelists do not base their recommendations solely on opinion, and decision analysts often add expert opinion to their models [21]. We chose to use the RAND appropriateness method because it has been shown to be a reasonably valid and reliable tool for coronary revascularization decisions [17,18].
How, then, might these criteria be used? Historically, they have most often been used as a measure of past performance. For example, the clinical charts of patients who have undergone coronary revascularization are reviewed in order to obtain the information necessary to classify each patient in the list of indications. It can then be determined if the procedure received was appropriate, uncertain or inappropriate according to the panel's recommendations. These types of retrospective chart audits have been carried out in most of the European countries that were represented on our panel, using the criteria developed by their own national-level panels. The proportion of procedures classified as inappropriate in such studies can be considered an approximate measure of overuse. Necessity criteria, on the other hand, can be applied to patients who might have been candidates for coronary revascularization, for example, by studying patients who have undergone coronary angiography to determine which ones meeting a necessity criterion did not receive a revascularization procedure (excluding those who were offered but refused the procedure). This type of study, to measure the underuse of coronary revascularization, has been carried out in the United States [22] and Sweden [23]. One advantage of using criteria developed by multinational panels in these types of studies is that the same set of criteria can be applied to each of the participating countries, allowing cross-national comparisons.
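The two audit measures described above can be sketched in code. This is an illustrative outline only: the function names, record fields and sample data are hypothetical, and a real audit would first need to map each patient chart to a specific indication in the panel's list.

```python
def overuse_rate(procedure_classes):
    """Share of performed procedures whose indication was classified
    inappropriate (an approximate measure of overuse)."""
    inappropriate = sum(1 for c in procedure_classes if c == "inappropriate")
    return inappropriate / len(procedure_classes)


def underuse_rate(candidates):
    """Share of patients meeting necessity criteria who did not receive
    revascularization, excluding those who were offered it but refused."""
    eligible = [
        c for c in candidates
        if c["meets_necessity"] and not c["refused"]
    ]
    not_treated = sum(1 for c in eligible if not c["received"])
    return not_treated / len(eligible)


# Hypothetical audit data, for illustration only.
performed = ["appropriate", "inappropriate", "uncertain", "appropriate"]
angiography_patients = [
    {"meets_necessity": True, "received": True, "refused": False},
    {"meets_necessity": True, "received": False, "refused": False},
    {"meets_necessity": True, "received": False, "refused": True},
    {"meets_necessity": False, "received": False, "refused": False},
]
print(overuse_rate(performed))              # 0.25
print(underuse_rate(angiography_patients))  # 0.5
```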
Perhaps the greater challenge is the prospective use of the criteria to help physicians and patients decide when it is appropriate (or necessary) to perform a revascularization procedure. Physicians should be encouraged to consult the criteria when deciding what course of action to recommend to their patients. It should be emphasized, however, that these are only recommendations, representing a combination of the best scientific evidence available together with the judgements of medical experts involved in referring patients for or performing the procedure under study. Appropriateness panels are typically instructed to base their judgements on an average patient presenting to an average physician in an average hospital. Although the list of indications is designed to be highly specific, there may well be special circumstances not reflected in the clinical scenarios that support a different decision. Such departures from the recommendations, however, should not be arbitrary, and physicians should be able to justify their reasons for not following the criteria in particular instances. With these caveats in mind, it is considered that routine consultation of the criteria could well result in a reduction of the large variations in procedure rates that are currently seen in clinical practice.
It should be emphasized that the indications discussed in this paper are only for theoretical combinations of variables describing patient symptoms and diagnostic tests. Patients are not distributed uniformly across the different clinical scenarios, and some of them may represent few real patients. Nevertheless, the large proportions of indications that our panel rated as necessary for acute conditions may suggest potential underuse of coronary revascularization procedures in the population. Even in the United States, where coronary revascularization rates are much higher than in Europe, substantial underuse has been shown to occur [22]. The existence of multinational necessity criteria offers the opportunity to determine if a similar phenomenon is also occurring in European countries.
A major limitation in the area of appropriateness research is the difficulty of disseminating and using the panel criteria, and particularly of keeping the panel recommendations up to date in light of new scientific evidence. One way to do this is by making the criteria available in a dynamic format such as the World Wide Web. Ideally, links to the supporting scientific evidence could also be provided where such evidence exists. The criteria developed by the multinational endoscopy panel are currently available on a website (http://www.epage.ch). The complete appropriateness and necessity criteria from the multinational coronary revascularization panel will also soon be available through the Internet. This type of interactive tool is much easier to use than a paper format, and has the added capability of being quickly and easily updated when new evidence becomes available.
In summary, our experience suggests that multinational panels show promise as a feasible and practical way of addressing appropriateness and necessity issues in countries sharing similar levels of socioeconomic development and medical technology. The criteria for coronary revascularization procedures developed by the panel represent the combined judgements of a highly experienced group of experts in cardiology and cardiovascular surgery from five European countries. They can be used as yardsticks to measure performance and to compare appropriateness across countries, as well as guides for decision making. Current formats for presenting these types of criteria, however, have limited their dissemination and use. New ways need to be found to make the criteria more flexible and to keep them updated in accordance with the latest scientific evidence. As Internet technology becomes more widely available and reliable, appropriateness recommendations can be modified to keep pace with the results of new research and can be more easily accessed and used by both physicians and patients.
| Acknowledgments |
|---|
| References |
|---|