Eur J Cardiothorac Surg 2006;29:82-88
© 2006 Elsevier Science NL
a Research Concentration in Biological and Medical Sciences, School of Life Sciences, Queensland University of Technology, Brisbane, Australia
b Discipline of General Practice, School of Medicine, University of Queensland, Brisbane, Australia
c Department of Psychology and Neuropsychology, The Prince Charles Hospital, Brisbane, Australia
d Department of Haematology, The Prince Charles Hospital, Brisbane, Australia
e Department of Perfusion Services, Main Operating Theatres, The Prince Charles Hospital, Rode Road, Brisbane Q4032, Australia
Received 29 March 2005; received in revised form 31 August 2005; accepted 7 October 2005.
* Corresponding author. Tel.: +61 7 3350 8705; fax: +61 7 3350 8659. (Email: p.raymond{at}optusnet.com.au).
| Abstract |
|---|
≥20% of test scores used provided acceptable probability based on the number of tests commonly encountered. Investigators must also choose a test battery that minimises shared variance among test scores.
Key Words: Cardiopulmonary bypass; Neurocognitive deficits; Brain; Cerebral complications
| 1. Introduction |
|---|
The Reliable Change Index (RCI) with adjustment for practice (RCIp) has previously been employed by Kneebone et al. [9] for analysis of cardiac surgical patients. This method provides criteria for meaningful change based on the calculated measurement error for each score. A patient's predicted retest score equals their Time 1 score plus the mean practice effect detected in a matched control group. If the difference between the actual and expected retest score exceeds the likely variation based on matched controls, a significant change is considered to have occurred. A disadvantage of this method is the use of mean practice effect rather than individualised practice, as it may not be reasonable to expect all people to show similar effects. Similarly, this method does not consider regression toward the mean, that is, the tendency for people with outlying performance at initial test to drift toward the mean at follow-up. The Standardised Regression-based (SRB) method generates a regression equation to predict a patient's score change on the basis of Time 1 performance, plus any demographic variables that contribute significantly to the prediction model in matched controls. The expected change is therefore dependent on initial performance rather than mean practice effect.
The aim of this study was to compare the SRB model with the RCIp, and to use the SD and 20% methods as an illustration of the importance of statistical change criteria. In addition, a discussion is presented on the methodology of selecting criteria for the classification of an individual as overall impaired when using a battery of test scores.
| 2. Methods |
|---|
2.3 Methods of defining change in neuropsychological scores
Four different analysis techniques were used to define significant change in neuropsychological scores.
2.3.1 Standardised Regression-based technique
Regression equations were generated for each neuropsychological score using retest data from the control group. Using standard multiple linear regression analysis, age, sex, education level, and score at Time 1 were evaluated as potential predictors of score at Time 2. For all measures in this series, only performance at Time 1 was shown to be a significant predictor. Results for the regression analysis are presented as regression coefficient and intercept, and standard error of the estimate (SEest), calculated as

$$SE_{est} = SD_2\sqrt{1 - (r_{xx})^2}$$

where $SD_2$ is the SD of post-test scores and $r_{xx}$ is the test-retest reliability. A patient's predicted post-test score ($X_2'$) can be calculated on the basis of their initial score using the formula $X_2' = bX_1 + c$, where $X_1$ is the patient's Time 1 score, and $b$ and $c$ are the regression coefficient and intercept, respectively. These data can be used to assess significant change from pre- to post-test using the SRB formula:

$$SRB = \frac{X_2 - X_2'}{SE_{est}}$$

where $X_2$ is the patient's observed post-test score.
Ninety percent confidence intervals for the SRB may be generated around the predicted score using the following approach:

$$90\%\ CI = (bX_1 + c) \pm 1.645 \times SE_{est}$$
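The SRB calculation described above can be sketched in Python. This is only an illustration under stated assumptions: it uses simple linear regression on the Time 1 score alone (the only significant predictor in this series), the control scores are invented, the function names `fit_srb` and `srb_z` are hypothetical, and higher scores are assumed to mean better performance.

```python
from math import sqrt
from statistics import mean


def fit_srb(control_t1, control_t2):
    """Fit Time 2 = b * Time 1 + c on control retest data.

    Returns (b, c, se_est), where se_est = SD2 * sqrt(1 - r_xx^2),
    following the definition of SEest given in the text.
    """
    n = len(control_t1)
    m1, m2 = mean(control_t1), mean(control_t2)
    sxx = sum((x - m1) ** 2 for x in control_t1)
    syy = sum((y - m2) ** 2 for y in control_t2)
    sxy = sum((x - m1) * (y - m2) for x, y in zip(control_t1, control_t2))
    b = sxy / sxx                       # regression coefficient
    c = m2 - b * m1                     # intercept
    r_xx = sxy / sqrt(sxx * syy)        # test-retest reliability
    sd2 = sqrt(syy / (n - 1))           # SD of control retest scores
    se_est = sd2 * sqrt(1 - r_xx ** 2)  # standard error of the estimate
    return b, c, se_est


def srb_z(x1, x2, b, c, se_est):
    """Standardised difference between observed and predicted retest score."""
    predicted = b * x1 + c
    return (x2 - predicted) / se_est


# Hypothetical control data, not taken from the study.
controls_t1 = [10, 12, 14, 16, 18]
controls_t2 = [12, 13, 17, 17, 21]
b, c, se_est = fit_srb(controls_t1, controls_t2)

# A hypothetical patient scoring 14 before and 14 after surgery.
z = srb_z(14, 14, b, c, se_est)
declined = z < -1.645  # significant decline at the 90% level
print(f"b={b:.2f}, c={c:.2f}, SEest={se_est:.3f}, z={z:.2f}, declined={declined}")
```

In this toy example an unchanged raw score counts as a decline, because the controls improved with practice and the patient did not.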
2.3.2 Reliable Change Index modified for practice
RCI Z scores can be generated from test-retest data from the normative control group, using the formula:

$$RCI_p = \frac{(X_2 - X_1) - (M_2 - M_1)}{SE_{diff}}$$

where $X_1$ and $X_2$ are the patient's preoperative and postoperative scores, $M_2 - M_1$ is the mean practice effect in controls, $SE_{diff} = \sqrt{2(SE_m)^2}$, and $SE_m = SD_1\sqrt{1 - r_{xx}}$, where $SD_1$ is the standard deviation of the baseline score. Practice effect was calculated by change in mean score over the test-retest interval, and was analysed for significance using repeated measures t-tests for each measure ($p < 0.05$). For each patient, the postoperative minus preoperative score was calculated ($X_2 - X_1$) and converted to an RCI Z score. When this Z score fell outside ±1.645, a significant change was considered to have occurred. The RCI method can also be used to give 90% confidence intervals, using the following formula:

$$90\%\ CI = (X_1 + M_2 - M_1) \pm 1.645 \times SE_{diff}$$
Both the SRB and RCIp techniques provide a confidence interval for the detection of significant change. For example, a 90% level of confidence indicates that 5% of cases may be expected to fall above and 5% of cases below the cutoff due to chance rather than real change.
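The practice-adjusted RCI can be sketched the same way. This is a minimal illustration, not the authors' implementation: the control scores are invented, the reliability value of 0.9 is an assumed input, and the names `rcip_z` and `rcip_interval` are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def rcip_z(x1, x2, control_t1, control_t2, r_xx):
    """RCI Z score adjusted for the mean practice effect in controls."""
    practice = mean(control_t2) - mean(control_t1)  # mean practice effect
    sem = stdev(control_t1) * sqrt(1 - r_xx)        # SEm = SD1 * sqrt(1 - r_xx)
    se_diff = sqrt(2 * sem ** 2)                    # standard error of the difference
    return ((x2 - x1) - practice) / se_diff


def rcip_interval(x1, control_t1, control_t2, r_xx, z_crit=1.645):
    """90% interval around the expected retest score (Time 1 + practice)."""
    practice = mean(control_t2) - mean(control_t1)
    sem = stdev(control_t1) * sqrt(1 - r_xx)
    se_diff = sqrt(2 * sem ** 2)
    expected = x1 + practice
    return expected - z_crit * se_diff, expected + z_crit * se_diff


# Hypothetical control data and an assumed test-retest reliability.
controls_t1 = [10, 12, 14, 16, 18]
controls_t2 = [12, 13, 17, 17, 21]
assumed_r_xx = 0.9

z = rcip_z(14, 14, controls_t1, controls_t2, assumed_r_xx)
low, high = rcip_interval(14, controls_t1, controls_t2, assumed_r_xx)
print(f"z={z:.2f}, 90% interval=({low:.2f}, {high:.2f})")
```

A retest score below `low` or above `high` corresponds to a Z score outside ±1.645.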
2.3.3 Standard deviation method
According to the SD method, a change in score on any measure is considered significant if it is greater than 1 SD of the baseline score of the surgical sample. This may be represented as:

$$|X_2 - X_1| > SD_{baseline}$$
2.3.4 20% method
Using the 20% change method, a measurement score must change by at least 20% from baseline to be considered significant:

$$\frac{|X_2 - X_1|}{|X_1|} \geq 0.20$$
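For comparison, the two fixed-cutoff rules described above reduce to one-line checks. The sketch below is illustrative only; the function names and example values are hypothetical. Neither rule uses control data, so neither can adjust for practice effects or test reliability.

```python
def sd_change(x1, x2, sd_baseline):
    """SD method: change exceeds 1 SD of the surgical sample's baseline score."""
    return abs(x2 - x1) > sd_baseline


def pct20_change(x1, x2):
    """20% method: change of at least 20% relative to the baseline score."""
    if x1 == 0:
        raise ValueError("percentage change is undefined for a baseline of 0")
    return abs(x2 - x1) / abs(x1) >= 0.20


# Hypothetical patient: baseline 50, retest 58, sample baseline SD of 10.
print(sd_change(50, 58, 10))   # 8-point change against a 10-point cutoff
print(pct20_change(50, 58))    # 16% change against a 20% cutoff
```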
2.4 Method of defining significant change in an individual
Patients were classified as impaired if they demonstrated significant deterioration on ≥2 of the nine test scores used. This decision was based on the probable distribution of false changes when using the prediction models (SRB and RCIp). A more detailed discussion of the process used to define overall change in individuals is included.
| 3. Results |
|---|
The estimated probability of ≥2 declines due to chance from the nine scores used was found to be 0.07. Table 5 shows the estimated probability of change scores due to chance that may be expected in either direction when using a cutoff of ±1.645. Using the criterion of change in ≥2 test scores, the incidence of postoperative impairment and improvement indicated by each of the four criteria is shown in Fig. 1. Considerable differences can be seen between methods. The SRB technique yields the greatest number of patients classified as impaired (32.7%). In comparison, the RCIp classified only 16.4%, while the SD and 20% methods detected just 3.6% and 5.5%, respectively. Again, both the SRB and RCIp techniques revealed a small number of patients classified as improved, which was not greater than the 7% predicted by the binomial distribution of change. In comparison, the SD and 20% methods both classified large numbers of patients as significantly improved (65.5% and 69.1%, respectively).
| 4. Discussion |
|---|
Two statistical criteria for change that account for measurement error and practice are presented in this study: the Reliable Change Index, modified for practice, and the Standardised Regression-based technique. Both techniques are simple to apply, and are based on similar principles. They do, however, differ in a number of ways that may lead to different conclusions. The RCIp accounts for measurement error and mean practice effects detected in controls for each measurement score. This technique tends to identify more patients as declined on the nine scores than either the SD or 20% methods. However, it does not account for individual practice or regression toward the mean. The potential advantages of the SRB technique lie not only with the influence of these factors in the prediction of retest performance, but also with the inclusion of demographic variables such as age and education. The equation may also include the influence of tests of mood such as anxiety and depression, which may have an effect on retest performance. In this series, however, only Time 1 performance was found to be significant in the prediction of retest performance in controls. Previous work with the SRB has shown only small influence of demographics [12], although when the influence is significant the effect should be entered into the equation to obtain optimal results. With a suitably sized, well-matched control group, the SRB prediction model will be better suited to the full range of patients being studied.
Because it predicts retest performance on the basis of initial performance, the SRB appears to give narrower detection intervals than the RCIp. This is due mainly to accounting for regression to the mean and is consistent with previous studies [12]. As a result, the SRB tended to give a higher sensitivity for most scores used: the SRB identified 27.9% of patients as declined on the attention/mental control domain, nearly double the rate given by the RCIp or fixed cutoff techniques. Further, the SRB was the only technique that classified a large number of patients as impaired on the higher-order scores of cognitive functioning. These scores, such as general cognitive functioning, are more likely to reflect impairment that may affect day-to-day functioning in an individual. Despite the narrower detection intervals for the SRB compared to the RCIp, there was little variation in the detection of improvement. Both techniques accounted well for practice as there were few improvements detected, and this was not above the 5% detection due to chance. The SRB technique, therefore, has the advantage of greater sensitivity in the detection of decline without increased classification of improvement. This does not necessarily reflect an ideal accuracy of detection; indeed, more complex formulae have been proposed, as both the SRB and RCIp techniques have been criticised for not adequately dealing with measurement error [13]. However, both methods are simple to use and conceptualise, and are vastly superior to the traditional SD or 20% methods. While the debate over the best method for defining significant change remains open, the use of either of the statistical change criteria discussed here deals well with test imperfections, and makes considerable advancements over traditional cutoff techniques.
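One way to see why the SRB interval tends to be narrower is to compare the two standard errors directly. The derivation below is a sketch under a simplifying assumption that the source does not make, namely that baseline and retest scores have the same standard deviation ($SD_1 = SD_2 = SD$). It uses the standard error of the estimate for the SRB and the standard error of the difference for the RCIp.

```latex
\frac{SE_{est}}{SE_{diff}}
  = \frac{SD\sqrt{1 - r_{xx}^{2}}}{SD\sqrt{2\,(1 - r_{xx})}}
  = \sqrt{\frac{(1 - r_{xx})(1 + r_{xx})}{2\,(1 - r_{xx})}}
  = \sqrt{\frac{1 + r_{xx}}{2}} \;\le\; 1
  \qquad (0 \le r_{xx} < 1)
```

Under this assumption the SRB interval is never wider than the RCIp interval, and the gap grows as reliability falls: the ratio is about 0.95 at $r_{xx} = 0.8$ and about 0.87 at $r_{xx} = 0.5$.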
Two domain scores in this series, reaction time and spatial processing, highlight the importance of accounting for both reliability and stability. The reaction time score had poor retest reliability, and when using the fixed cutoff techniques there was an incidence of decline in this score similar to the SRB and RCIp techniques. This incidence of decline is high relative to the much lower incidences detected using the fixed cutoff techniques across other domain scores. This may be because the large normal fluctuations that occur in retest scores due to poor reliability more easily exceed the fixed cutoff used by the SD or 20% methods. In contrast, spatial processing, while having poor reliability, also had a considerable practice effect. The two fixed cutoff techniques recorded low declines, but recorded considerable improvements, as the effect of practice for many patients would have exceeded the fixed cutoff.
One further question remains unanswered when defining individual change in cognitive performance: on how many tests must a patient demonstrate decline before they are considered to have shown significant overall change? The commonly used criteria include ≥1 test score, ≥2 test scores, and ≥20% of test scores. Again, these definitions are based on arbitrary decisions rather than on a theoretical underpinning. Using either the SRB or RCIp to assess decline on any one measurement score, the probability of detecting a decline entirely due to chance equals 0.05. Obviously, the more measurements used in a test battery, the greater the chance of recording a decline in any one or more scores due to chance. The choice of optimal criteria for the detection of overall change was analysed using the binomial distribution of false changes across a range of scores. The aim was to provide rational criteria for a cutoff based on describing the probability of decline detected entirely by chance. It must be stressed that the binomial distribution of score changes used here provides only an estimate of false changes as it assumes independence of tests, which is rarely the case in psychometric assessment. It does, however, provide some theoretical underpinning in the choice of cutoff rather than relying on an arbitrary number. In this study the estimated probability of false decline in ≥2 scores from the nine used was calculated to be 0.07. Compare this to the probability of false decline in ≥1 score (p = 0.37), which is unacceptably high. As a guide, the criterion of ≥20% of test scores used was mostly found to provide acceptable probability on the range of score numbers commonly encountered (Table 5).
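The binomial reasoning above can be checked with a few lines of Python. This sketch assumes, as the text does, independent tests and a per-test false-decline probability of 0.05; the function name `prob_at_least` and the battery sizes in the loop are illustrative choices, not values from Table 5.

```python
from math import comb


def prob_at_least(k, n, p=0.05):
    """Probability of k or more false declines among n independent test scores."""
    return 1.0 - sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k))


# The two criteria discussed for the nine scores used in this study.
print(round(prob_at_least(1, 9), 2))  # 0.37
print(round(prob_at_least(2, 9), 2))  # 0.07

# Chance probability under a ">= 20% of scores" rule for other battery sizes.
for n_scores in (5, 9, 10, 15, 20):
    k = (n_scores + 4) // 5  # smallest whole number of scores that is >= 20%
    print(n_scores, k, round(prob_at_least(k, n_scores), 3))
```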
The other factor that must be considered when describing change is shared variance among tests used. This occurs when two or more tests used in the overall analysis measure the same or similar cognitive functions. When this occurs, a decline in one score indicates likely decline on other scores that share variance. This will in turn inappropriately increase the probability of reaching any chosen cutoff in the detection of change. Shared variance needs to be controlled for by choosing a broad range of neuropsychological tests assessing independent constructs, and through presentation of sum scores for a domain where possible rather than relying on the presentation of a large number of related subtests. Where changes in overall function are not the focus of investigation, for instance therapies targeting specific brain functions, then more detailed presentations of similar, targeted neuropsychological assessments would be the desirable approach.
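The effect of shared variance on the chance of reaching a cutoff can be illustrated with a small simulation. The sketch below rests on assumptions that are not from the study: each of nine Z scores is modelled as a mix of one common factor and independent noise, the correlation between any pair of tests (`rho`) is set to 0.5 for the correlated case, and a patient with no true change is counted as "impaired" when two or more scores fall below -1.645.

```python
import random
from math import sqrt


def false_impairment_rate(n_tests=9, min_declines=2, rho=0.0,
                          cutoff=-1.645, trials=50_000, seed=1):
    """Share of unchanged patients flagged as impaired purely by chance."""
    rng = random.Random(seed)
    w_shared, w_unique = sqrt(rho), sqrt(1.0 - rho)
    flagged = 0
    for _ in range(trials):
        shared = rng.gauss(0.0, 1.0)  # factor common to all tests
        declines = 0
        for _ in range(n_tests):
            # Each Z score has unit variance and pairwise correlation rho.
            z = w_shared * shared + w_unique * rng.gauss(0.0, 1.0)
            if z < cutoff:
                declines += 1
        if declines >= min_declines:
            flagged += 1
    return flagged / trials


independent = false_impairment_rate(rho=0.0)
correlated = false_impairment_rate(rho=0.5)
print(f"independent tests: {independent:.3f}")
print(f"correlated tests:  {correlated:.3f}")
```

With independent tests the simulated rate should sit close to the binomial estimate, and under this particular model the rate is higher when the tests are moderately correlated. The size of the inflation depends on the assumed correlation structure.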
In conclusion, it has been demonstrated that the choice of statistical models used to assess post-event cognitive decline has a strong influence on reportable outcomes. Two methods employing statistical change criteria, namely the SRB and RCIp, demonstrated greater sensitivity in the detection of decline compared to fixed cutoff techniques. These methods are more likely to reflect true change in the performance of an individual, as they detect significant variation from the spread of score changes that may be reasonably expected over time, based on retest data from matched controls. The SRB in particular was shown to be a useful prediction model as it provides an estimate of retest performance based on initial score for an individual, and as such considers individual practice effects and regression toward the mean. This technique also has the advantage of accounting for the effects of demographic variables such as age and education, should they influence the prediction model. When these models were used to assess whether a person could be classified as significantly impaired through the use of a battery of sub-tests, it could be seen that the number of tests used to define change has a strong influence on reported outcomes. Investigators should minimise shared variance by avoiding the presentation of similar sub-tests in the analysis. From a suitable selection of tests, the definition of overall change needs also to be based on sound statistical criteria. When using either RCIp or SRB, the cutoff of ≥20% of test scores used was found to provide acceptable probability on the range of score numbers commonly encountered.