Journal of Pediatric Psychology, Vol. 27, No. 2, 2002, pp. 121-131
© 2002 Society of Pediatric Psychology
Quantifying Practice Effects in Longitudinal Research With the WISC-R and WAIS-R: A Study of Children and Adolescents With Hemophilia and Male Siblings Without Hemophilia
1 Tulane University Health Sciences Center, 2 Boston University School of Public Health, 3 University of Iowa College of Medicine, Iowa City, 4 University of Texas Medical School, Houston, 5 University of California, San Diego, 6 Rho, Inc., Chapel Hill, North Carolina, 7 Childrens Hospital Los Angeles, 8 Williamsburg, Virginia, 9 University of Texas, San Antonio
All correspondence should be sent to Patricia A. Sirois, Department of Pediatrics, Tulane University Health Sciences Center, 1430 Tulane Avenue (TW-41), New Orleans, Louisiana 70112. E-mail: psirois@tulane.edu.
| Abstract |
|---|
Objective: To quantify practice effects associated with annual administrations of WISC-R and WAIS-R in children and adolescents with and without hemophilia.
Methods: Participants were young men (age: 7-19; 80 with hemophilia, 30 siblings) enrolled in the Hemophilia Growth and Development Study. Participants with hemophilia completed age-appropriate Wechsler scales at baseline and at four annual follow-ups; the siblings, at baseline and one 2-year follow-up. Regression analyses were used to quantify average changes in scores, adjusting for variables related to test performance.
Results: Consecutive annual evaluations were free of significant practice effects for 4 years with the Verbal Scale and for 2 years with the Performance Scale. VIQ decreased, and PIQ increased over time. Baseline VIQ was related to changes in VIQ; baseline PIQ and number of test-specific retests were related to changes in PIQ.
Conclusions: The findings support use of Wechsler scales for annual evaluations to monitor cognitive development in children and adolescents.
Key words: longitudinal assessment; retest effects; WISC-R; WAIS-R; chronic illness; sibling comparisons.
| Introduction |
|---|
Repeated assessments of cognitive abilities in children and adolescents are increasingly common in research and applied settings, particularly in clinical trials of therapies to treat conditions such as cancer and HIV disease. Children and adolescents may demonstrate improvements in performance on successive administrations of psychological tests due to previous experience with the test materials rather than to improvements in cognitive performance. Thus, standard scores that do not increase over time may not necessarily reflect cognitive stability because the gain due to previous experience may compensate for a slowing of cognitive growth. Reports of changes in mean scores following repeated administrations of measures of general intelligence are available in the literature, but the data are difficult to integrate because of differences in retest interval, statistical methods, and characteristics of study participants. Reports of retest effects associated with successive annual assessments are rare. Regardless of the purpose of the assessments, knowledge of the relative contribution of retest effects is important to an understanding of the meaning of changes in test performance over time.
Research on test reliability in populations of children and adolescents is
based largely on data obtained in school settings, where periodic assessments
are employed in educational planning and placement decisions. Tuma and
Appelbaum (1980) were the
first to report data describing practice effects on the Wechsler Intelligence
Scale for Children-Revised (WISC-R; Wechsler, 1974). Using a
6-month retest interval, the authors reported significant gains in Full Scale
IQ (FSIQ; 4.73 points) and Performance Scale IQ (PIQ; 7.82 points) but no
significant gain in Verbal Scale IQ (VIQ; 1.09 points). These changes were
smaller than those reported in the test manual for a 1-month retest interval
(7, 9.5, and 3.5 points, respectively; Wechsler, 1974, p. 31),
indicating that the size of the practice effect on the WISC-R is inversely related to the
length of the retest interval. A study of a large birth cohort showed that
FSIQ increased 5.3 points over 6 years for children who completed the WISC-R
every 2 years between the ages of 7 and 13
(Moffitt, Caspi, Harkness, & Silva, 1993).
Schuerger and Witt (1989)
reviewed textbooks, journal articles, and test manuals and found 34 studies of
test-retest reliabilities on the WISC and WISC-R (Wechsler, 1949, 1974), the
Wechsler Adult Intelligence Scale (WAIS and WAIS-R; Wechsler, 1955, 1981), and
the Stanford-Binet Intelligence Scale, excluding the fourth edition (Terman &
Merrill, 1937, 1960). Only 5 of the 34
studies included a retest interval of approximately 1 year (range = 10-20
months), only 6 included the WISC-R, and none of the studies of the WAIS-R
included participants under the age of 20. Multiple regression procedures were
employed; the dependent variable was the average test-retest correlation in
each study, and the independent variables were characteristics of the studies,
including sample size, retest interval, and age and gender of participants.
The results indicated that the tests were highly reliable (mean r
= .82) and that reliability increased with older age at first testing
(r = .70-.96) and decreased with increasing length of retest interval
(r = .89-.64).
Horton (1992) presented a
brief review of the literature regarding practice effects on the WAIS-R. He
noted that improvements in IQ scores are inversely related to the length of
the retest interval (consistent with previous studies of the WISC-R) and
inversely related to age, with younger individuals showing larger gains than
older individuals. In a study of healthy adults (ages 57-85) who completed a
short form of the WAIS-R annually for 2 years, Mitrushina and Satz
(1991) found small decreases
in Verbal Scale scores at all ages, increases in Performance Scale scores for
adults 65 and younger, and either stable or declining Performance scores for
those older than 65.
We found only four reports addressing the issue of transition between
Wechsler test instruments, and only one of these concerned transition between
the WISC-R and WAIS-R (Usner & Fitzgerald, 1999). The authors examined the
data of 69 adolescents
who aged into the WAIS-R during participation in the Hemophilia Growth and
Development Study (HGDS). They found significant changes in FSIQ and PIQ but
not in VIQ at the time of transition to the first WAIS-R (FSIQ, -3.77 points;
PIQ, -7.35 points; and VIQ, +0.7 points). All WAIS-R subtest scores, except
Digit Span, decreased significantly (range = -0.7 to -3.4 points). These
results are consistent with reports of directional trends in IQ scores at the
time of transition from the WPPSI to the WISC-R
(Mulhern, Ochs, & Fairclough, 1992; Neyens & Aldenkamp, 1997) and consistent
with Rasbury, McCoy, and Perry's (1977) finding of decreases in
all WISC-R subtest scores compared to the WPPSI. Usner and Fitzgerald
(1999) attributed their
findings to loss of benefit derived from increasing familiarity with the items
of the Performance Scale of the WISC-R and to differences between the WISC-R
and WAIS-R in the scaling of age norms used to calculate the subtest scaled
scores.
This study is a secondary analysis of data from the HGDS, a longitudinal
study of male children and adolescents designed to examine changes in
psychological development, physical growth, and neurological functioning
(Hilgartner et al., 1993). The
HGDS included individuals with hemophilia and HIV as well as those with
hemophilia but without HIV (HIV−). At baseline, the HIV− group
was within the average range of general intelligence
(Loveland et al., 1994), consistent with other studies of children with
hemophilia (Sirois & Hill, 1993; Whitt et al., 1993). Their MRI and EEG results
were normal (Mitchell et al., 1993; Sirois et al., 1998). Their abnormalities
on the neurological examinations at
baseline and during 4 years of follow-up included muscle atrophy and
coordination and gait disturbances; these findings were not associated with
histories of seizure, head trauma, or intracranial bleeding but were the
result of hemophilia-related joint disease
(Bale et al., 1993; Mitchell et al., 1997). Thus,
the HIV− cohort of the HGDS provides an opportunity to explore
longitudinal changes in psychological test performance in children and
adolescents with a chronic illness (hemophilia) but without a neurological
condition that might cause deterioration of intellectual abilities. Based on
data from previous reports, we proposed four hypotheses:
- VIQ will remain relatively stable, and PIQ will increase significantly over
time.
- Retest effects will be related to baseline IQ (Rasbury et al., 1977) and to
age at baseline (Horton, 1992; Mitrushina & Satz, 1991; Schuerger & Witt, 1989).
- Increases in PIQ will be related to increases in scaled scores on
particular subtests of the Performance Scale, for example, Picture Arrangement
(Tuma & Appelbaum, 1980) and Object Assembly (Kaufman, 1979).
- At the time of transition from WISC-R to WAIS-R, there will be minimal
change in VIQ, and PIQ will decrease (Neyens & Aldenkamp, 1997; Usner &
Fitzgerald, 1999). Declines in PIQ will be greater for adolescents with more
frequent exposure to the WISC-R before completing their first WAIS-R.
| Method |
|---|
Participants
Participants in the HGDS included male children and adolescents with hemophilia (n = 333) and biological brothers (full or half-brothers; n = 47) living in the same household as those with hemophilia. They were recruited from 14 hemophilia treatment centers in the United States. Institutional review board (IRB) approval for protection of human subjects was obtained at all centers, and written informed consent and assent were obtained from each participant, parent, or legal guardian, according to local IRB requirements. Details of the eligibility criteria, design, methods, data management, and quality control procedures are presented in Hilgartner et al. (1993).
For this analysis, 80 participants with hemophilia and 30 male siblings
without hemophilia were selected for study. The siblings were included as a
comparison group because hemophilia could have subtle adverse effects on
measures of cognitive performance (Loveland et al., 2000; Sirois et al., 1998;
Whitt et al., 1993). Individuals with hemophilia and HIV (n = 207) were
excluded to avoid the potentially confounding influence of advancing HIV
disease on longitudinal test performance (Loveland et al., 2000).
Participants with invalid (defined as validity rating > 1; see Procedures)
or missing Wechsler data at any time of measurement were excluded to maintain
uniformity of the retest interval and number of retests. Of those with
hemophilia, 1 was excluded for invalid data, 33 for missing data, and 12 for
both reasons. Of the siblings, 11 were excluded for invalid data and 6 for
missing data. At baseline, the participants with hemophilia who were included
in the study were similar in VIQ (M = 106.0 vs. 104.1, t =
0.617, p = .54) but achieved higher PIQ scores (M = 108.5 vs.
102.2, t = 2.238, p = .03) than those who were excluded.
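The sample sizes reported above can be cross-checked with simple arithmetic. The short Python sketch below uses only the counts given in the text; the variable names are illustrative, and it assumes that the 46 exclusions for invalid or missing data were drawn from the participants with hemophilia who remained after the HIV exclusion.

```python
# Counts taken from the Participants section; variable names are illustrative.
enrolled_hemophilia = 333
excluded_hiv = 207
# Assumption: these exclusions apply to the group left after the HIV exclusion.
excluded_invalid_or_missing = 1 + 33 + 12  # invalid, missing, both

enrolled_siblings = 47
excluded_siblings = 11 + 6  # invalid, missing

analyzed_hemophilia = (
    enrolled_hemophilia - excluded_hiv - excluded_invalid_or_missing
)
analyzed_siblings = enrolled_siblings - excluded_siblings

print(analyzed_hemophilia, analyzed_siblings)  # 80 30
```

Under that assumption, the arithmetic reproduces the 80 participants with hemophilia and 30 siblings selected for this analysis.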
Participant characteristics relevant to this study are shown in
Table I. Parental education
differs between the groups because the siblings were drawn from a subset of
all families enrolled in the HGDS. A history of academic problems was defined
as having repeated a grade in school or current placement in a remedial
special education program. Medical personnel in hemophilia treatment centers
instruct parents to obtain prompt treatment for any injury to the child's head
to avoid the potential consequences of intracranial bleeding. In the HGDS,
possible head trauma was recorded if parents reported either that their son
saw a physician or infused blood-clotting factor because of an injury to the
head. Nelson et al. (1999)
studied the incidence and prevalence of intracranial hemorrhage in the HGDS
sample. At baseline, 38% of participants reported a history of head trauma,
but only 5% of these showed evidence of intracranial bleeding on the MRI. No
incidence of reported head trauma during the 4-year follow-up resulted in
skull fracture or required surgery. The authors thus concluded that on-demand
treatment with blood-clotting factor was sufficient to limit intracranial
bleeding and that reported head injuries were not severe. In this analysis,
among participants with hemophilia, those with a history of possible head
trauma at baseline and those without such history were similar and within age
expectations in VIQ (M = 103.5 vs. 109.1, respectively, p
= .15) and PIQ (M = 106.0 vs. 111.1, p = .13). This result,
combined with Nelson et al.'s findings, supports the view that, in this study,
reported head trauma in participants with hemophilia reflects parents' efforts
to treat quickly any injury that might lead to intracranial bleeding, not that
the rate of head trauma is higher than in the siblings without hemophilia.
Nevertheless, because of the potential consequences of head trauma in children
with hemophilia and because possible head trauma was associated with lowered
neuropsychological performance in a previous analysis of baseline data from
the HGDS (Sirois et al., 1998), we chose to control for this variable in the
statistical
analyses.
Procedures
Parents, legal guardians, or older participants were interviewed at
baseline to obtain developmental and educational histories and parents' level
of formal education. The Wechsler scales were administered as part of a
comprehensive neuropsychological battery
(Stehbens et al., 1997). The
WISC-R was employed for participants ages 6-16 and the WAIS-R for those 17 and
older. The data for the cohort with hemophilia were collected at baseline and
1-year intervals for 4 years, providing five successive annual assessments.
The data for the sibling cohort were collected at baseline and one 2-year
follow-up, providing two assessments over a 2-year interval. The examinations
were performed by psychologists or psychological examiners experienced in
conducting standardized assessments with children and adolescents. At the end
of each session, examiners rated the validity of the data on a scale of 1
(valid) to 4 (invalid). Only those data rated 1 (valid) were included in this
analysis.
Statistical Methods
The Wechsler VIQ, PIQ, and subtest scaled scores were included in the
analyses. Age-scaled subtest scores were used in the analyses of WAIS-R data.
The Mazes subtest of the WISC-R was excluded because there is no subtest
analogous to it in the WAIS-R. Descriptive statistics and simple correlations
(Pearson r) were used to summarize the data. Paired t tests
were used to compare each of the baseline and annual follow-up scores with
each other. When testing an increase or decrease, as indicated in the text, we
used a one-sided test; when testing a difference, we used a two-sided
test.
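As a minimal sketch of the paired comparison described above, the following Python computes the paired t statistic for one baseline and follow-up pair. The scores are made-up illustrative values, not HGDS data, and the function name `paired_t` is hypothetical. Converting t to a one-sided or two-sided p value requires the t distribution, which a statistics package would supply; that step is omitted here.

```python
import math
import statistics


def paired_t(before, after):
    """Return (t, df) for a paired t test on the differences after - before."""
    if len(before) != len(after):
        raise ValueError("paired samples must have equal length")
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD (n - 1 denominator)
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1


# Hypothetical PIQ scores for five participants (illustration only).
baseline = [100, 105, 98, 110, 102]
followup = [104, 107, 101, 115, 103]

t_stat, df = paired_t(baseline, followup)
print(f"t = {t_stat:.2f}, df = {df}")
```

The same t statistic serves both kinds of test; only the tail area used for the p value differs between the one-sided and two-sided cases.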
The following variables, relevant to the hypotheses of the study, were
included as potential covariates: baseline scores (VIQ, PIQ, and subtest
scaled scores), age at baseline, test administered (WISC-R vs. WAIS-R), number
of test-specific retests for each test version, and number of WISC-Rs
completed before transition to the WAIS-R. In addition, the following
variables, selected from earlier reports of HGDS data
(Loveland et al., 2000; Sirois et al., 1998), were
included to control for their potential influence on test performance:
parent's level of formal education (father's if present, otherwise mother's
education), history of academic problems at baseline, history of possible head
trauma at baseline, and possible head trauma while on study. Test taking
before enrollment into the HGDS was not included in the analyses.
Regression analysis was used to quantify the average change in test scores
with each repeated administration and to determine whether retest performance
might be predicted from baseline values of the selected covariates. We used
step-wise and all-subsets methods to determine the significant covariates. The
step-wise analysis used backward elimination to determine which variables (one at
a time) should be excluded from the model. The all-subsets regression examined all
possible combinations of covariates that could be included in the final model.
Interactions between covariates were evaluated to determine whether any
variable affected the retest effect differently for subgroups of other
variables. Transformations (logarithmic, quadratic, and square-root) of the
independent variables (i.e., nonlinear terms) were included in linear analyses
and compared to the linear model to determine whether any nonlinear
relationships existed in the data. Results with p ≤ .05 were
considered significant. When correction for multiple comparisons was relevant,
we used the Bonferroni method. All analyses were conducted using SAS version
6.12 (SAS Institute, Inc.) or S-PLUS version 3.3 or 4.0 (Mathsoft, Inc.).
| Results |
|---|
Cohort With Hemophilia
Covariates. Baseline VIQ was the only covariate significantly related to changes in VIQ over time. Baseline PIQ and number of test-specific retests were the only two covariates significantly related to changes in PIQ. The variables of baseline age, parental education, history of academic problems, and possible head trauma before or during study were not significantly related to longitudinal changes in test performance. The issue of head trauma in patients with hemophilia is important because of the potential for adverse intellectual outcomes, but our data do not confirm such outcomes as related to parental reports of possible head trauma. There were no significant interactions between any of the covariates in the analysis; therefore, the covariate effects may be interpreted directly. We tested the interaction between baseline IQ and practice effect to determine whether repeated testing for individuals with different baseline IQ scores produced different practice effects. The model was not significant, indicating that the effect of repeated testing was relatively equivalent across the range of baseline IQ scores in this study (Table I).
VIQ and PIQ. The unadjusted means, standard deviations, and
standard errors of measurement of the VIQ and PIQ scores at baseline and each
annual follow-up are shown in Table
II. VIQ decreased slightly, and PIQ increased steadily during the
4 years of study. The longitudinal changes in VIQ and PIQ from one time of
measurement to another are presented in
Table III; p values
are shown for the paired t tests, and Pearson correlations
(r) are provided to show the stability of the measures from year to
year. The scores were highly correlated during the four years of study. Both
VIQ and PIQ were stable (r = .78-.91 and .72-.88, respectively), and
the stability of both scores decreased with increasing time since baseline
(r = .82-.78 and .83-.72, respectively). The magnitudes of change in
VIQ and PIQ at the end of the second year (-1.94 and 4.8 points, respectively)
were less than or comparable to the average standard errors of measurement
reported for the WISC-R after a 1-month retest interval (3.6 and 4.66 points,
respectively; Wechsler, 1974, p. 30) and for the WAIS-R after an interval of
1-7 weeks (2.74 and 4.14 points, respectively; Wechsler, 1981, p. 33).
As shown in Table III, the only significant difference in mean VIQ was between the scores obtained at baseline and the fourth annual follow-up (-2.63 points, p =.034). PIQ increased significantly over baseline beginning with the second annual follow-up (4.8 points, p =.0001), but the changes were not significant between each successive annual evaluation. Instead, the increases in PIQ were significant only between the scores obtained at annual 1 versus 2, annual 1 versus 3, annual 1 versus 4, and annual 2 versus 4. This pattern of results suggests a cumulative effect of experience such that three exposures over 2 years were required to produce a significant improvement in PIQ. The finding of a significant difference in PIQ between annual 1 versus 2 but not between annual 2 versus 3 or annual 3 versus 4 suggests a declining benefit from repeated experience with the measures. Although a square-root model provided some theoretical appeal in this context, it did not provide significantly different results from the linear model we used; thus, the simpler (linear) model was chosen. This conclusion was based on testing the difference in the significance of the models as well as examining residual plots. If, however, one were to expand the analysis to a period longer than 4 years, we recommend that a nonlinear model be considered to capture this effect.
Subtest Analysis. Table IV shows the unadjusted means and standard deviations for the Verbal and Performance Scale subtest scores (WISC-R and WAIS-R combined). The Vocabulary score declined significantly over time (average change = -0.23 points), but the changes in the remaining Verbal Scale subtests were not significant. All of the scores on the Performance Scale subtests increased significantly (average change = 0.21-0.44 points).
Transition from WISC-R to WAIS-R. Fourteen (n = 14) adolescents with hemophilia reached their seventeenth birthday during the study. Their mean VIQ scores increased from 103.3 on the last WISC-R to 107.5 on the first WAIS-R (SD = 18.6 vs. 19.0, SEM = 5.2 vs. 5.3, respectively). Their mean PIQ scores declined from 115.8 on the last WISC-R to 109.8 on the first WAIS-R (SD = 17.0 vs. 16.2, SEM = 4.7 vs. 4.5, respectively). The changes in VIQ (4.2 points) and PIQ (-6.0 points) were not significant. The number of times a participant completed the WISC-R was not related to his performance on the first WAIS-R.
Final Models: WISC-R and WAIS-R Combined. The final models, using combined WISC-R and WAIS-R IQ scores at baseline and each annual follow-up, were $\mathrm{VIQ}_x = 23.63 + 0.76(\mathrm{VIQ}_0)$ and $\mathrm{PIQ}_x = 13.86 + 0.87(\mathrm{PIQ}_0) + 2.37(\mathrm{rep})$ (p < .00005 for the "rep" variable), where $\mathrm{IQ}_x$ is the IQ at annual follow-up x, $\mathrm{IQ}_0$ is the IQ at baseline, and rep is the number of repeated administrations. Although WISC-R and WAIS-R scores were combined in these models, the test versions were treated separately with respect to repeated administrations to determine the effect of test-specific retesting. Thus, for a participant who aged into the WAIS-R, rep = 0 for the first administration of the WAIS-R. The models indicate that VIQ at retest was not significantly related to the number of repeated assessments, but PIQ increased by 2.37 points, on average, with each annual evaluation. The $R^2$ for both VIQ and PIQ was .64, indicating that 64% of the variance in retest VIQ was accounted for by baseline VIQ, and 64% of the variance in retest PIQ was explained by baseline PIQ and the number of test-specific retests.
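To make the combined models concrete, the following Python sketch evaluates them for a single participant. The coefficients are those reported above; the function names and the example baseline score of 100 are illustrative assumptions, and the output is an expected (average) retest score, not a prediction for any individual.

```python
def predict_viq(baseline_viq):
    """Expected VIQ at an annual follow-up under the combined model.

    The model has no retest term, so the value is the same at every follow-up.
    """
    return 23.63 + 0.76 * baseline_viq


def predict_piq(baseline_piq, rep):
    """Expected PIQ at follow-up under the combined model.

    rep is the number of test-specific repeated administrations.
    """
    return 13.86 + 0.87 * baseline_piq + 2.37 * rep


# Hypothetical participant with baseline VIQ and PIQ of 100 (illustration only).
print(f"VIQ at any follow-up: {predict_viq(100):.2f}")
for rep in range(1, 5):
    print(f"PIQ after {rep} retest(s): {predict_piq(100, rep):.2f}")
```

Because both baseline coefficients are below 1, the expected retest scores are pulled toward the sample mean: baseline scores well above average yield somewhat lower expected retest values, and baseline scores well below average yield somewhat higher ones, apart from the per-retest gain in PIQ.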
Final Models: WISC-R and WAIS-R Separately. Using the WISC-R only (283 observations), the final models were $\mathrm{VIQ}_x = 23.19 + 0.75(\mathrm{VIQ}_0)$ and $\mathrm{PIQ}_x = 14.65 + 0.86(\mathrm{PIQ}_0) + 2.31(\mathrm{rep})$ (p < .00005 for the rep variable). Using the WAIS-R only (37 observations), the final models were $\mathrm{VIQ}_x = 15.12 + 0.85(\mathrm{VIQ}_0)$ and $\mathrm{PIQ}_x = 6.69 + 0.93(\mathrm{PIQ}_0) + 3.25(\mathrm{rep})$ (p = .047 for the rep variable). Thus, there were no significant changes in VIQ following repeated annual assessments with either test, but PIQ increased, on average, by 2.31 points for every annual WISC-R and 3.25 points for every annual WAIS-R.
Comparison With Sibling Cohort
VIQ and PIQ. The siblings demonstrated a nonsignificant decrease
of 0.07 points in VIQ after the 2-year retest interval (M = 104.37
± 15.61 at baseline vs. 104.30 ± 15.89 at retest, p = .49).
This change was not significantly different from the decrease of 2.21
points in VIQ for the group with hemophilia at their first annual retest
(p = .48). There was a significant increase of 3.8 points in the
siblings' PIQ at the 2-year follow-up (M = 105.63 ± 15.01 at
baseline vs. 109.43 ± 16.28 at retest, p = .023), but this
increase was not significantly different from the increase of 1.66 points
observed in the group with hemophilia at their first annual retest
(p = .46). While there is insufficient statistical power to verify whether
children and adolescents with hemophilia and their male siblings without
hemophilia have equivalent retest effects, these results provide reassurance
of consistent trends for VIQ and PIQ with repeated testing.
Subtest Analysis. The siblings demonstrated no significant differences in performance on the Verbal Scale subtests following the 2-year retest interval. The only significant change among the Performance Scale subtests was an increase of 1.24 points on the Object Assembly task (M = 10.33 ± 3.07 at baseline vs. 11.57 ± 3.82 at retest, p = .034).
Final Models: WISC-R and WAIS-R Combined. The final models for the sibling cohort, using combined WISC-R and WAIS-R scores at baseline and the 2-year follow-up, were $\mathrm{VIQ} = 6.93 + 0.93(\mathrm{VIQ}_0)$ and $\mathrm{PIQ} = 17.94 + 0.87(\mathrm{PIQ}_0)$. The $R^2$ for VIQ was 0.84 and for PIQ was 0.64. These results indicate that, for the siblings, baseline VIQ accounted for 84% of the variance in retest VIQ, and baseline PIQ accounted for 64% of the variance in retest PIQ.
| Discussion |
|---|
The findings of this study are consistent with previous reports that Wechsler VIQ and PIQ scores are highly reliable over time and that the reliabilities of both VIQ and PIQ are inversely related to the length of the retest interval. The results support hypothesis 1 that VIQ remains relatively unchanged while PIQ increases with successive annual assessments. Although VIQ decreased in the cohort with hemophilia, the decline was not statistically significant until the fourth annual follow-up, that is, after participants had completed the Verbal Scale on five occasions over a 4-year period. PIQ increased with each annual evaluation, but the improvements did not reach statistical significance until the second annual follow-up, that is, after participants had completed the Performance Scale on three occasions over a 2-year period. PIQ continued to increase during the next 2 years but less than in the first 2 years. The gains were not significant between the second and third years or between the third and fourth years. These findings suggest that there is a declining benefit from repeated experience with the test materials. The sibling group demonstrated a nonsignificant decrease in VIQ and a significant increase in PIQ after a 2-year retest interval. The magnitudes of change were not significantly different from the changes in the group with hemophilia after a 1-year interval, providing reassurance of consistent trends in VIQ and PIQ with repeated testing.
Hypothesis 2 is partially supported by the data in that longitudinal
changes in VIQ and PIQ were related to participants' IQ scores at baseline.
The data do not support the hypothesis that retest performance is related to
age at baseline for school-age children and adolescents. The discrepancy is
likely due to differences in the populations studied (older adults in the
Mitrushina & Satz, 1991, study) and to the small number of studies of retest
effects on the WISC-R and lack of data about adolescents' performance on
repeated administrations of the WAIS-R (Schuerger & Witt, 1989). Differences in
the age ranges of norms for children versus
adults may also affect interpretation of test results obtained from
cross-sectional and longitudinal studies. For example, 3-month intervals are
used in the calculation of IQ scores on the WISC-R versus 2- to 10-year
intervals for IQs on the WAIS-R.
The results of the subtest analyses support hypothesis 3. For the cohort with hemophilia, significant improvements occurred in all of the Performance Scale subtest scores, and the siblings showed significant gains on the Object Assembly task. We did not hypothesize a change in VIQ, but the data show that all but two of the Verbal Scale subtests (Information and Digit Span) declined over time. The decline in the Vocabulary subtest, though small, was significant between baseline and the end of the fourth annual evaluation, consistent with the decline in VIQ over the same period.
The data partially support hypothesis 4 concerning the expected effect of
transition from one age-appropriate Wechsler scale to the other. For the
subset who aged into the WAIS-R, we expected to find relatively equivalent VIQ
and significantly lower PIQ scores on the first WAIS-R under the assumption
that changes in the stimuli of the Performance Scale tasks would significantly
reduce the benefit associated with familiarity with the items of the WISC-R.
Instead, we found an increase in VIQ and a decrease in PIQ, but neither change
was significant. The number of times the WISC-R was completed before
introducing the WAIS-R was unrelated to performance on the first WAIS-R.
Although we had limited statistical power to examine transition effects in the
subset that aged into the WAIS-R, our findings that VIQ increased and PIQ
decreased at the time of transition are consistent with data available from
the four previous studies of test transition effects on Wechsler scales
(Mulhern et al., 1992; Neyens & Aldenkamp, 1997; Rasbury et al., 1977; Usner &
Fitzgerald, 1999).
The findings of this study provide support for the use of the Wechsler scales for annual evaluations to monitor cognitive development. More specifically, consecutive annual evaluations with the Verbal Scale for at least 4 years and with the Performance Scale for at least 2 years are essentially free of significant practice effects. By the end of the third annual evaluation, the increase in PIQ exceeded the average standard errors of measurement for the WISC-R and WAIS-R, and by the end of the fourth year, PIQ had increased by approximately 7.5 points. These results, combined with the lack of a transition effect between the WISC-R and WAIS-R, indicate that the annual improvements in PIQ after the first 2 years were influenced more by cumulative experience with the tasks than by random error. From these data, it seems likely that, for PIQ, the greater benefit of retesting lies not in the recollection of specific items on the Performance Scale but in the potential to learn strategies for success that are later generalized to new stimuli.
Two possibilities may account for the decline in VIQ observed in this
study. Previous studies revealed that children and adolescents with hemophilia
perform at age level on measures of general intellectual ability but below age
expectations on measures of academic achievement and other cognitive abilities
such as attention and visual processing
(Loveland et al., 2000; Sirois & Hill, 1993; Whitt et al., 1993). The
reasons for the finding of lower achievement are not known but may be related
to school absenteeism. In a study of elementary school children (ages 6-12)
with hemophilia, those with higher numbers of bleeding episodes missed more
days of school and earned lower scores on measures of mathematics and overall
achievement than those with fewer bleeding episodes
(Shapiro et al., 2000).
Performance on language-based tasks depends on prior learning and is
influenced by formal education. If children with hemophilia have lower school
achievement, they may not keep up with their peers in areas such as new
vocabulary; thus, their scores on standardized measures of verbal ability
might decline over time. Alternatively, language-based tasks may be less
interesting to children and adolescents than the more perceptually oriented,
hands-on problems of the Performance Scale. Children may perform at their best
when verbal tasks are novel but become less motivated with succeeding
presentations of the same items.
Although the group differences were not statistically significant, the
siblings gained more in PIQ and lost less in VIQ on their first retest than
the group with hemophilia, despite a longer retest interval. Subtle difficulties
in visual processing (Whitt et al., 1993) and acquisition of verbal skills (Loveland et al., 2000; Sirois & Hill, 1993) in
the group with hemophilia may have contributed to the differences in PIQ and
VIQ, respectively.
This study is limited by the single-sex sample, by the size of the sibling
comparison group, by the small subset who aged into the WAIS-R, and by the
range of IQs in the sample. These factors all reduce the generalizability of
our findings. The fact that the data were obtained with the WISC-R presents a
potential limitation, given that its successor, the WISC-III (Wechsler, 1991), is currently
in standard use. A study of the stability of the WISC-III over a 3-year
interval (Canivez & Watkins, 1998), however, indicated that the test-retest reliabilities were
as high as or higher than those of the WISC-R and that there were no significant changes in IQ
scores. These results are consistent with Vance, Hankins, and Brown's (1987) findings with respect
to changes in the WISC-R after 3-year intervals and with the results obtained
in this study. Thus, the pattern of findings in our study likely will be
replicated in studies of retest effects on the WISC-III.
Several questions relevant to longitudinal research in pediatrics remain for future study. What are the effects of repeated testing with measures of infant developmental status? Do preschool-age children show the same patterns of change in test performance as older children and adolescents? How should longitudinal analyses account for age-appropriate changes in test instruments, especially when different psychological constructs are measured (for example, when toddlers transition from measures of developmental status to measures of general cognitive ability)? Ideally, investigations into these questions will be conducted with children without medical or educational difficulties. Based on the findings of this study, analyses of longitudinal data should include baseline test scores, number of test-specific retests, and any environmental or disease-specific variables that may influence performance on measures of psychological abilities. Such variables might be incorporated into the research design either as exclusionary criteria or by statistical control. In these ways, future researchers may enhance the interpretation of longitudinal test results.
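As a minimal sketch of the statistical-control approach recommended above, the following Python example regresses a change score on baseline score and number of test-specific retests using ordinary least squares. Everything in it is an assumption made for illustration: the function names (`solve_linear_system`, `fit_ols`), the centering of baseline scores at 100, and the noise-free synthetic scores generated from arbitrary coefficients. None of the values are data or estimates from this study, and the regression models reported in this article were not necessarily specified in this way.

```python
# Illustrative sketch only: synthetic scores, NOT data from the study.
# All names and coefficient values below are hypothetical.


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augment the matrix so row operations apply to both sides at once.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system; predictors may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def fit_ols(rows, outcome):
    """Return least-squares coefficients [intercept, b1, b2, ...]
    by solving the normal equations (X'X) b = X'y."""
    design = [[1.0] + [float(v) for v in r] for r in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)]
           for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, outcome)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Arbitrary generating coefficients (assumptions, not study estimates).
ASSUMED_INTERCEPT = 1.0
ASSUMED_BASELINE_SLOPE = -0.1   # change per point of baseline above 100
ASSUMED_RETEST_SLOPE = 2.0      # change per additional retest

predictors = []
changes = []
for baseline in (85, 92, 100, 108, 115):
    for retests in (1, 2, 3, 4):
        centered = baseline - 100  # center baseline at the scale mean
        predictors.append((centered, retests))
        changes.append(ASSUMED_INTERCEPT
                       + ASSUMED_BASELINE_SLOPE * centered
                       + ASSUMED_RETEST_SLOPE * retests)

intercept, baseline_slope, retest_slope = fit_ols(predictors, changes)
print("intercept:", round(intercept, 3))
print("baseline slope:", round(baseline_slope, 3))
print("retest slope:", round(retest_slope, 3))
```

Because the synthetic scores contain no noise, the fit simply recovers the generating coefficients, which serves only as a check on the arithmetic. With real data, environmental or disease-specific covariates would enter as additional predictor columns, and repeated observations on the same child would call for a model that accounts for within-person correlation.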
| Acknowledgments |
|---|
We are indebted to the children, adolescents, and parents who volunteered to participate in this study and to the members of the Hemophilia Treatment Centers. The study was supported by the Bureau of Maternal and Child Health and Resources Development (MCJ-060570), the National Institute of Child Health and Human Development (NO1-HD-4-3200), the Centers for Disease Control and Prevention, the Laboratory of Genomic Diversity of the National Cancer Institute, and the National Institute of Mental Health. Additional support was provided by grants from the National Center for Research Resources of the National Institutes of Health to the New York Hospital-Cornell Medical Center Clinical Research Center (MO1-RR06020), the Mount Sinai General Clinical Research Center, New York (MO1-RR00071), the University of Iowa Clinical Research Center (MO1-RR00059), and the University of Texas Health Science Center, Houston (MO1-RR02558). The following individuals are the Center Directors, Study Coordinators, or Committee Chairs of the study: Childrens Hospital Los Angeles: E. Gomperts, MD, W.-Y. Wong, MD, F. Kaufman, MD, M. Nelson, MD, S. Pearson, RN; The New York Hospital-Cornell Medical Center: M. Hilgartner, MD, S. Cunningham-Rundles, PhD, I. Goldberg, RN; University of Texas Medical School, Houston: W. K. Hoots, MD, K. Loveland, PhD, M. Cantini, RN; National Institutes of Health, National Institute of Child Health and Human Development: A. Willoughby, MD, MPH, Robert Nugent, PhD; New England Research Institutes, Inc.: S. McKinlay, PhD; Rho, Inc.: S. Donfield, PhD; Baylor College of Medicine: C. Contant, Jr., PhD; University of Iowa Hospitals and Clinics: C. T. Kisker, MD, J. Stehbens, PhD, S. O'Conner, J. McKillip, RN; Tulane University Health Sciences Center: P. Sirois, PhD; Children's Hospital of Oklahoma: C. Sexauer, MD, H. Huszti, PhD, F. Kiplinger, S. Hawk, P.A.-C.; Mount Sinai Medical Center: S. Arkin, MD, A. Forster, RN; University of Nebraska Medical Center: S. Swindells, MD, S. Richard; University of Texas Health Science Center, San Antonio: J. Mangos, MD, A. Scott, PhD, R. Davis, RN; Children's Hospital of Michigan: J. Lusher, MD, I. Warrier, MD, K. Baird-Cox, RN, MSN; Milton S. Hershey Medical Center: M. E. Eyster, MD, D. Ungar, MD, S. Neagley, RN, MA; Indiana Hemophilia and Thrombosis Center: A. Shapiro, MD, J. Morris, PNP; University of California-San Diego Medical Center: G. Davignon, MD, P. Mollen, RN; Kansas City School of Medicine, Children's Mercy Hospital: B. Wicklund, MD, A. Mehrhof, RN, MSN.
Received April 3, 2000; revision received February 16, 2001; accepted July 5, 2001
| References |
|---|
Bale, J. F. Jr., Contant, C. F., Garg, B., Tilton, A., Kaufman, D. M., & Wasiewski, W. (1993). Neurologic history and examination results and their relationship to human immunodeficiency virus type 1 serostatus in hemophilic subjects: Results from the Hemophilia Growth and Development Study. Pediatrics, 91, 736-741.
Canivez, G. L., & Watkins, M. W. (1998). Long-term stability of the Wechsler Intelligence Scale for Children-Third Edition. Psychological Assessment, 10, 285-291.
Hilgartner, M. W., Donfield, S. M., Willoughby, A., Contant, C. F. Jr., Evatt, B. L., Gomperts, E. D., Hoots, W. K., Jason, J., Loveland, K. A., McKinlay, S. M., & Stehbens, J. A. (1993). Hemophilia Growth and Development Study: Design, methods, and entry data. American Journal of Pediatric Hematology/Oncology, 15, 208-218.
Horton, A. M. Jr. (1992). Neuropsychological practice effects x age: A brief note. Perceptual and Motor Skills, 75, 257-258.
Kaufman, A. S. (1979). Intelligent testing with the WISC-R. New York: Wiley-Interscience.
Loveland, K. A., Stehbens, J., Contant, C., Bordeaux, J. D., Sirois, P., Bell, T. S., Hill, S., Scott, A., Bowman, M., Schiller, M., Watkins, J., Olson, R., Moylan, P., Cool, V., & Belden, B. (1994). Hemophilia Growth and Development Study: Baseline neurodevelopmental findings. Journal of Pediatric Psychology, 19, 223-239.
Loveland, K. A., Stehbens, J. A., Mahoney, E. M., Sirois, P. A., Nichols, S., Bordeaux, J. D., Watkins, J. M., Amodei, N., Hill, S. D., Donfield, S., & the Hemophilia Growth and Development Study (2000). Declining immune function in children and adolescents with hemophilia and HIV infection: Effects on neuropsychological performance. Journal of Pediatric Psychology, 25, 309-322.
Mitchell, W. G., Lynn, H., Bale, J. F. Jr., Maeder, M. A., Donfield, S. M., Garg, B., Tilton, A. H., Willis, J. K., & Bohan, T. P. (1997). Longitudinal neurological follow-up of a group of HIV-seropositive and HIV-seronegative hemophiliacs: Results from the Hemophilia Growth and Development Study. Pediatrics, 100, 817-824.
Mitchell, W. G., Nelson, M. D., Contant, C. F., Bale, J. F. Jr., Wilson, D. A., Bohan, T. P., & Fenstermacher, M. J. (1993). Effects of human immunodeficiency virus and immune status on magnetic resonance imaging of the brain in hemophilic subjects: Results from the Hemophilia Growth and Development Study. Pediatrics, 91, 742-746.
Mitrushina, M., & Satz, P. (1991). Effect of repeated administration of a neuropsychological battery in the elderly. Journal of Clinical Psychology, 47, 790-801.
Moffitt, T. E., Caspi, A., Harkness, A. R., & Silva, P. A. (1993). The natural history of change in intellectual performance: Who changes? How much? Is it meaningful? Journal of Child Psychology and Psychiatry, 34, 455-506.
Mulhern, R. K., Ochs, J., & Fairclough, D. (1992). Deterioration of intellect among children surviving leukemia: IQ test changes modify estimates of treatment toxicity. Journal of Consulting and Clinical Psychology, 60, 477-480.
Nelson, M. D. Jr., Maeder, M. A., Usner, D., Mitchell, W. G., Fenstermacher, M. J., Wilson, D. A., Gomperts, E. D., & the Hemophilia Growth and Development Study (1999). Prevalence and incidence of intracranial haemorrhage in a population of children with haemophilia. Haemophilia, 5, 306-312.
Neyens, L. G. J., & Aldenkamp, A. P. (1997). Stability of cognitive measures in children of average ability. Child Neuropsychology, 3, 161-170.
Rasbury, W., McCoy, J. G., & Perry, N. W. Jr. (1977). Relations of scores on WPPSI and WISC-R at a one-year interval. Perceptual and Motor Skills, 44, 695-698.
Schuerger, J. M., & Witt, A. C. (1989). The temporal stability of individually tested intelligence. Journal of Clinical Psychology, 45, 294-302.
Shapiro, A. D., Donfield, S. M., Lynn, H. S., Bray, G. L., Cool, V. A., Hunsberger, S. L., Stehbens, J. A., & Gomperts, E. D. (2000, December). Academic achievement in children with severe hemophilia A. Poster session presented at the annual meeting of the American Society of Hematology, San Francisco, CA.
Sirois, P. A., & Hill, S. D. (1993). Developmental change associated with human immunodeficiency virus infection in school-age children with hemophilia. Developmental Neuropsychology, 9, 177-197.
Sirois, P. A., Usner, D. W., Hill, S. D., Mitchell, W. G., Bale, J. F. Jr., Loveland, K. A., Stehbens, J. A., Donfield, S. M., Maeder, M. A., Amodei, N., Contant, C. F. Jr., Nelson, M. D. Jr., Willis, J. K., & the Hemophilia Growth and Development Study (1998). Hemophilia Growth and Development Study: Relationships between neuropsychological, neurological, and MRI findings at baseline. Journal of Pediatric Psychology, 23, 45-56.
Stehbens, J. A., Loveland, K. A., Bordeaux, J. D., Contant, C., Schiller, M., Scott, A., Moylan, P. M., & Maeder, M. (1997). A collaborative model for research: Neurodevelopmental effects of HIV-1 in children and adolescents with hemophilia as an example. Children's Health Care, 26, 115-135.
Terman, L. M., & Merrill, M. A. (1937). Measuring intelligence. Boston: Houghton Mifflin.
Terman, L. M., & Merrill, M. A. (1960). Stanford-Binet Intelligence Scale: Manual for the third revision, Form L-M. Boston: Houghton Mifflin.
Tuma, J. M., & Appelbaum, A. S. (1980). Reliability and practice effects of WISC-R IQ estimates in a normal population. Educational and Psychological Measurement, 40, 671-678.
Usner, D., & Fitzgerald, G. (1999). Analytical implications of changing neuropsychological test versions during a longitudinal study due to aging of a pediatric cohort [Letter to the editor]. Controlled Clinical Trials, 20, 476-478.
Vance, B., Hankins, N., & Brown, W. (1987). A longitudinal study of the Wechsler Intelligence Scale for Children-Revised over a six-year period. Psychology in the Schools, 24, 229-233.
Wechsler, D. (1949). Manual for the Wechsler Intelligence Scale for Children. San Antonio: Psychological Corporation.
Wechsler, D. (1955). Manual for the Wechsler Adult Intelligence Scale. San Antonio: Psychological Corporation.
Wechsler, D. (1974). Manual for the Wechsler Intelligence Scale for Children-Revised. San Antonio: Psychological Corporation.
Wechsler, D. (1981). Wechsler Adult Intelligence Scale-Revised. San Antonio: Psychological Corporation.
Wechsler, D. (1991). Manual for the Wechsler Intelligence Scale for Children-Third Edition. San Antonio: Psychological Corporation.
Whitt, J. K., Hooper, S. R., Tennison, M. B., Robertson, W. T., Gold, S. H., Burchinal, M., Wells, R., Campbell, M., Whaley, R. A., Combest, J., & Hall, C. D. (1993). Neuropsychologic functioning of human immunodeficiency virus-infected children with hemophilia. Journal of Pediatrics, 122, 52-59.