Journal of Pediatric Psychology, Vol. 27, No. 2, 2002, pp. 121-131
© 2002 Society of Pediatric Psychology

Quantifying Practice Effects in Longitudinal Research With the WISC-R and WAIS-R: A Study of Children and Adolescents With Hemophilia and Male Siblings Without Hemophilia

Patricia A. Sirois, PhD1, Michael Posner, MS2, James A. Stehbens, PhD3, Katherine A. Loveland, PhD4, Sharon Nichols, PhD5, Sharyne M. Donfield, PhD6, Terece S. Bell, PhD7, Suzanne D. Hill, PhD8 and Nancy Amodei, PhD9 the Hemophilia Growth and Development Study

1 Tulane University Health Sciences Center, 2 Boston University School of Public Health, 3 University of Iowa College of Medicine, Iowa City, 4 University of Texas Medical School, Houston, 5 University of California, San Diego, 6 Rho, Inc., Chapel Hill, North Carolina, 7 Childrens Hospital Los Angeles, 8 Williamsburg, Virginia, 9 University of Texas, San Antonio

All correspondence should be sent to Patricia A. Sirois, Department of Pediatrics, Tulane University Health Sciences Center, 1430 Tulane Avenue (TW-41), New Orleans, Louisiana 70112. E-mail: psirois@tulane.edu.


    Abstract
 
Objective: To quantify practice effects associated with annual administrations of WISC-R and WAIS-R in children and adolescents with and without hemophilia.

Methods: Participants were young men (age: 7-19; 80 with hemophilia, 30 siblings) enrolled in the Hemophilia Growth and Development Study. Participants with hemophilia completed age-appropriate Wechsler scales at baseline and at four annual follow-ups; the siblings, at baseline and one 2-year follow-up. Regression analyses were used to quantify average changes in scores, adjusting for variables related to test performance.

Results: Consecutive annual evaluations were free of significant practice effects for 4 years with the Verbal Scale and for 2 years with the Performance Scale. VIQ decreased, and PIQ increased over time. Baseline VIQ was related to changes in VIQ; baseline PIQ and number of test-specific retests were related to changes in PIQ.

Conclusions: The findings support use of Wechsler scales for annual evaluations to monitor cognitive development in children and adolescents.

Key words: longitudinal assessment; retest effects; WISC-R; WAIS-R; chronic illness; sibling comparisons.


    Introduction
 
Repeated assessments of cognitive abilities in children and adolescents are increasingly common in research and applied settings, particularly in clinical trials of therapies to treat conditions such as cancer and HIV disease. Children and adolescents may demonstrate improvements in performance on successive administrations of psychological tests due to previous experience with the test materials rather than to improvements in cognitive performance. Thus, standard scores that do not increase over time may not necessarily reflect cognitive stability because the gain due to previous experience may compensate for a slowing of cognitive growth. Reports of changes in mean scores following repeated administrations of measures of general intelligence are available in the literature, but the data are difficult to integrate because of differences in retest interval, statistical methods, and characteristics of study participants. Reports of retest effects associated with successive annual assessments are rare. Regardless of the purpose of the assessments, knowledge of the relative contribution of retest effects is important to an understanding of the meaning of changes in test performance over time.

Research on test reliability in populations of children and adolescents is based largely on data obtained in school settings, where periodic assessments are employed in educational planning and placement decisions. Tuma and Appelbaum (1980) were the first to report data describing practice effects on the Wechsler Intelligence Scale for Children-Revised (WISC-R; Wechsler, 1974). Using a 6-month retest interval, the authors reported significant gains in Full Scale IQ (FSIQ; 4.73 points) and Performance Scale IQ (PIQ; 7.82 points) but no significant gain in Verbal Scale IQ (VIQ; 1.09 points). These changes were smaller than those reported in the test manual for a 1-month retest interval (7, 9.5, and 3.5 points, respectively; Wechsler, 1974, p. 31), indicating that the reliability of the WISC-R is inversely related to the length of the retest interval. A study of a large birth cohort showed that FSIQ increased 5.3 points over 7 years for children who completed the WISC-R every 2 years between the ages of 7 and 13 (Moffitt, Caspi, Harkness, & Silva, 1993).

Schuerger and Witt (1989) reviewed textbooks, journal articles, and test manuals and found 34 studies of test-retest reliabilities on the WISC and WISC-R (Wechsler, 1949, 1974), the Wechsler Adult Intelligence Scale (WAIS and WAIS-R; Wechsler, 1955, 1981), and the Stanford-Binet Intelligence Scale, excluding the fourth edition (Terman & Merrill, 1937, 1960). Only 5 of the 34 studies included a retest interval of approximately 1 year (range = 10-20 months), only 6 included the WISC-R, and none of the studies of the WAIS-R included participants under the age of 20. Multiple regression procedures were employed; the dependent variable was the average test-retest correlation in each study, and the independent variables were characteristics of the studies, including sample size, retest interval, and age and gender of participants. The results indicated that the tests were highly reliable (mean r = .82) and that reliability increased with older age at first testing (r = .70-.96) and decreased with increasing length of retest interval (r = .89-.64).

Horton (1992) presented a brief review of the literature regarding practice effects on the WAIS-R. He noted that improvements in IQ scores are inversely related to the length of the retest interval (consistent with previous studies of the WISC-R) and inversely related to age, with younger individuals showing larger gains than older individuals. In a study of healthy adults (ages 57-85) who completed a short form of the WAIS-R annually for 2 years, Mitrushina and Satz (1991) found small decreases in Verbal Scale scores at all ages, increases in Performance Scale scores for adults 65 and younger, and either stable or declining Performance scores for those older than 65.

We found only four reports addressing the issue of transition between Wechsler test instruments, and only one of these concerned transition between the WISC-R and WAIS-R (Usner & Fitzgerald, 1999). The authors examined the data of 69 adolescents who aged into the WAIS-R during participation in the Hemophilia Growth and Development Study (HGDS). They found significant changes in FSIQ and PIQ but not in VIQ at the time of transition to the first WAIS-R (FSIQ, -3.77 points; PIQ, -7.35 points; and VIQ, +0.7 points). All WAIS-R subtest scores, except Digit Span, decreased significantly (range = -0.7 to -3.4 points). These results are consistent with reports of directional trends in IQ scores at the time of transition from the WPPSI to the WISC-R (Mulhern, Ochs, & Fairclough, 1992; Neyens & Aldenkamp, 1997) and consistent with Rasbury, McCoy, and Perry's (1977) finding of decreases in all WISC-R subtest scores compared to the WPPSI. Usner and Fitzgerald (1999) attributed their findings to loss of benefit derived from increasing familiarity with the items of the Performance Scale of the WISC-R and to differences between the WISC-R and WAIS-R in the scaling of age norms used to calculate the subtest scaled scores.

This study is a secondary analysis of data from the HGDS, a longitudinal study of male children and adolescents designed to examine changes in psychological development, physical growth, and neurological functioning (Hilgartner et al., 1993). The HGDS included individuals with hemophilia and HIV as well as those with hemophilia but without HIV (HIV-). At baseline, the HIV- group was within the average range of general intelligence (Loveland et al., 1994), consistent with other studies of children with hemophilia (Sirois & Hill, 1993; Whitt et al., 1993). Their MRI and EEG results were normal (Mitchell et al., 1993; Sirois et al., 1998). Their abnormalities on the neurological examinations at baseline and during 4 years of follow-up included muscle atrophy and coordination and gait disturbances; these findings were not associated with histories of seizure, head trauma, or intracranial bleeding but were the result of hemophilia-related joint disease (Bale et al., 1993; Mitchell et al., 1997). Thus, the HIV- cohort of the HGDS provides an opportunity to explore longitudinal changes in psychological test performance in children and adolescents with a chronic illness (hemophilia) but without a neurological condition that might cause deterioration of intellectual abilities. Based on data from previous reports, we proposed four hypotheses:

  1. VIQ will remain relatively stable, and PIQ will increase significantly over time.
  2. Retest effects will be related to baseline IQ (Rasbury et al., 1977) and to age at baseline (Horton, 1992; Mitrushina & Satz, 1991; Schuerger & Witt, 1989).
  3. Increases in PIQ will be related to increases in scaled scores on particular subtests of the Performance Scale, for example, Picture Arrangement (Tuma & Appelbaum, 1980) and Object Assembly (Kaufman, 1979).
  4. At the time of transition from WISC-R to WAIS-R, there will be minimal change in VIQ, and PIQ will decrease (Neyens & Aldenkamp, 1997; Usner & Fitzgerald, 1999). Declines in PIQ will be greater for adolescents with more frequent exposure to the WISC-R before completing their first WAIS-R.


    Method
 
Participants
Participants in the HGDS included male children and adolescents with hemophilia (n = 333) and biological brothers (full or half-brothers; n = 47) living in the same household as those with hemophilia. They were recruited from 14 hemophilia treatment centers in the United States. Institutional review board (IRB) approval for protection of human subjects was obtained at all centers, and written informed consent and assent were obtained from each participant, parent, or legal guardian, according to local IRB requirements. Details of the eligibility criteria, design, methods, data management, and quality control procedures are presented in Hilgartner et al. (1993) and Stehbens et al. (1997).

For this analysis, 80 participants with hemophilia and 30 male siblings without hemophilia were selected for study. The siblings were included as a comparison group because hemophilia could have subtle adverse effects on measures of cognitive performance (Loveland et al., 2000; Sirois et al., 1998; Whitt et al., 1993). Individuals with hemophilia and HIV (n = 207) were excluded to avoid the potentially confounding influence of advancing HIV disease on longitudinal test performance (Loveland et al., 2000). Participants with invalid (defined as validity rating > 1; see Procedures) or missing Wechsler data at any time of measurement were excluded to maintain uniformity of the retest interval and number of retests. Of those with hemophilia, 1 was excluded for invalid data, 33 for missing data, and 12 for both reasons. Of the siblings, 11 were excluded for invalid data and 6 for missing data. At baseline, the participants with hemophilia who were included in the study were similar in VIQ (M = 106.0 vs. 104.1, t = 0.617, p = .54) but achieved higher PIQ scores (M = 108.5 vs. 102.2, t = 2.238, p = .03) than those who were excluded.
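The sample sizes above can be checked against the enrollment and exclusion counts given in the text. The following is a minimal arithmetic sketch; the variable names are illustrative only, and it assumes that the three exclusion categories for the hemophilia group do not overlap.

```python
# Counts taken from the Participants text; variable names are illustrative.
enrolled_hemophilia = 333
enrolled_siblings = 47
hiv_positive = 207  # excluded: hemophilia with HIV

# Exclusions for invalid and/or missing Wechsler data (assumed non-overlapping).
hemophilia_exclusions = {"invalid": 1, "missing": 33, "both": 12}
sibling_exclusions = {"invalid": 11, "missing": 6}

analyzed_hemophilia = (
    enrolled_hemophilia - hiv_positive - sum(hemophilia_exclusions.values())
)
analyzed_siblings = enrolled_siblings - sum(sibling_exclusions.values())

print(analyzed_hemophilia, analyzed_siblings)
```

Under these assumptions the counts reconcile with the 80 participants with hemophilia and 30 siblings reported for this analysis.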

Participant characteristics relevant to this study are shown in Table I. Parental education differs between the groups because the siblings were drawn from a subset of all families enrolled in the HGDS. A history of academic problems was defined as having repeated a grade in school or current placement in a remedial special education program. Medical personnel in hemophilia treatment centers instruct parents to obtain prompt treatment for any injury to the child's head to avoid the potential consequences of intracranial bleeding. In the HGDS, possible head trauma was recorded if parents reported either that their son saw a physician or infused blood-clotting factor because of an injury to the head. Nelson et al. (1999) studied the incidence and prevalence of intracranial hemorrhage in the HGDS sample. At baseline, 38% of participants reported a history of head trauma, but only 5% of these showed evidence of intracranial bleeding on the MRI. No incidence of reported head trauma during the 4-year follow-up resulted in skull fracture or required surgery. The authors thus concluded that on-demand treatment with blood-clotting factor was sufficient to limit intracranial bleeding and that reported head injuries were not severe. In this analysis, among participants with hemophilia, those with a history of possible head trauma at baseline and those without such history were similar and within age expectations in VIQ (M = 103.5 vs. 109.1, respectively, p = .15) and PIQ (M = 106.0 vs. 111.1, p = .13). This result, combined with Nelson et al.'s findings, supports the view that, in this study, reported head trauma in participants with hemophilia reflects parents' efforts to treat quickly any injury that might lead to intracranial bleeding, not that the rate of head trauma is higher than in the siblings without hemophilia. Nevertheless, because of the potential consequences of head trauma in children with hemophilia and because possible head trauma was associated with lowered neuropsychological performance in a previous analysis of baseline data from the HGDS (Sirois et al., 1998), we chose to control for this variable in the statistical analyses.


Table I. Characteristics of Study Participants
 

Procedures
Parents, legal guardians, or older participants were interviewed at baseline to obtain developmental and educational histories and parents' level of formal education. The Wechsler scales were administered as part of a comprehensive neuropsychological battery (Stehbens et al., 1997). The WISC-R was employed for participants ages 6-16 and the WAIS-R for those 17 and older. The data for the cohort with hemophilia were collected at baseline and 1-year intervals for 4 years, providing five successive annual assessments. The data for the sibling cohort were collected at baseline and one 2-year follow-up, providing two assessments over a 2-year interval. The examinations were performed by psychologists or psychological examiners experienced in conducting standardized assessments with children and adolescents. At the end of each session, examiners rated the validity of the data on a scale of 1 (valid) to 4 (invalid). Only those data rated 1 (valid) were included in this analysis.

Statistical Methods
The Wechsler VIQ, PIQ, and subtest scaled scores were included in the analyses. Age-scaled subtest scores were used in the analyses of WAIS-R data. The Mazes subtest of the WISC-R was excluded because there is no subtest analogous to it in the WAIS-R. Descriptive statistics and simple correlations (Pearson r) were used to summarize the data. Paired t tests were used to compare each of the baseline and annual follow-up scores with each other. When testing an increase or decrease, as indicated in the text, we used a one-sided test; when testing a difference, we used a two-sided test.
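As an illustration of the two summary statistics named above, the following sketch computes a paired t statistic and a Pearson correlation using only the Python standard library. The scores are hypothetical and are not study data; the function names are likewise illustrative, and the sketch stops short of computing p values because the standard library has no t distribution.

```python
import math
from statistics import mean, stdev

def paired_t(before, after):
    """Return (t statistic, degrees of freedom) for paired scores."""
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    # Mean difference divided by its standard error.
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1

def pearson_r(x, y):
    """Return the Pearson product-moment correlation of two score lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical PIQ scores for five participants (NOT study data).
baseline = [100, 95, 110, 105, 90]
annual_2 = [104, 101, 113, 111, 95]

t_stat, df = paired_t(baseline, annual_2)
r = pearson_r(baseline, annual_2)
print(f"t({df}) = {t_stat:.2f}, r = {r:.2f}")
```

The choice between a one-sided and a two-sided test affects only how the p value is read from the t distribution, not the t statistic itself. In a statistics package such as SciPy, for example, this is typically selected through the `alternative` argument of `scipy.stats.ttest_rel`.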

The following variables, relevant to the hypotheses of the study, were included as potential covariates: baseline scores (VIQ, PIQ, and subtest scaled scores), age at baseline, test administered (WISC-R vs. WAIS-R), number of test-specific retests for each test version, and number of WISC-Rs completed before transition to the WAIS-R. In addition, the following variables, selected from earlier reports of HGDS data (Loveland et al., 2000; Sirois et al., 1998), were included to control for their potential influence on test performance: parent's level of formal education (father's if present, otherwise mother's education), history of academic problems at baseline, history of possible head trauma at baseline, and possible head trauma while on study. Test taking before enrollment into the HGDS was not included in the analyses.

Regression analysis was used to quantify the average change in test scores with each repeated administration and to determine whether retest performance might be predicted from baseline values of the selected covariates. We used step-wise and all-subsets methods to determine the significant covariates. The step-wise analysis used forward selection to determine which variables (one at a time) should be included in the model. The all-subsets regression examined all possible combinations of covariates that could be included in the final model. Interactions between covariates were evaluated to determine whether any variable affected the retest effect differently for subgroups of other variables. Transformations (logarithmic, quadratic, and square-root) of the independent variables (i.e., nonlinear terms) were included in linear analyses and compared to the linear model to determine whether any nonlinear relationships existed in the data. Results with p ≤ .05 were considered significant. When correction for multiple comparisons was relevant, we used the Bonferroni method. All analyses were conducted using SAS version 6.12 (SAS Institute, Inc.) or S-PLUS version 3.3 or 4.0 (Mathsoft, Inc.).
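To give a sense of the size of an all-subsets search, the sketch below enumerates every non-empty combination of the nine candidate covariates listed in the Statistical Methods. It illustrates the enumeration step only, not the authors' SAS or S-PLUS procedure; the short covariate labels are hypothetical, and the model fitting and comparison that would follow for each subset are omitted.

```python
from itertools import combinations

# Candidate covariates as described in the Statistical Methods text;
# the short identifiers are hypothetical labels for illustration.
COVARIATES = [
    "baseline_score",
    "age_at_baseline",
    "test_version",
    "test_specific_retests",
    "wiscr_count_before_waisr",
    "parent_education",
    "academic_problems",
    "head_trauma_baseline",
    "head_trauma_on_study",
]

def all_subsets(names):
    """Yield every non-empty combination of covariate names."""
    for size in range(1, len(names) + 1):
        yield from combinations(names, size)

candidate_models = list(all_subsets(COVARIATES))
print(len(candidate_models))  # 2**9 - 1 candidate models
```

With nine covariates the search covers 511 candidate models per outcome, which is why all-subsets methods are practical only for modest numbers of predictors.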


    Results
 
Cohort With Hemophilia
Covariates. Baseline VIQ was the only covariate significantly related to changes in VIQ over time. Baseline PIQ and number of test-specific retests were the only two covariates significantly related to changes in PIQ. The variables of baseline age, parental education, history of academic problems, and possible head trauma before or during study were not significantly related to longitudinal changes in test performance. The issue of head trauma in patients with hemophilia is important because of the potential for adverse intellectual outcomes, but our data do not confirm such outcomes as related to parental reports of possible head trauma. There were no significant interactions between any of the covariates in the analysis; therefore, the covariate effects may be interpreted directly. We tested the interaction between baseline IQ and practice effect to determine whether repeated testing for individuals with different baseline IQ scores produced different practice effects. The model was not significant, indicating that the effect of repeated testing was relatively equivalent across the range of baseline IQ scores in this study (Table I).

VIQ and PIQ. The unadjusted means, standard deviations, and standard errors of measurement of the VIQ and PIQ scores at baseline and each annual follow-up are shown in Table II. VIQ decreased slightly, and PIQ increased steadily during the 4 years of study. The longitudinal changes in VIQ and PIQ from one time of measurement to another are presented in Table III; p values are shown for the paired t tests, and Pearson correlations (r) are provided to show the stability of the measures from year to year. The scores were highly correlated during the four years of study. Both VIQ and PIQ were stable (r = .78-.91 and .72-.88, respectively), and the stability of both scores decreased with increasing time since baseline (r = .82-.78 and .83-.72, respectively). The magnitudes of change in VIQ and PIQ at the end of the second year (-1.94 and 4.8 points, respectively) were less than or comparable to the average standard errors of measurement reported for the WISC-R after a 1-month retest interval (3.6 and 4.66 points, respectively; Wechsler, 1974, p. 30) and for the WAIS-R after an interval of 1-7 weeks (2.74 and 4.14 points, respectively; Wechsler, 1981, p. 33).


Table II. Cohort With Hemophilia: Wechsler IQ Scores Obtained at Baseline and Each Annual Follow-Up
 

Table III. Cohort With Hemophilia: Changes in Wechsler IQ Scores Over Four Years of Study
 

As shown in Table III, the only significant difference in mean VIQ was between the scores obtained at baseline and the fourth annual follow-up (-2.63 points, p = .034). PIQ increased significantly over baseline beginning with the second annual follow-up (4.8 points, p = .0001), but the changes were not significant between each successive annual evaluation. Instead, the increases in PIQ were significant only between the scores obtained at annual 1 versus 2, annual 1 versus 3, annual 1 versus 4, and annual 2 versus 4. This pattern of results suggests a cumulative effect of experience such that three exposures over 2 years were required to produce a significant improvement in PIQ. The finding of a significant difference in PIQ between annual 1 versus 2 but not between annual 2 versus 3 or annual 3 versus 4 suggests a declining benefit from repeated experience with the measures. Although a square-root model provided some theoretical appeal in this context, it did not provide significantly different results from the linear model we used; thus, the simpler (linear) model was chosen. This conclusion was based on testing the difference in the significance of the models as well as examining residual plots. If, however, one were to expand the analysis to a period longer than 4 years, we recommend that a nonlinear model be considered to capture this effect.
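The theoretical appeal of the square-root form can be seen by comparing the gain it implies for each additional retest with the constant gain implied by a linear term. The sketch below is illustrative only: the slope value is hypothetical rather than an estimate from the study, and the function name is an assumption.

```python
import math

def cumulative_gain(rep, slope, transform="linear"):
    """Cumulative retest gain after `rep` retests under a given transform."""
    if transform == "linear":
        return slope * rep
    if transform == "sqrt":
        return slope * math.sqrt(rep)
    raise ValueError(f"unknown transform: {transform}")

SLOPE = 3.0  # hypothetical coefficient, NOT an estimate from the study

# Gain contributed by each successive retest (retests 1 through 4).
linear_steps = [
    cumulative_gain(k, SLOPE) - cumulative_gain(k - 1, SLOPE)
    for k in range(1, 5)
]
sqrt_steps = [
    cumulative_gain(k, SLOPE, "sqrt") - cumulative_gain(k - 1, SLOPE, "sqrt")
    for k in range(1, 5)
]

for k, (lin, sq) in enumerate(zip(linear_steps, sqrt_steps), start=1):
    print(f"retest {k}: linear +{lin:.2f}, square-root +{sq:.2f}")
```

The linear term adds the same amount at every retest, whereas the square-root term adds progressively less, which is the shape of a declining benefit from repeated experience.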

Subtest Analysis. Table IV shows the unadjusted means and standard deviations for the Verbal and Performance Scale subtest scores (WISC-R and WAIS-R combined). The Vocabulary score declined significantly over time (average change = -0.23 points), but the changes in the remaining Verbal Scale subtests were not significant. All of the scores on the Performance Scale subtests increased significantly (average change = 0.21-0.44 points).


Table IV. Cohort With Hemophilia: Unadjusted Wechsler Subtest Scores Obtained at Baseline and Each Annual Follow-Up
 

Transition from WISC-R to WAIS-R. Fourteen (n = 14) adolescents with hemophilia reached their seventeenth birthday during the study. Their mean VIQ scores increased from 103.3 on the last WISC-R to 107.5 on the first WAIS-R (SD = 18.6 vs. 19.0, SEM = 5.2 vs. 5.3, respectively). Their mean PIQ scores declined from 115.8 on the last WISC-R to 109.8 on the first WAIS-R (SD = 17.0 vs. 16.2, SEM = 4.7 vs. 4.5, respectively). The changes in VIQ (4.2 points) and PIQ (-6.0 points) were not significant. The number of times a participant completed the WISC-R was not related to his performance on the first WAIS-R.

Final Models: WISC-R and WAIS-R Combined. The final models, using combined WISC-R and WAIS-R IQ scores at baseline and each annual follow-up, were: VIQx = 23.63 + 0.76(VIQ0), and PIQx = 13.86 + 0.87(PIQ0) + 2.37(rep) (p < .00005 for the "rep" variable), where IQx is the IQ at annual follow-up x, IQ0 is the IQ at baseline, and rep is the number of repeated administrations. Although WISC-R and WAIS-R scores were combined in these models, the test versions were treated separately with respect to repeated administrations to determine the effect of test-specific retesting. Thus, for a participant who aged into the WAIS-R, rep = 0 for the first administration of the WAIS-R. The models indicate that VIQ at retest was not significantly related to the number of repeated assessments, but PIQ increased by 2.37 points, on average, with each annual evaluation. The R² for both VIQ and PIQ was .64, indicating that 64% of the variance in retest VIQ was accounted for by baseline VIQ, and 64% of the variance in retest PIQ was explained by baseline PIQ and the number of test-specific retests.
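The combined models can be applied directly to obtain an expected retest score. The sketch below uses the coefficients reported above; the function names and the example baseline score of 100 are illustrative assumptions, and the output is a group-average expectation rather than a prediction for any individual.

```python
def predict_viq(viq0):
    """Expected retest VIQ from baseline VIQ (combined model)."""
    return 23.63 + 0.76 * viq0

def predict_piq(piq0, rep):
    """Expected retest PIQ from baseline PIQ and test-specific retests.

    `rep` counts repeated administrations of the same test version, so it
    is 0 for the first WAIS-R taken by a participant who aged out of the
    WISC-R.
    """
    return 13.86 + 0.87 * piq0 + 2.37 * rep

# Hypothetical participant with baseline VIQ and PIQ of 100.
print(f"VIQ at any follow-up: {predict_viq(100):.2f}")
for rep in range(1, 5):
    print(f"PIQ after {rep} retest(s): {predict_piq(100, rep):.2f}")
```

For a baseline PIQ of 100, each additional retest with the same test version raises the expected PIQ by 2.37 points, while the expected VIQ does not depend on the number of retests.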

Final Models: WISC-R and WAIS-R Separately. Using the WISC-R only (283 observations), the final models were VIQx = 23.19 + 0.75(VIQ0), and PIQx = 14.65 + 0.86(PIQ0) + 2.31(rep) (p < .00005 for the rep variable). Using the WAIS-R only (37 observations), the final models were VIQx = 15.12 + 0.85(VIQ0), and PIQx = 6.69 + 0.93(PIQ0) + 3.25(rep) (p = .047 for the rep variable). Thus, there were no significant changes in VIQ following repeated annual assessments with either test, but PIQ increased, on average, by 2.31 points for every annual WISC-R and 3.25 points for every annual WAIS-R.

Comparison With Sibling Cohort
VIQ and PIQ. The siblings demonstrated a nonsignificant decrease of 0.07 points in VIQ after the 2-year retest interval (M = 104.37 ± 15.61 at baseline vs. 104.30 ± 15.89 at retest, p = .49). This change was not significantly different from the decrease of 2.21 points in VIQ for the group with hemophilia at their first annual retest (p = .48). There was a significant increase of 3.8 points in the siblings' PIQ at the 2-year follow-up (M = 105.63 ± 15.01 at baseline vs. 109.43 ± 16.28 at retest, p = .023), but this increase was not significantly different from the increase of 1.66 points observed in the group with hemophilia at their first annual retest (p = .46). While there is insufficient statistical power to verify whether children and adolescents with hemophilia and their male siblings without hemophilia have equivalent retest effects, these results provide reassurance of consistent trends for VIQ and PIQ with repeated testing.

Subtest Analysis. The siblings demonstrated no significant differences in performance on the Verbal Scale subtests following the 2-year retest interval. The only significant change among the Performance Scale subtests was an increase of 1.24 points on the Object Assembly task (M = 10.33 ± 3.07 at baseline vs. 11.57 ± 3.82 at retest, p = .034).

Final Models: WISC-R and WAIS-R Combined. The final models for the sibling cohort, using combined WISC-R and WAIS-R scores at baseline and the 2-year follow-up, were VIQ = 6.93 + 0.93(VIQ0) and PIQ = 17.94 + 0.87(PIQ0). The R² for VIQ was 0.84 and for PIQ was 0.64. These results indicate that, for the siblings, baseline VIQ accounted for 84% of the variance in retest VIQ, and baseline PIQ accounted for 64% of the variance in retest PIQ.


    Discussion
 
The findings of this study are consistent with previous reports that Wechsler VIQ and PIQ scores are highly reliable over time and that the reliabilities of both VIQ and PIQ are inversely related to the length of the retest interval. The results support hypothesis 1 that VIQ remains relatively unchanged while PIQ increases with successive annual assessments. Although VIQ decreased in the cohort with hemophilia, the decline was not statistically significant until the fourth annual follow-up, that is, after participants had completed the Verbal Scale on five occasions over a 4-year period. PIQ increased with each annual evaluation, but the improvements did not reach statistical significance until the second annual follow-up, that is, after participants had completed the Performance Scale on three occasions over a 2-year period. PIQ continued to increase during the next 2 years but less than in the first 2 years. The gains were not significant between the second and third years or between the third and fourth years. These findings suggest that there is a declining benefit from repeated experience with the test materials. The sibling group demonstrated a nonsignificant decrease in VIQ and a significant increase in PIQ after a 2-year retest interval. The magnitudes of change were not significantly different from the changes in the group with hemophilia after a 1-year interval, providing reassurance of consistent trends in VIQ and PIQ with repeated testing.

Hypothesis 2 is partially supported by the data in that longitudinal changes in VIQ and PIQ were related to participants' IQ scores at baseline. The data do not support the hypothesis that retest performance is related to age at baseline for school-age children and adolescents. The discrepancy is likely due to differences in the populations studied (older adults in the Mitrushina & Satz, 1991, study) and to the small number of studies of retest effects on the WISC-R and lack of data about adolescents' performance on repeated administrations of the WAIS-R (Schuerger & Witt, 1989). Differences in the age ranges of norms for children versus adults may also affect interpretation of test results obtained from cross-sectional and longitudinal studies. For example, 3-month intervals are used in the calculation of IQ scores on the WISC-R versus 2- to 10-year intervals for IQs on the WAIS-R.

The results of the subtest analyses support hypothesis 3. For the cohort with hemophilia, significant improvements occurred in all of the Performance Scale subtest scores, and the siblings showed significant gains on the Object Assembly task. We did not hypothesize a change in VIQ, but the data show that all but two of the Verbal Scale subtests (Information and Digit Span) declined over time. The decline in the Vocabulary subtest, though small, was significant between baseline and the end of the fourth annual evaluation, consistent with the decline in VIQ over the same period.

The data partially support hypothesis 4 concerning the expected effect of transition from one age-appropriate Wechsler scale to the other. For the subset who aged into the WAIS-R, we expected to find relatively equivalent VIQ and significantly lower PIQ scores on the first WAIS-R under the assumption that changes in the stimuli of the Performance Scale tasks would significantly reduce the benefit associated with familiarity with the items of the WISC-R. Instead, we found an increase in VIQ and a decrease in PIQ, but neither change was significant. The number of times the WISC-R was completed before introducing the WAIS-R was unrelated to performance on the first WAIS-R. Although we had limited statistical power to examine transition effects in the subset that aged into the WAIS-R, our findings that VIQ increased and PIQ decreased at the time of transition are consistent with data available from the four previous studies of test transition effects on Wechsler scales (Mulhern et al., 1992; Neyens & Aldenkamp, 1997; Rasbury et al., 1977; Usner & Fitzgerald, 1999).

The findings of this study provide support for the use of the Wechsler scales for annual evaluations to monitor cognitive development. More specifically, consecutive annual evaluations with the Verbal Scale for at least 4 years and with the Performance Scale for at least 2 years are essentially free of significant practice effects. By the end of the third annual evaluation, the increase in PIQ exceeded the average standard errors of measurement for the WISC-R and WAIS-R, and by the end of the fourth year, PIQ had increased by approximately 7.5 points. These results, combined with the lack of a transition effect between the WISC-R and WAIS-R, indicate that the annual improvements in PIQ after the first 2 years were influenced more by cumulative experience with the tasks than by random error. From these data, it seems likely that, for PIQ, the greater benefit of retesting lies not in the recollection of specific items on the Performance Scale but in the potential to learn strategies for success that are later generalized to new stimuli.

Two possibilities may account for the decline in VIQ observed in this study. Previous studies revealed that children and adolescents with hemophilia perform at age level on measures of general intellectual ability but below age expectations on measures of academic achievement and other cognitive abilities such as attention and visual processing (Loveland et al., 2000; Sirois & Hill, 1993; Whitt et al., 1993). The reasons for the finding of lower achievement are not known but may be related to school absenteeism. In a study of elementary school children (ages 6-12) with hemophilia, those with higher numbers of bleeding episodes missed more days of school and earned lower scores on measures of mathematics and overall achievement than those with fewer bleeding episodes (Shapiro et al., 2000). Performance on language-based tasks depends on prior learning and is influenced by formal education. If children with hemophilia have lower school achievement, they may not keep up with their peers in areas such as new vocabulary; thus, their scores on standardized measures of verbal ability might decline over time. Alternatively, language-based tasks may be less interesting to children and adolescents than the more perceptually oriented, hands-on problems of the Performance Scale. Children may perform at their best when verbal tasks are novel but become less motivated with succeeding presentations of the same items.

Although the group differences were not statistically significant, the siblings gained more in PIQ and lost less in VIQ on their first retest despite a longer retest interval than the group with hemophilia. Subtle difficulties in visual processing (Whitt et al., 1993) and acquisition of verbal skills (Loveland et al., 2000; Sirois & Hill, 1993) in the group with hemophilia may have contributed to the differences in PIQ and VIQ, respectively.

This study is limited by the single-sex sample, by the size of the sibling comparison group, by the small subset who aged into the WAIS-R, and by the range of IQs in the sample. These factors all reduce the generalizability of our findings. The fact that the data were obtained with the WISC-R presents a potential limitation, given that its successor, the WISC-III (Wechsler, 1991), is currently in standard use. A study of the stability of the WISC-III over a 3-year interval (Canivez & Watkins, 1998), however, indicated that the test-retest reliabilities were as high as or higher than those of the WISC-R and that there were no significant changes in IQ scores. These results are consistent with Vance, Hankins, and Brown's (1987) findings with respect to changes in the WISC-R after 3-year intervals and with the results obtained in this study. Thus, the pattern of findings in our study likely will be replicated in studies of retest effects on the WISC-III.

Several questions relevant to longitudinal research in pediatrics remain for future study. What are the effects of repeated testing with measures of infant developmental status? Do preschool-age children show the same patterns of change in test performance as older children and adolescents? How should longitudinal analyses account for age-appropriate changes in test instruments, especially when different psychological constructs are measured (for example, when toddlers transition from measures of developmental status to measures of general cognitive ability)? Ideally, investigations into these questions will be conducted with children without medical or educational difficulties. Based on the findings of this study, analyses of longitudinal data should include baseline test scores, number of test-specific retests, and any environmental or disease-specific variables that may influence performance on measures of psychological abilities. Such variables might be incorporated into the research design either as exclusionary criteria or by statistical control. In these ways, future researchers may enhance the interpretation of longitudinal test results.
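One way to prepare the covariates recommended above is sketched below: for each evaluation in a participant's history, the code attaches the baseline scores and counts how many times the same instrument had already been administered, so that the count restarts when the participant moves from one test to another. The record layout, field names, and example scores are hypothetical and serve only to show the bookkeeping; this is not the procedure used in the present study.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Assessment:
    year: int    # study year of the evaluation
    test: str    # instrument name, e.g. "WISC-R" or "WAIS-R"
    viq: float   # Verbal IQ
    piq: float   # Performance IQ


def add_covariates(history):
    """Return one row per assessment with baseline scores and the number
    of prior administrations of the same instrument."""
    if not history:
        return []
    ordered = sorted(history, key=lambda a: a.year)
    baseline = ordered[0]
    seen = Counter()  # administrations so far, per instrument
    rows = []
    for a in ordered:
        rows.append({
            "year": a.year,
            "test": a.test,
            "viq": a.viq,
            "piq": a.piq,
            "baseline_viq": baseline.viq,
            "baseline_piq": baseline.piq,
            "prior_same_test": seen[a.test],
        })
        seen[a.test] += 1
    return rows


# Illustrative scores only; these are not study data.
example = [
    Assessment(1, "WISC-R", 100.0, 100.0),
    Assessment(2, "WISC-R", 99.0, 103.0),
    Assessment(3, "WISC-R", 98.0, 105.0),
    Assessment(4, "WAIS-R", 101.0, 104.0),
]
for row in add_covariates(example):
    print(row["year"], row["test"], row["prior_same_test"])
```

Environmental or disease-specific variables could be added to each row in the same way before the rows are passed to whatever longitudinal model is chosen.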


    Acknowledgments
 
We are indebted to the children, adolescents, and parents who volunteered to participate in this study and to the members of the Hemophilia Treatment Centers. The study was supported by the Bureau of Maternal and Child Health and Resources Development (MCJ-060570), the National Institute of Child Health and Human Development (NO1-HD-4-3200), the Centers for Disease Control and Prevention, the Laboratory of Genomic Diversity of the National Cancer Institute, and the National Institute of Mental Health. Additional support was provided by grants from the National Center for Research Resources of the National Institutes of Health to the New York Hospital-Cornell Medical Center Clinical Research Center (MO1-RR06020), the Mount Sinai General Clinical Research Center, New York (MO1-RR00071), the University of Iowa Clinical Research Center (MO1-RR00059), and the University of Texas Health Science Center, Houston (MO1-RR02558). The following individuals are the Center Directors, Study Coordinators, or Committee Chairs of the study: Childrens Hospital Los Angeles (E. Gomperts, MD, W.-Y. Wong, MD, F. Kaufman, MD, M. Nelson, MD, S. Pearson, RN); The New York Hospital-Cornell Medical Center (M. Hilgartner, MD, S. Cunningham-Rundles, PhD, I. Goldberg, RN); University of Texas Medical School, Houston (W. K. Hoots, MD, K. Loveland, PhD, M. Cantini, RN); National Institutes of Health, National Institute of Child Health and Human Development (A. Willoughby, MD, MPH, Robert Nugent, PhD); New England Research Institutes, Inc. (S. McKinlay, PhD); Rho, Inc. (S. Donfield, PhD); Baylor College of Medicine (C. Contant, Jr., PhD); University of Iowa Hospitals and Clinics (C. T. Kisker, MD, J. Stehbens, PhD, S. O'Conner, J. McKillip, RN); Tulane University Health Sciences Center (P. Sirois, PhD); Children's Hospital of Oklahoma (C. Sexauer, MD, H. Huszti, PhD, F. Kiplinger, S. Hawk, P.A.-C.); Mount Sinai Medical Center (S. Arkin, MD, A. Forster, RN); University of Nebraska Medical Center (S. Swindells, MD, S. Richard); University of Texas Health Science Center, San Antonio (J. Mangos, MD, A. Scott, PhD, R. Davis, RN); Children's Hospital of Michigan (J. Lusher, MD, I. Warrier, MD, K. Baird-Cox, RN, MSN); Milton S. Hershey Medical Center (M. E. Eyster, MD, D. Ungar, MD, S. Neagley, RN, MA); Indiana Hemophilia and Thrombosis Center (A. Shapiro, MD, J. Morris, PNP); University of California-San Diego Medical Center (G. Davignon, MD, P. Mollen, RN); Kansas City School of Medicine, Children's Mercy Hospital (B. Wicklund, MD, A. Mehrhof, RN, MSN).

Received April 3, 2000; revision received February 16, 2001; accepted July 5, 2001


    References
 
Bale, J. F. Jr., Contant, C. F., Garg, B., Tilton, A., Kaufman, D. M., & Wasiewski, W. (1993). Neurologic history and examination results and their relationship to human immunodeficiency virus type 1 serostatus in hemophilic subjects: Results from the Hemophilia Growth and Development Study. Pediatrics, 91, 736-741.

Canivez, G. L., & Watkins, M. W. (1998). Long-term stability of the Wechsler Intelligence Scale for Children-Third Edition. Psychological Assessment, 10, 285-291.

Hilgartner, M. W., Donfield, S. M., Willoughby, A., Contant, C. F. Jr., Evatt, B. L., Gomperts, E. D., Hoots, W. K., Jason, J., Loveland, K. A., McKinlay, S. M., & Stehbens, J. A. (1993). Hemophilia Growth and Development Study: Design, methods, and entry data. American Journal of Pediatric Hematology/Oncology, 15, 208-218.

Horton, A. M. Jr. (1992). Neuropsychological practice effects x age: A brief note. Perceptual and Motor Skills, 75, 257-258.

Kaufman, A. S. (1979). Intelligent testing with the WISC-R. New York: Wiley-Interscience.

Loveland, K. A., Stehbens, J., Contant, C., Bordeaux, J. D., Sirois, P., Bell, T. S., Hill, S., Scott, A., Bowman, M., Schiller, M., Watkins, J., Olson, R., Moylan, P., Cool, V., & Belden, B. (1994). Hemophilia Growth and Development Study: Baseline neurodevelopmental findings. Journal of Pediatric Psychology, 19, 223-239.

Loveland, K. A., Stehbens, J. A., Mahoney, E. M., Sirois, P. A., Nichols, S., Bordeaux, J. D., Watkins, J. M., Amodei, N., Hill, S. D., Donfield, S., & the Hemophilia Growth and Development Study (2000). Declining immune function in children and adolescents with hemophilia and HIV infection: Effects on neuropsychological performance. Journal of Pediatric Psychology, 25, 309-322.

Mitchell, W. G., Lynn, H., Bale, J. F. Jr., Maeder, M. A., Donfield, S. M., Garg, B., Tilton, A. H., Willis, J. K., & Bohan, T. P. (1997). Longitudinal neurological follow-up of a group of HIV-seropositive and HIV-seronegative hemophiliacs: Results from the Hemophilia Growth and Development Study. Pediatrics, 100, 817-824.

Mitchell, W. G., Nelson, M. D., Contant, C. F., Bale, J. F. Jr., Wilson, D. A., Bohan, T. P., & Fenstermacher, M. J. (1993). Effects of human immunodeficiency virus and immune status on magnetic resonance imaging of the brain in hemophilic subjects: Results from the Hemophilia Growth and Development Study. Pediatrics, 91, 742-746.

Mitrushina, M., & Satz, P. (1991). Effect of repeated administration of a neuropsychological battery in the elderly. Journal of Clinical Psychology, 47, 790-801.

Moffitt, T. E., Caspi, A., Harkness, A. R., & Silva, P. A. (1993). The natural history of change in intellectual performance: Who changes? How much? Is it meaningful? Journal of Child Psychology and Psychiatry, 34, 455-506.

Mulhern, R. K., Ochs, J., & Fairclough, D. (1992). Deterioration of intellect among children surviving leukemia: IQ test changes modify estimates of treatment toxicity. Journal of Consulting and Clinical Psychology, 60, 477-480.

Nelson, M. D. Jr., Maeder, M. A., Usner, D., Mitchell, W. G., Fenstermacher, M. J., Wilson, D. A., Gomperts, E. D., & the Hemophilia Growth and Development Study (1999). Prevalence and incidence of intracranial haemorrhage in a population of children with haemophilia. Haemophilia, 5, 306-312.

Neyens, L. G. J., & Aldenkamp, A. P. (1997). Stability of cognitive measures in children of average ability. Child Neuropsychology, 3, 161-170.

Rasbury, W., McCoy, J. G., & Perry, N. W. Jr. (1977). Relations of scores on WPPSI and WISC-R at a one-year interval. Perceptual and Motor Skills, 44, 695-698.

Schuerger, J. M., & Witt, A. C. (1989). The temporal stability of individually tested intelligence. Journal of Clinical Psychology, 45, 294-302.

Shapiro, A. D., Donfield, S. M., Lynn, H. S., Bray, G. L., Cool, V. A., Hunsberger, S. L., Stehbens, J. A., & Gomperts, E. D. (2000, December). Academic achievement in children with severe hemophilia A. Poster session presented at the annual meeting of the American Society of Hematology, San Francisco, CA.

Sirois, P. A., & Hill, S. D. (1993). Developmental change associated with human immunodeficiency virus infection in school-age children with hemophilia. Developmental Neuropsychology, 9, 177-197.

Sirois, P. A., Usner, D. W., Hill, S. D., Mitchell, W. G., Bale, J. F. Jr., Loveland, K. A., Stehbens, J. A., Donfield, S. M., Maeder, M. A., Amodei, N., Contant, C. F. Jr., Nelson, M. D. Jr., Willis, J. K., & the Hemophilia Growth and Development Study (1998). Hemophilia Growth and Development Study: Relationships between neuropsychological, neurological, and MRI findings at baseline. Journal of Pediatric Psychology, 23, 45-56.

Stehbens, J. A., Loveland, K. A., Bordeaux, J. D., Contant, C., Schiller, M., Scott, A., Moylan, P. M., & Maeder, M. (1997). A collaborative model for research: Neurodevelopmental effects of HIV-1 in children and adolescents with hemophilia as an example. Children's Health Care, 26, 115-135.

Terman, L. M., & Merrill, M. A. (1937). Measuring intelligence. Boston: Houghton Mifflin.

Terman, L. M., & Merrill, M. A. (1960). Stanford-Binet Intelligence Scale: Manual for the third revision, Form L-M. Boston: Houghton Mifflin.

Tuma, J. M., & Appelbaum, A. S. (1980). Reliability and practice effects of WISC-R IQ estimates in a normal population. Educational and Psychological Measurement, 40, 671-678.

Usner, D., & Fitzgerald, G. (1999). Analytical implications of changing neuropsychological test versions during a longitudinal study due to aging of a pediatric cohort [Letter to the editor]. Controlled Clinical Trials, 20, 476-478.

Vance, B., Hankins, N., & Brown, W. (1987). A longitudinal study of the Wechsler Intelligence Scale for Children-Revised over a six-year period. Psychology in the Schools, 24, 229-233.

Wechsler, D. (1949). Manual for the Wechsler Intelligence Scale for Children. San Antonio: Psychological Corporation.

Wechsler, D. (1955). Manual for the Wechsler Adult Intelligence Scale. San Antonio: Psychological Corporation.

Wechsler, D. (1974). Manual for the Wechsler Intelligence Scale for Children-Revised. San Antonio: Psychological Corporation.

Wechsler, D. (1981). Wechsler Adult Intelligence Scale-Revised. San Antonio: Psychological Corporation.

Wechsler, D. (1991). Manual for the Wechsler Intelligence Scale for Children-Third Edition. San Antonio: Psychological Corporation.

Whitt, J. K., Hooper, S. R., Tennison, M. B., Robertson, W. T., Gold, S. H., Burchinal, M., Wells, R., Campbell, M., Whaley, R. A., Combest, J., & Hall, C. D. (1993). Neuropsychologic functioning of human immunodeficiency virus-infected children with hemophilia. Journal of Pediatrics, 122, 52-59.

