Journal of Pediatric Psychology, Vol. 25, No. 3, 2000, pp. 179-183
© 2000 Society of Pediatric Psychology
Brief Report: Cautions Against Using the Stanford-Binet-IV to Classify High-Risk Preschoolers
1 The Citadel, 2 Early Intervention Research Institute, Utah State University, 3 Temple University
All correspondence should be sent to Conway F. Saylor, Department of Psychology, The Citadel, 171 Moultrie Street, Charleston, South Carolina 29409. E-mail: conway.saylor{at}citadel.edu.
| Abstract |
|---|
Objective: To explore concurrent and predictive validity of the Stanford-Binet: Fourth Edition (SB-IV) by comparing scores on the SB-IV with scores from the Battelle Developmental Inventory (BDI) and later achievement scores in preschoolers at risk due to very low birthweight, and/or intraventricular hemorrhage (IVH) and other medical complications.
Methods: At ages 3, 4, and 5, 92 preschoolers were tested with the SB-IV and BDI as part of an 8-year early intervention follow-up.
Results: The SB-IV and BDI concurrent correlations at ages 3, 4, and 5 were statistically significant (r = .73-.78, p < .0001), as were predictive correlations (r = .58-.85, p < .0001). However, the BDI and SB-IV failed to place the children in the same categories for intervention services. With the BDI as the comparison measure, the SB-IV failed to detect 87% of the children who were "delayed" (by BDI) at age 3 and 50% of the "delayed" children at age 5.
Conclusions: Caution is recommended when using the SB-IV to assess high-risk preschoolers for early intervention eligibility.
Key words: Stanford-Binet:IV; NICU; preschool assessment; intraventricular hemorrhage; early intervention eligibility.
| Introduction |
|---|
Pediatric psychologists are increasingly called on to participate in interdisciplinary teams that follow high-risk children after their discharge from a neonatal intensive care unit (NICU). The majority of the infants discharged from tertiary care center NICUs were born prematurely and are at risk for developmental delays due to low birthweight and the medical complications that accompany it. Whether initially collected for clinical or research purposes, the psychologist's assessment data may be utilized by families, referring physicians, and early intervention service providers to determine the child's service eligibility. As experts in psychometrics, psychologists must take the lead in critically evaluating the instruments used with this high-risk clinical population.
The Stanford-Binet Intelligence Scale, 4th ed. (SB-IV; Thorndike, Hagen, & Sattler, 1986) has been widely accepted and included among tests endorsed for assessing cognitive ability in preschool children, in spite of a shortage of studies examining its validity with younger populations. Sattler (1990) contends that the SB-IV is one of the best intelligence tests available because it has been well normed and has excellent reliability and validity in the general population. Concurrent validity studies suggest the SB-IV is likely to yield composite scores similar to those provided by other acceptable measures of cognitive functioning, like the WISC-R, WAIS-R, and Form L-M (Hollinger & Baldwin, 1990; Sattler, 1990). However, Flanagan and Alfonso (1995) urge caution when selecting the SB-IV as a measure of intelligence for preschoolers, especially those who may be developmentally delayed, as the subtests on the SB-IV recommended for children age two or three have inadequate floors. In fact, only at the 5-year age level do all the recommended subtests have acceptable floors (Flanagan & Alfonso, 1995).
This study examined the validity of the SB-IV for classifying NICU graduates assessed during their preschool years for developmental delays. Examining the relationship between the SB-IV and more in-depth developmental evaluations in a clinical sample followed longitudinally enabled us to examine the SB-IV's validity both concurrently (agreement and correlation with other measures given at the same time) and predictively (correlation with intelligence and achievement measures 3-5 years later).
| Method |
|---|
Participants
The participants were 92 preschool children who experienced intraventricular hemorrhage (IVH), and/or birthweight less than 1000 grams, along with other medical complications, secondary to premature birth. All children had been patients in the NICU and were being followed longitudinally as part of a larger study of early intervention effectiveness (for details on studies and subject characteristics see Saylor, Casto, & Huntington, 1996).
Attrition due to both family and project variables led to different numbers of children being sampled each year, ages 3-8. Of the 92 children testable on the SB-IV at age 3 (this excluded 17 from the original sample who were untestable due to motor or sensory impairment), 82 returned for testing at year 5. In year 4, only 72 were tested due to staffing changes and diminished resources, which led to 10 children being temporarily lost to follow-up. In year 7, 75 returned for testing on the Woodcock Johnson Test of Achievement-Revised (WJ-R). As funding ran out halfway through the eighth year of the project, only 60 children had reached their eighth birthdays in time to be tested. Analyses reported elsewhere (Boyce, Saylor, & Alexandrova, 1996) showed that the children who stayed throughout the project were comparable medically and demographically to the original sample.
Measures and Procedures
The 3-, 4-, and 5-year follow-up assessments of early intervention participants included the Battelle Developmental Inventory (BDI; Newborg, Stock, Wnek, Guidubaldi, & Svinicki, 1984) and the SB-IV (Thorndike, Hagen, & Sattler, 1986), administered within an hour of each other by master's level clinical child/pediatric psychology assistants who were trained to 95% reliability on test administration. At ages 6, 7, and 8, age-appropriate subscales of the WJ-R (Woodcock & Johnson, 1989) measured skills and acquired knowledge. Subjects returning at age 8 also were administered the Wechsler Intelligence Scale for Children-Revised (Wechsler, 1974). At the time of testing, this study was not anticipated, so the order of tests administered was not systematically recorded or counterbalanced.
| Results |
|---|
Correlations Among Measures
Table I presents the Pearson product-moment correlation coefficients between the SB-IV at ages 3, 4, and 5 and concurrent and future developmental and achievement scores. Concurrent analyses of the SB-IV Composite and the Battelle Developmental Quotient revealed statistically significant correlations (r = .73-.78, p < .0001), suggesting the two measures yield similar findings for high-risk preschoolers ages 3 to 5. In addition, the predictive validity analyses found significant correlations between SB-IV (at 3, 4, and 5) and later achievement and intelligence scores (at ages 7 and 8) ranging from .58 to .85 (p < .0001). Correlations were also examined between the BDI and other measures of intelligence and achievement, concurrently and predictively. All correlations were significant and comparable to those of the SB-IV, as summarized in Table I.
Co-positivity and Co-negativity
With the BDI serving as the reference standard, the high-risk children were classified as "delayed" or "not delayed" on both SB-IV and BDI. In the first set of analyses, "delay" was operationalized as one standard deviation (SD) below the mean, while in the second set, we used the more rigorous cut-off of two SDs. These cut-offs were chosen because, while some states use a 1.5 SD cut-off for early intervention services, others require 1 SD in two or more domains or 2 SDs in one domain. The SB-IV and BDI demonstrated good co-negativity at both one and two SDs (1 SD = 15 pt., 2 SD = 30 pt.) below the mean. That is, 100% of the children deemed "not delayed" by the BDI (score ≥ 70) were found to be not delayed on the SB-IV. When a more moderate cut-off (≥ 85) was used, more than 85% of the "non-delayed" children were identified as not delayed. In contrast, the SB-IV demonstrated extremely poor co-positivity with the BDI as the reference standard. At age three, the SB-IV correctly identified only 13% of the children found by the BDI to be "delayed" (DQ < 70) or 44% of those found to be mildly delayed (DQ < 85). There was a modest improvement in the co-positivity from age three to age five, but even at age five, the SB-IV was only able to identify 50% of high-risk children with DQs less than 70 on the BDI (see Table II).
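The co-positivity and co-negativity figures reported here are simple proportions, and the arithmetic can be sketched in a few lines of Python. The sketch below is illustrative only and is not the analysis code used in the study. It assumes a standard-score scale with a mean of 100 (the text states only that 1 SD = 15 points), and the score lists are hypothetical values constructed so that 3 of 23 reference-delayed children are also flagged by the second test, matching the count given in the Discussion. The number of non-delayed children (10) is invented for the example.

```python
MEAN, SD = 100, 15  # assumed standard-score scale; the source states only 1 SD = 15 pt.


def cutoff(sds_below_mean):
    """Score below which a child is classified as delayed."""
    return MEAN - sds_below_mean * SD


def co_agreement(reference_scores, test_scores, cut):
    """Return (co-positivity, co-negativity) of the test against the reference.

    Co-positivity: share of reference-delayed children the test also flags.
    Co-negativity: share of reference-not-delayed children the test also clears.
    """
    ref_pos = ref_neg = both_pos = both_neg = 0
    for ref, test in zip(reference_scores, test_scores):
        if ref < cut:
            ref_pos += 1
            both_pos += test < cut
        else:
            ref_neg += 1
            both_neg += test >= cut
    co_pos = both_pos / ref_pos if ref_pos else None
    co_neg = both_neg / ref_neg if ref_neg else None
    return co_pos, co_neg


# Hypothetical scores, not study data: 23 children delayed on the reference
# measure (3 of them also below the cut-off on the test), plus 10 not delayed.
bdi = [60] * 23 + [90] * 10
sbiv = [65] * 3 + [80] * 20 + [95] * 10

co_pos, co_neg = co_agreement(bdi, sbiv, cutoff(2))
print(f"co-positivity = {co_pos:.0%}, co-negativity = {co_neg:.0%}")
```

With these made-up scores the 2 SD cut-off is 70, co-positivity is 3/23 (about 13%), and co-negativity is 100%. Passing `cutoff(1)` instead applies the more lenient cut-off of 85.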
| Discussion |
|---|
This study examined the validity of the SB-IV for classifying preschoolers at risk for developmental delays. Concurrently, there were strong positive correlations between the SB-IV and the BDI. However, further investigation revealed they did not classify "high-risk" children in similar intervention eligibility categories. With the BDI as the reference standard, the SB-IV failed to identify any children with DQs less than 55 until five years of age and only agreed that 3 of the 23 "delayed" children were service eligible. This supports findings from Flanagan and Alfonso (1995).
This finding is particularly troubling because data generally support the idea that developmental outcomes are better for children referred early (in the first 3 years) versus later (over age 3) (Casto & Mastropieri, 1986), and federal laws mandate timely identification of children needing services (e.g., Public Law 99-457, 1986).
This study shows how two measures can be highly correlated (suggesting good concurrent validity) and can both correlate with future outcomes (suggesting good predictive validity) but give different clinical dispositions. In this example, using the SB-IV as the basis for early intervention eligibility (compared to the BDI) would have yielded an unacceptably high rate of underreferral of high-risk NICU graduates at preschool ages.
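A toy example may help show why high correlation does not guarantee agreement on classification. In the sketch below, which uses invented scores rather than study data, the second test reads exactly 15 points higher than the reference for every child. That constant offset is purely an illustrative assumption and is not a claim about how the SB-IV and BDI actually relate. The correlation is perfect, yet most children below the cut-off on the reference are above it on the second test.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation for two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


CUTOFF = 70  # 2 SDs below an assumed mean of 100, with 1 SD = 15 points

# Hypothetical reference scores, and a second test that reads 15 points higher.
reference = [50, 55, 60, 65, 70, 75, 80, 85, 90, 95]
second_test = [score + 15 for score in reference]

r = pearson_r(reference, second_test)

# Children delayed on the reference, and how many the second test also flags.
delayed_on_reference = [i for i, s in enumerate(reference) if s < CUTOFF]
also_flagged = [i for i in delayed_on_reference if second_test[i] < CUTOFF]

print(f"r = {r:.2f}")
print(f"flagged by second test: {len(also_flagged)} of {len(delayed_on_reference)}")
```

Here four children fall below 70 on the reference, but only one of them falls below 70 on the second test, even though the two sets of scores rank the children identically.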
The SB-IV's preschool age floor problems should be assessed against other criteria besides the BDI and in other populations besides NICU graduates. The absence of a universal "gold standard" for measuring preschoolers' development and the potential for intervention placement and alternative "criterion" test scores to be influenced by one another make it difficult to arrive at the best test for high-risk preschoolers. However, the bottom line in this multisite sample of preschool children with known perinatal risk factors is that 87% of the 3-year-olds eligible for intervention services based on one test would have been ineligible if SB-IV scores had been required for placement.
This study suggests that the SB-IV, though useful for other clinical purposes, may not be a good test for evaluation of early intervention candidates, compared to other tests. In high-risk populations, the SB-IV may miss a large percentage of low functioning preschoolers who might benefit the most from early intervention programs.
| Acknowledgments |
|---|
This work was supported by funds from the U.S. Department of Education awarded to Utah State University's Early Intervention Research Institute (contract HS90010001). We thank Kim Foster, Sherri Stokes, Judy Moore, Sandy Glover, Kat North, Cora Price, Glen Casto, Karl White, Mark Innocenti, and Diane Behl for their participation and consultation.
Received March 13, 1998; revision received October 19, 1998; revision received May 20, 1999; accepted June 10, 1999
| References |
|---|
Boyce, G., Saylor, C., & Alexandrova, E. (1996, April). Temperament & mother-child interaction in families with premature medically fragile infants. Paper presented at the International Conference on Infant Studies, Providence, RI.
Boyce, G., Smith, T. M., & Immel, N. (1993). Early intervention with medically fragile infants: Investigating the age-at-start question. Journal of Early Education and Development, 4, 290-305.
Casto, G., & Mastropieri, M. A. (1986). The efficacy of early intervention programs: A meta-analysis. Exceptional Children, 52, 417-424.
Flanagan, D. P., & Alfonso, V. C. (1995). A critical review of the technical characteristics of new and recently revised intelligence tests for preschool children. Journal of Psychoeducational Assessment, 13, 66-90.
Hollinger, C. J., & Baldwin, C. (1990). The Stanford-Binet, fourth edition: A small study of concurrent validity. Psychological Reports, 66, 1331-1336.
Newborg, J., Stock, J., Wnek, L., Guidubaldi, J., & Svinicki, J. (1984). Battelle Developmental Inventory: Examiner's manual. Dallas: DLM/Teacher Resources.
Public Law 99-457, Education of the Handicapped Act Amendments of 1986.
Sattler, J. M. (1990). Assessment of children. 3rd ed. San Diego, CA: Jerome Sattler.
Saylor, C. F., Casto, G., & Huntington, L. (1996). Predictors of developmental outcomes for medically fragile early intervention participants. Journal of Pediatric Psychology, 21, 869-887.
Thorndike, R. L., Hagen, E. P., & Sattler, J. M. (1986). The Stanford-Binet Intelligence Scale: Fourth edition. Chicago: Riverside.
Wechsler, D. (1974). Manual for the Wechsler Intelligence Scale for Children-Revised. New York: Psychological Corporation.
Woodcock, R. W., & Johnson, M. B. (1989). Woodcock-Johnson tests of achievement. Chicago: Riverside.