WISC IV Technical Report #2 Psychometric Properties. Overview




Technical Report #2
Psychometric Properties
June 15, 2003

Paul E. Williams, Psy.D.
Lawrence G. Weiss, Ph.D.
Eric L. Rolfhus, Ph.D.

Overview

This technical report is the second in a series intended to introduce the Wechsler Intelligence Scale for Children-Fourth Edition (WISC IV). Technical Report #1 presents the theoretical structure and test blueprint for the WISC IV, as well as subtest changes from the Wechsler Intelligence Scale for Children-Third Edition (WISC III); Technical Report #2 presents the psychometric properties of the WISC IV; Technical Report #3 addresses the instrument's clinical validity.

Neurocognitive models of information processing provide the basis for the new structure of the WISC IV, which replaces the traditional Verbal IQ (VIQ)/Performance IQ (PIQ) dichotomy. The index scores that were supplemental in the WISC III are now primary, and each has been enhanced according to contemporary research. The names of two indices have been updated to reflect more accurately the content measured by subtests introduced in this revision. Detailed information is provided in the WISC IV Technical and Interpretive Manual.

Standardization Sample

The standardization sample was representative of the U.S. population of children ages 6-16. The stratified random sampling plan used the following variables, based on U.S. Bureau of the Census data from the March 2000 census: age, sex, race, parent education level, and geographic region. The standardization sample for the WISC IV included 2,200 children, divided by age into eleven groups of 200 children each. Figure 1 shows the demographic characteristics of the total sample.

Figure 1. Demographic Characteristics of the Standardization Sample Compared to the U.S. Population
[Bar charts not reproduced. Three panels compare the WISC IV sample with U.S. Census (a) figures on a vertical axis running from 0 to 70: Parent Education Level (≤8, 9-11, 12, 13-15, ≥16), Race/Ethnicity (White, African American, Hispanic, Asian, Other), and Geographic Region (Northeast, South, Midwest, West).]
a U.S. population data are from Current Population Survey, March 2000: School Enrollment Supplemental File [CD-ROM] by U.S. Bureau of the Census, 2000, Washington, DC: U.S. Bureau of the Census (Producer/Distributor).

Evidence of Reliability: Internal Consistency

The evidence of internal consistency reliability for the normative sample was obtained using the split-half method for all subtests except the speeded tasks (Coding, Symbol Search, and Cancellation), for which test-retest coefficients were used. Table 1 compares the average internal consistency reliability coefficients across ages for the WISC III and WISC IV subtest and composite scales. The reliability coefficients for the WISC IV composite scales range from .88 (Processing Speed) to .97 (Full Scale). They are identical to or slightly better than those of the corresponding composite scales in the WISC III.

Information obtained from the special and clinical samples supports the generalizability of these results. Evidence of reliability was obtained using the split-half method from a sample of 661 children from 16 special and clinical groups, and Table 1 provides the average internal consistency reliability coefficients of subtests for these groups. The majority of the subtest reliability coefficients across special groups are similar to or higher than the coefficients reported for the normative sample, suggesting that the WISC IV is an equally reliable instrument for assessing children who are developing typically and children with clinical diagnoses.

Table 1. Reliability Coefficients of the Subtests and Composite Scales

                              Average       Average       Average With
                              WISC III      WISC IV       Special Groups
Subtest/Composite             r_xx (a)      r_xx (a)      r_xx
Block Design                  .87           .86           .90
Similarities                  .81           .86           .90
Digit Span                    .85           .87           .87
Picture Concepts              --            .82           .88
Coding                        .79           .85           --
Vocabulary                    .87           .89           .92
Letter-Number Seq.            --            .90           .93
Matrix Reasoning              --            .89           .93
Comprehension                 .77           .81           .87
Symbol Search                 .76           .79           --
Picture Completion            .77           .84           .89
Cancellation                  --            .79           --
Information                   .84           .86           .89
Arithmetic                    .78           .88           .92
Word Reasoning                --            .80           .85
Block Design No Time Bonus    --            .84           .89
Digit Span Forward            --            .83           .82
Digit Span Backward           --            .80           .84
Cancellation Random           --            .70           --
Cancellation Structured       --            .75           --
Verbal Comprehension          .94           .94           --
Perceptual Reasoning          .90           .92           --
Working Memory                .87           .92           --
Processing Speed              .85           .88           --
Full Scale                    .96           .97           --

a Average reliability coefficients were calculated with Fisher's z transformation.

Evidence of Reliability: Test-Retest Stability

The evidence of the WISC IV's test-retest stability for subtest and composite scales was evaluated with information obtained from a sample of 243 children. Participants were administered the WISC IV on two separate occasions, with a mean test-retest interval of 32 days. Table 2 presents the mean subtest scaled scores and composite scores and their standard deviations. The data indicate that WISC IV scores are stable across time. The mean retest scores for all subtests are higher than the mean scores from the first administration, with effect sizes ranging from .08 (Comprehension) to .60 (Picture Completion). In general, the test-retest gains are less pronounced for the verbal subtests than for other subtests. The analysis is also presented for five separate age bands, which show a similar pattern of results.
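The effect sizes quoted for the retest gains can be checked against the means and standard deviations in Table 2. The sketch below assumes that the Standard Difference column in Table 2 uses the same definition given in the note to Table 4 (difference of the two means divided by the square root of the pooled variance) and that both testings have the same number of children, so the pooled variance is the simple average of the two variances. The function name `standard_difference` is illustrative only and does not come from the report.

```python
import math


def standard_difference(mean_1, sd_1, mean_2, sd_2):
    """Difference of two means divided by the square root of the pooled variance.

    Assumes equal group sizes, so the pooled variance is the plain
    average of the two variances.
    """
    pooled_variance = (sd_1 ** 2 + sd_2 ** 2) / 2
    return (mean_2 - mean_1) / math.sqrt(pooled_variance)


# (first-testing mean, first-testing SD, second-testing mean, second-testing SD)
# as reported in Table 2.
table_2_rows = {
    "Block Design": (10.0, 3.0, 11.2, 2.9),
    "Comprehension": (10.1, 2.5, 10.3, 2.4),
    "Picture Completion": (10.3, 2.9, 12.1, 3.1),
    "Full Scale": (101.0, 11.7, 106.6, 12.5),
}

for name, (m1, s1, m2, s2) in table_2_rows.items():
    print(f"{name}: {standard_difference(m1, s1, m2, s2):.2f}")
```

Under these assumptions the four rows round to .41, .08, .60, and .46, which agree with the Standard Difference values reported in Table 2.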

Table 2. Stability Coefficients of the Subtests, Process Scores, and Composite Scales (All Ages)

                                 First Testing    Second Testing            Corrected  Standard
Subtest/Process Score/Composite  Mean     SD      Mean     SD       r12     r (a)      Difference
Block Design                     10.0     3.0     11.2     2.9      .81     .82        .41
Similarities                     10.1     2.6     10.7     2.5      .81     .86        .24
Digit Span                       9.9      2.9     10.4     2.7      .81     .83        .18
Picture Concepts                 10.1     2.7     10.9     2.8      .71     .76        .29
Coding                           10.4     2.7     11.8     3.1      .81     .84        .48
Vocabulary                       10.1     2.3     10.4     2.4      .85     .92        .13
Letter-Number Seq.               10.3     2.5     10.7     2.6      .75     .83        .16
Matrix Reasoning                 10.2     2.5     10.8     2.7      .77     .85        .23
Comprehension                    10.1     2.5     10.3     2.4      .72     .82        .08
Symbol Search                    10.4     2.5     11.5     2.8      .68     .80        .41
Picture Completion               10.3     2.9     12.1     3.1      .82     .84        .60
Cancellation                     10.2     3.0     11.3     3.0      .78     .79        .37
Information                      10.0     2.5     10.4     2.5      .83     .89        .16
Arithmetic                       10.1     2.8     10.7     2.5      .75     .79        .23
Word Reasoning                   10.2     2.5     11.0     2.6      .75     .82        .31
Block Design No Time Bonus       10.1     2.9     11.2     2.7      .76     .78        .39
Digit Span Forward               9.9      2.9     10.3     2.8      .72     .76        .14
Digit Span Backward              10.1     2.7     10.5     2.7      .67     .74        .15
Cancellation Random              10.0     2.9     11.0     2.9      .68     .72        .34
Cancellation Structured          10.2     2.8     11.0     2.8      .73     .76        .29
Verbal Comprehension             100.0    11.7    102.1    11.7     .89     .93        .18
Perceptual Reasoning             100.7    13.1    105.9    13.9     .85     .89        .39
Working Memory                   99.8     13.1    102.4    13.3     .85     .89        .20
Processing Speed                 102.4    12.6    109.5    15.2     .79     .86        .51
Full Scale                       101.0    11.7    106.6    12.5     .89     .93        .46

a Correlations were corrected for the variability of the standardization sample (Allen & Yen, 1979; Magnusson, 1967). Average stability coefficients across the five age bands were calculated with Fisher's z transformation.

Factor-Analytic Studies

Exploratory and confirmatory factor analyses were conducted to evaluate the internal structure of the WISC IV. Because the revision retained ten subtests from the WISC III, as well as including five new subtests designed to measure similar constructs, the WISC IV was expected to measure the same four cognitive domains as the WISC III (i.e., Verbal Comprehension, Perceptual Organization, Freedom from Distractibility, and Processing Speed; Wechsler, 1991). The initial step in examining the factor structure of the WISC IV was to determine if the pattern of obtained results matched the hypothesized four-factor structure. The stability of the factor structure was then examined across different age groups. Finally, the predicted model was tested against alternative models using confirmatory factor-analytic methods.

Table 3. Exploratory Factor Pattern Loadings for Core and Supplemental Subtests, Ages 6:0-16:11 (N = 1,525)

                      Four-Factor Model
                      Verbal          Perceptual   Working   Processing
Subtest               Comprehension   Reasoning    Memory    Speed
Similarities          -.71            -.13         -.02      -.02
Vocabulary            -.87            -.05         -.06      -.00
Comprehension         -.78            -.13         -.06      -.07
Information           -.71            -.08         -.11      -.06
Word Reasoning        -.73            -.09         -.07      -.01
Block Design          -.06            -.78         -.04      -.02
Picture Concepts      -.16            -.40         -.06      -.02
Matrix Reasoning      -.03            -.64         -.19      -.04
Picture Completion    -.32            -.60         -.26      -.02
Digit Span            -.00            -.03         -.67      -.05
Letter-Number Seq.    -.11            -.04         -.62      -.00
Arithmetic            -.14            -.18         -.51      -.03
Coding                -.02            -.01         -.05      -.70
Symbol Search         -.01            -.17         -.08      -.54
Cancellation          -.01            -.09         -.11      -.65

Note: See the WISC IV Technical and Interpretive Manual for the complete table and further discussion.

Exploratory Factor Analysis

Table 3 presents the factor analysis results for the core and supplemental subtests for all ages. The four-factor structure is clearly observed, and the primary loading of each subtest is found on the expected factor. On the Verbal Comprehension factor, a small secondary loading across ages is observed for Picture Completion.

To further examine the stability of the factor structure, the data in five age bands (6-7, 8-9, 10-11, 12-13, 14-16) were then analyzed separately. With minor variations, each age band supported the overall four-factor structure. The factor loading for Picture Concepts for ages 6-7 was evenly split between Verbal Comprehension and Perceptual Reasoning. The response processes of younger children on this task may require more verbal mediation than in older children. Above age 11, Arithmetic (clearly a working memory subtest at all age bands) has a small secondary loading on Verbal Comprehension and on Perceptual Reasoning. At age 10 and below, Information exhibited a small secondary loading on the Working Memory factor.
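The claim that each subtest has its primary loading on the expected factor can be checked directly from Table 3. The sketch below is only an illustration, not the authors' analysis: the loadings are typed in exactly as printed in Table 3 and compared by absolute value, the `EXPECTED` mapping is an assumption that follows the row grouping of the table, and the names `FACTORS`, `LOADINGS`, and `primary_factor` are hypothetical.

```python
FACTORS = (
    "Verbal Comprehension",
    "Perceptual Reasoning",
    "Working Memory",
    "Processing Speed",
)

# Pattern loadings as printed in Table 3, in the column order of FACTORS.
LOADINGS = {
    "Similarities": (-0.71, -0.13, -0.02, -0.02),
    "Vocabulary": (-0.87, -0.05, -0.06, -0.00),
    "Comprehension": (-0.78, -0.13, -0.06, -0.07),
    "Information": (-0.71, -0.08, -0.11, -0.06),
    "Word Reasoning": (-0.73, -0.09, -0.07, -0.01),
    "Block Design": (-0.06, -0.78, -0.04, -0.02),
    "Picture Concepts": (-0.16, -0.40, -0.06, -0.02),
    "Matrix Reasoning": (-0.03, -0.64, -0.19, -0.04),
    "Picture Completion": (-0.32, -0.60, -0.26, -0.02),
    "Digit Span": (-0.00, -0.03, -0.67, -0.05),
    "Letter-Number Seq.": (-0.11, -0.04, -0.62, -0.00),
    "Arithmetic": (-0.14, -0.18, -0.51, -0.03),
    "Coding": (-0.02, -0.01, -0.05, -0.70),
    "Symbol Search": (-0.01, -0.17, -0.08, -0.54),
    "Cancellation": (-0.01, -0.09, -0.11, -0.65),
}

# Assumed subtest-to-factor assignment, following the row grouping in Table 3.
EXPECTED = {}
for factor, subtests in zip(FACTORS, (
    ("Similarities", "Vocabulary", "Comprehension", "Information", "Word Reasoning"),
    ("Block Design", "Picture Concepts", "Matrix Reasoning", "Picture Completion"),
    ("Digit Span", "Letter-Number Seq.", "Arithmetic"),
    ("Coding", "Symbol Search", "Cancellation"),
)):
    for subtest in subtests:
        EXPECTED[subtest] = factor


def primary_factor(loadings):
    """Return the factor whose loading has the largest absolute value."""
    index = max(range(len(loadings)), key=lambda i: abs(loadings[i]))
    return FACTORS[index]


# Subtests whose largest loading is NOT on the assumed factor.
mismatches = [
    subtest
    for subtest, row in LOADINGS.items()
    if primary_factor(row) != EXPECTED[subtest]
]
print(mismatches)
```

With the values above, the list of mismatches is empty, which is consistent with the statement that every primary loading falls on the expected factor.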
The results from the confirmatory factor analysis of the total sample verified that, compared with reasonable alternative models, the four-factor model clearly fits the data best; this finding was also consistent across the four age bands.

Evidence of Validity: Relationships to Other Measures

Evidence of validity was examined by determining the relationship between the WISC IV and the following measures: WISC III, WPPSI III, WAIS III, WASI, WIAT II, Children's Memory Scale (CMS), Gifted Rating Scale (GRS), BarOn Emotional Quotient Inventory: Youth Version (BarOn EQ-i:YV), and Adaptive Behavior Assessment System-Second Edition (ABAS II). This report presents details of the WISC III/WISC IV study.

Correlations With the WISC III

Both the WISC IV and the WISC III were administered in counterbalanced order to 244 children ages 6-16; the test-retest interval was 5 to 42 days. Table 4 presents the means, standard deviations, corrected and uncorrected correlations, and standard differences. The corrected correlation is .87 between the WISC III VIQ and the WISC IV VCI, and .74 between the WISC III PIQ and the WISC IV PRI. The lower correlation between PIQ and PRI reflects important changes made to this composite in the WISC IV. Tasks that were primarily visual and spatial (Object Assembly, Picture Completion, Picture Arrangement, and Coding) were replaced with fluid reasoning tasks (Matrix Reasoning, Picture Concepts), making the PRI a stronger measure of fluid reasoning than the PIQ; for this reason, a moderate correlation was expected.

The WISC III FSIQ and the WISC IV FSIQ correlate highly (r = .89). As anticipated, the older WISC III norms provided slightly inflated estimates for today's children. The overall difference between the WISC III and WISC IV FSIQ scores is 2.5 points, with WISC III scores the higher of the two. Of the WISC III scores, processing speed tasks showed the most inflation; the least inflation was observed on the working memory tasks. As Table 4 shows, the WISC III PSI mean is 5.5 points higher than the WISC IV PSI mean, and the WISC III FDI mean is 1.5 points higher than the WISC IV WMI mean. These results are consistent with the Flynn Effect (Flynn, 1984, 1987).

Table 4. Correlations Between Subtest and Composite Scores on the WISC IV and WISC III

                      WISC IV                WISC III                       Corrected  Standard
Subtest/Composite     Mean (a)  SD     N     Mean (a)  SD     N     r12     r12        Difference (c)
Block Design          10.6      2.8    242   11.6      3.5    242   .77     .77        .32
Similarities          10.6      2.7    243   11.3      2.9    243   .75     .76        .25
Digit Span            10.4      3.1    241   10.5      3.3    241   .79     .77        .03
Picture Concepts      10.8      3.0    244   --        --     --    --      --         --
Coding                10.4      3.1    240   11.1      3.3    240   .77     .76        .22
Vocabulary            10.6      2.5    243   10.7      2.9    243   .78     .82        .04
Letter-Number Seq.    10.4      3.1    244   --        --     --    --      --         --
Matrix Reasoning      10.5      2.9    244   --        --     --    --      --         --
Comprehension         10.6      2.6    241   11.0      3.1    241   .60     .62        .14
Symbol Search         10.6      2.8    237   11.8      3.8    237   .68     .67        .36
Picture Completion    11.1      3.0    244   11.8      2.8    244   .62     .64        .24
Cancellation          9.6       3.2    239   --        --     --    --      --         --
Information           10.7      2.6    244   11.0      3.1    244   .82     .83        .10
Arithmetic            10.0      2.9    133   9.8       3.2    133   .74     .74        .07
Word Reasoning        10.7      2.6    244   --        --     --    --      --         --
VCI/VIQ               103.0     12.3   239   105.4     13.8   239   .83     .87        .18
PRI/PIQ               103.9     14.0   242   107.3     14.9   242   .73     .74        .24
WMI/FDI               101.5     15.3   240   103.0     15.9   240   .74     .72        .10
PSI/PSI               102.7     15.1   232   108.2     16.3   232   .81     .81        .35
FSIQ/FSIQ             104.5     14.0   233   107.0     14.4   233   .87     .89        .18
VCI/VCI               102.9     12.3   238   106.0     13.6   238   .85     .88        .24
PRI/POI               103.9     14.0   241   106.9     14.6   241   .70     .72        .21

Note: Correlations were computed separately for each order of administration in the counterbalanced design and corrected for the variability of the WISC IV standardization sample (Guilford & Fruchter, 1978).
a The values in the Mean columns are the average of the means of the two administration orders.
b The weighted average across both administration orders was obtained with Fisher's z transformation.
c The Standard Difference is the difference of the two test means divided by the square root of the pooled variance, computed using Cohen's (1996) Formula 10.4.
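The notes to Table 4 state that correlations from the two administration orders were combined with Fisher's z transformation. A minimal sketch of that kind of averaging follows. The report does not say how the orders were weighted, so weighting each z value by n - 3 is an assumption (a common convention), and the correlations and group sizes in the example are hypothetical, not values from the study.

```python
import math


def fisher_z_average(correlations, sample_sizes=None):
    """Average correlations on Fisher's z scale, then transform back to r.

    If sample_sizes is given, each z value is weighted by n - 3
    (an assumed convention); otherwise all values are weighted equally.
    """
    if sample_sizes is None:
        weights = [1.0] * len(correlations)
    else:
        weights = [n - 3 for n in sample_sizes]
    z_values = [math.atanh(r) for r in correlations]  # r -> z
    mean_z = sum(w * z for w, z in zip(weights, z_values)) / sum(weights)
    return math.tanh(mean_z)  # z -> r


# Hypothetical correlations and group sizes for the two administration orders.
r_by_order = [0.85, 0.91]
n_by_order = [120, 124]
combined_r = fisher_z_average(r_by_order, n_by_order)
print(round(combined_r, 3))
```

Averaging on the z scale rather than averaging the raw correlations matters most when the coefficients are high, because the sampling distribution of r becomes increasingly skewed as r approaches 1.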

WISC IV in Comparison to WISC III

Table 5 provides the expected ranges of WISC IV composite scores for selected WISC III IQ and Index scores. These ranges are relatively narrow near the middle of the IQ score distribution (i.e., 100) and wider at the upper and lower score levels. The WISC IV PRI and WISC III PIQ scores can be expected to differ more than the VCI and VIQ scores because of the changes in the construct measured by the PRI as compared to the PIQ. Note that these are 95% confidence intervals based on a non-clinical sample; children in special education and other clinical groups may fall outside these ranges when they are retested.

Practitioners should keep in mind that these are average differences, and that an individual child who has been administered the WISC III and is retested with the WISC IV may score more or less than 2.5 points lower on the FSIQ, as compared to his or her previous WISC III scores. When retesting clinical or special education students, many factors can contribute to score differences (for example, the compound effects of the disorder or disability with increased educational and environmental demands as the child ages).

Table 5. Ranges of Expected WISC IV Composite Scores for Selected WISC III Composite Scores

                       WISC IV Composite Score Range
WISC III IQ Score      VCI        PRI        FSIQ
55                     50-55      47-56      49-56
70                     65-70      63-70      65-70
85                     81-84      79-84      81-84
100                    97-98      95-98      96-98
115                    112-113    110-113    112-113
130                    126-129    125-129    126-129
145                    140-145    139-145    140-145

                       WISC IV Index Score Range
WISC III Index Score   VCI        PRI        WMI        PSI
55                     49-55      47-57      50a-58a    50a-54a
70                     65-69      63-70      65-72      61-68
85                     81-83      80-84      92-86      77-82
100                    96-98      95-98      97-100     93-96
115                    111-113    111-113    112-115    108-111
130                    126-128    125-129    126-131    123-126
145                    140-144    139-145    140-147    137-142

Note: Ranges are 95% confidence intervals based on linear equating of data (Angoff, 1984, Design II.B) for 244 children administered both tests in counterbalanced order.
a The range is truncated due to minimum obtainable Index scores.

Summary

This technical report presents some of the basic psychometric properties of the WISC IV, including information about the demographically representative standardization sample, internal consistency reliability, test-retest stability, factor structure, and the correlations with and score differences from the WISC III. The interested reader is referred to the WISC IV Technical and Interpretive Manual for further information.

References

Allen, M. J., & Yen, W. M. (1979). Introduction to measurement theory. Monterey, CA: Brooks/Cole.
Angoff, W. H. (1984). Scales, norms, and equivalent scores. Princeton, NJ: Educational Testing Service.
Cohen, J. (1996). Explaining psychological statistics. Pacific Grove, CA: Brooks/Cole.
Flynn, J. R. (1984). The mean IQ of Americans: Massive gains 1932 to 1978. Psychological Bulletin, 95(1), 29-51.
Flynn, J. R. (1987). Massive IQ gains in 14 nations: What IQ tests really measure. Psychological Bulletin, 101(2), 171-191.
Guilford, J. P., & Fruchter, B. (1978). Fundamental statistics in psychology and education (6th ed.). New York: McGraw-Hill.
Magnusson, D. (1967). Test theory. Reading, MA: Addison-Wesley.
Wechsler, D. (1991). Wechsler Intelligence Scale for Children-Third Edition. San Antonio, TX: The Psychological Corporation.