Dramatically Better Assessment Systems: Advice for RTTT “Common Assessment” RFP

DESCRIPTION

Dramatically Better Assessment Systems: Advice for RTTT “Common Assessment” RFP. Brian Gong, Center for Assessment. Presentation for the Input Meetings sponsored by the U.S. Department of Education for the “Common Assessment” RFP, “Race to the Top” funding. November 17, 2009, Atlanta, GA.

TRANSCRIPT

Page 1: Dramatically Better Assessment Systems:  Advice for RTTT “Common Assessment” RFP

Dramatically Better Assessment Systems: Advice for RTTT “Common Assessment” RFP

Brian Gong
Center for Assessment

Presentation for the Input Meetings Sponsored by the U.S. Department of Education for the “Common Assessment” RFP, “Race to the Top” funding

November 17, 2009
Atlanta, GA

Page 2: Dramatically Better Assessment Systems:  Advice for RTTT “Common Assessment” RFP

Gong – USED Common Assessment RFP Input Mtg – 11/17/09

My Main Point

The future of assessment in the United States will be shaped by what gets funded in this “Common Assessment” RFP.
USED should shape the RFP and fund it with a longer-term view of having in place dramatically better assessment systems in ten years.
  When USED has to compromise, choose longer-term investments over short-term gains
  Say very clearly what you want in the RFP
  Help foster good responses to the RFP

Page 3: Dramatically Better Assessment Systems:  Advice for RTTT “Common Assessment” RFP

Personal recommendations

Hedge bets by funding multiple ways to do multi-state common assessment, especially high school
Invest in six “game changers” that could make assessment dramatically better within a decade, but should not be framed as being operationally implemented on the short time schedule (“2012”)
Help foster good responses to the RFP and after

Page 4: Dramatically Better Assessment Systems:  Advice for RTTT “Common Assessment” RFP

Short-term and Longer-term Investments

Common Assessment RFP should fund
  For implementation by 2012, what we already know how to do in large-scale assessment but
    With new set of content standards
    With groups of multiple states (difficult to do)
  For development through 2015, what we do not know how to do well at scale, but which has potential to lead to dramatically better assessment systems

Page 5: Dramatically Better Assessment Systems:  Advice for RTTT “Common Assessment” RFP

Implementing a new multi-state summative assessment takes years

Timeline, 2009 to 2015:
2009: 50 state systems, NAEP, TIMSS, PISA, PIRLS, many LEA systems, NRTs, ACT/SAT, colleges’ tests, etc.
Award RFP(s) (9/2009)
2009-10: Test specifications; develop items; use specs, reports, equating design, administration agreements, etc.
2010-11: Pilot test items, promulgate high-stakes policies, etc.
2011-12: First operational administration & reporting, etc.
2012-13: Second operational administration; first report using growth, etc.
2014-15: Fourth operational administration; first graduating high school class, etc.

Fast Implementation of RFP: 2012 (e.g., multi-state assessments with common content standards, “Peer Review” quality of things we know how to do)

And aligning curriculum, instruction, accountability, and supports takes longer.

Page 6: Dramatically Better Assessment Systems:  Advice for RTTT “Common Assessment” RFP

RFP: Specify, Specify, Specify

USED should specify its purpose, theory of action, and how the assessment results will be used so responders know the big picture
Specify what is wanted as a deliverable and set the parameters for responders’ creative proposals (e.g., time schedule)
Specify the means by which an outcome should be achieved if USED really wants a specific means

Page 7: Dramatically Better Assessment Systems:  Advice for RTTT “Common Assessment” RFP

Some Model Systems for 2012

Cross-state comparisons
Standards-based interpretation
Inform better instruction
Rapid turn-around
Measure growth
Measure student performance for teacher/administrator evaluation

Page 8: Dramatically Better Assessment Systems:  Advice for RTTT “Common Assessment” RFP

Cross-state Comparisons (2012)

Purpose, TOA, Use: Hold students, schools, LEAs, and states accountable to a common performance standard by triggering sanctions
Outcome: Statistically robust reports of performance on common metric with no “wiggle room” – stronger than current NAEP mapping studies
Means: Same content standards, same test specifications, same performance standards, single assessment across states, same administration procedures, strong equating across years

Page 9: Dramatically Better Assessment Systems:  Advice for RTTT “Common Assessment” RFP

Standards-based Interpretation (2012)

Purpose, TOA, Use: Promote equity through holding students and schools to common opportunity-to-learn (content standards) and minimal performance standards
Outcome: Valid reports of performance related to the designated standards
Means: Aligned, grade-level only (?), matrix-sampled (?), high school (?), SWD (?), ELL (?)

Page 10: Dramatically Better Assessment Systems:  Advice for RTTT “Common Assessment” RFP

Inform Better Instruction (2012)

Purpose, TOA, Use: Assess more complex and applied learning (monitor); model and encourage instruction (drive)
Outcome: Incrementally better, more valid and reliable measurement of higher-order, complex student performances (?); more widespread “good” instruction (?)
Means: Curriculum-embedded assessments (e.g., standardized units, portfolios, graduation projects) (?); curricula with (local) matched assessments (?)

Page 11: Dramatically Better Assessment Systems:  Advice for RTTT “Common Assessment” RFP

Rapid Turn-around (2012)

Purpose, TOA, Use: Promote improvement through rapid feedback to inform actions
Outcome: Reports of performance useful to decisions and actions, in appropriate timeframe (distinguish actions that are multi-year or annual monitoring from annual rich content analysis from shorter-term uses, down to course grades and student instructional feedback)
Means: Trade-off speed for quality, cost: greater reliance on multiple-choice/machine-scored; trade-off centralized standardization for complex performances, local scoring; ignore administration variations (e.g., missing students)

Page 12: Dramatically Better Assessment Systems:  Advice for RTTT “Common Assessment” RFP

Measure Growth (2012)

Purpose, TOA, Use: Accountability, program improvement, teacher accountability?
Outcome: Report of student progress over time related to what is/could be/should be: grade-level standards (?), own starting point (?), other students (?), program supports (?), “teacher’s contribution” (?); how to use in accountability (?)
Means: Out-of-level testing (?), adaptive testing (?), vertical [moderated] scales (?), use math to predict reading for greater reliability (?), pre- post-measures within year (?)

Page 13: Dramatically Better Assessment Systems:  Advice for RTTT “Common Assessment” RFP

Teacher/administrator evaluation (2012)

Purpose, TOA, Use: Improve teacher quality by providing feedback (?); use in accountability or other high-stakes decisions (?)
Outcome: Changes in student performance associated with (attributable to ?) specific teachers, administrators, programs
Means: many statistical approaches (check assumptions, limitations) (?); combine with other information (?)

Page 14: Dramatically Better Assessment Systems:  Advice for RTTT “Common Assessment” RFP

Personal recommendations

Hedge bets on multiple ways to do multi-state common assessment, especially high school
Invest in six “game changers” that could make assessment dramatically better within a decade, but should not be framed as being operationally implemented on the short time schedule (“2012”)
Help foster good responses to the RFP and after

Page 15: Dramatically Better Assessment Systems:  Advice for RTTT “Common Assessment” RFP

Invest in “Game Changers” - 1

Develop technology that provides evidence of more complex knowledge and skills (i.e., more valid)
  E.g., interactive simulations, non-academic knowledge and skills
  Only use technology with an evidence-centered design approach to maintain construct relevance, most students

Page 16: Dramatically Better Assessment Systems:  Advice for RTTT “Common Assessment” RFP

Invest in “Game Changers” - 2

Develop technology for validity
Develop complex performance assessment
  Specify extended learning and content, real application contexts, student choice
  Develop credible (local) administration and scoring
  Include all students (and teachers)
  Develop means of certifying validity and reliability, and of combining with other evidence

Page 17: Dramatically Better Assessment Systems:  Advice for RTTT “Common Assessment” RFP

Invest in “Game Changers” - 3

Develop technology for validity
Develop complex performance assessment
Develop curricula that specify “what” and “how” of learning, and associated local assessment systems
  Interim and formative assessments are needed to inform learning directly
  Real assessment problem is informing “What should be done next?” – cannot solve without curriculum and teacher/administrator expertise

Page 18: Dramatically Better Assessment Systems:  Advice for RTTT “Common Assessment” RFP

Invest in “Game Changers” - 4

Develop technology for validity
Develop complex performance assessment
Develop curricula, local assessment systems
Develop new measurement models and technical criteria for assessments of complex knowledge and skills
  We know current models’ assumptions and limitations; do not impose on innovations! (Example: reliability vs. validity of complex performances; cognitive vs. unidimensional models)

Page 19: Dramatically Better Assessment Systems:  Advice for RTTT “Common Assessment” RFP

Invest in “Game Changers” - 5

Develop technology for validity
Develop complex performance assessment
Develop curricula, comprehensive assessment systems
Develop new measurement models and technical criteria
Develop better accountability models and support better use of assessment results for program improvement
  Assessments, assessment use, and instruction are being distorted by our current accountability model

Page 20: Dramatically Better Assessment Systems:  Advice for RTTT “Common Assessment” RFP

Invest in “Game Changers” - 6

Develop technology for validity
Develop complex performance assessment
Develop curricula, local assessment systems
Develop new measurement models and technical criteria
Develop better models of accountability and program improvement
Develop model specifications for a coherent comprehensive assessment system that incorporates the above five
  e.g., NAEP, state,

Page 21: Dramatically Better Assessment Systems:  Advice for RTTT “Common Assessment” RFP

Invest in “Game Changers” - 7

Technology for validity
Complex performance assessment
Curricula & comprehensive assessment systems
New measurement models and technical criteria
Better accountability models and support for program improvement

Page 22: Dramatically Better Assessment Systems:  Advice for RTTT “Common Assessment” RFP

Personal Recommendation - 2

Invest in five assessment “game changers”
Hedge bets on multiple ways to do multi-state “2012” common assessment, especially high school
  Good current models: all MC, mixed MC-CR, computer-based, end-of-course, survey, etc.
  Interwoven with state policies (e.g., high school exit requirements)
Help foster good responses to the RFP and after

Page 23: Dramatically Better Assessment Systems:  Advice for RTTT “Common Assessment” RFP

Hedge bets on 2012 assessment

End of course AND Grade 11 survey
Computer-based AND paper & pencil
All multiple choice AND modest short CR AND larger amount and more extensive CR
Fund multiple “common content standards”
  To find out costs and benefits of multi-state common assessments
  Because no one set of content standards is clearly superior
  Because no one approach is clearly superior
  Because reporting on a common score metric is less important

Page 24: Dramatically Better Assessment Systems:  Advice for RTTT “Common Assessment” RFP

RFP Portfolio of Awards

Multiple (around 8) strong models that represent advances that can be implemented strongly by 2012 and that help get to the longer-term goal
  Consider strategy: Do not fund strong models that will be adopted even if not funded
Multiple (perhaps 12) strong “game changer” awards

Page 25: Dramatically Better Assessment Systems:  Advice for RTTT “Common Assessment” RFP

Personal Recommendation - 3

Invest in four assessment “game changers”
Hedge bets on multiple ways to do multi-state common assessment, especially high school
Help foster good responses to the RFP and after
  If USED wants certain outcomes of states working together, then promote leadership to make that happen among states, NGOs, test vendors, etc.

Page 26: Dramatically Better Assessment Systems:  Advice for RTTT “Common Assessment” RFP

Fostering Strong RFP Responses

Provide clear RFP specs and different awards for “2012 implementation” and “game changers”
If USED wants states to have vendor partners in their RFP responses, need to indicate that early and facilitate it well (vs. states’ issuing an RFP)
USED should think about what states who don’t get RTTT common assessment funds will do
USED should think how what it funds will be adopted after RTTT and how that will shape what is available in the future

Page 27: Dramatically Better Assessment Systems:  Advice for RTTT “Common Assessment” RFP

Envision Intended & Unintended Consequences

What if in 2012 there were five widely used assessments, all aligned to the same common content standards
  Four were commercially available from current test publishers (like the Achieve/Pearson Algebra 2 end-of-course exam)
  One was available by joining a consortium (like the WIDA ELP exams)
  States were purchasing elementary math from one vendor and high school English from another vendor
What if there were only one assessment being used? What if there were 46?

Page 28: Dramatically Better Assessment Systems:  Advice for RTTT “Common Assessment” RFP

Envision Intended & Unintended Consequences – 2

What if in 2012 each commercially available assessment came in five versions:
  An all multiple-choice, computer-administered short form that took 20 minutes and cost $3 per student
  An all multiple-choice, computer or paper & pencil form that took 50 minutes and cost $7 per student
  A computer or p & p version that took 120 minutes, had 40 multiple choice, 8 short constructed response, and 4 extended constructed response items and cost $15 per student
  A computer or p & p version that took 150 minutes, had 40 multiple choice, 4 extended constructed response, and 2 long constructed response items and cost $60 per student
  A version that included a standardized test like option 3 and had a curriculum-embedded project and other performance evidence that was centrally audited and cost $200 per student
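
The cost spread across these five hypothetical versions can be made concrete with a little arithmetic. The sketch below uses only the per-student costs and testing times stated on the slide. The enrollment of 500,000 tested students is an assumption introduced purely for illustration, as are the names `VERSIONS`, `total_cost`, and `cost_multiple`. The slide gives no testing time for the fifth version, so it is left as `None`.

```python
# Per-student cost (USD) and testing time (minutes) for the five hypothetical
# versions on the slide. The slide gives no testing time for version 5.
VERSIONS = [
    ("1: short MC, computer", 3, 20),
    ("2: MC, computer or paper", 7, 50),
    ("3: MC + short/extended CR", 15, 120),
    ("4: MC + extended/long CR", 60, 150),
    ("5: option 3 + audited project", 200, None),
]

# Assumption for illustration only: number of students tested.
HYPOTHETICAL_STUDENTS = 500_000


def total_cost(cost_per_student, students):
    """Total cost in dollars of testing `students` students once."""
    return cost_per_student * students


def cost_multiple(cost_per_student, baseline=VERSIONS[0][1]):
    """How many times more a version costs than the cheapest one."""
    return cost_per_student / baseline


if __name__ == "__main__":
    for name, cost, minutes in VERSIONS:
        time_text = f"{minutes} min" if minutes is not None else "time not given"
        print(
            f"{name:32s} ${cost:>3}/student, {time_text:>14s}, "
            f"total ${total_cost(cost, HYPOTHETICAL_STUDENTS):,}, "
            f"{cost_multiple(cost):.1f}x version 1"
        )
```

Under that assumed enrollment, a single administration would run from $1.5 million for the first version to $100 million for the fifth, which is the scale of difference the scenario asks USED to anticipate.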

Page 29: Dramatically Better Assessment Systems:  Advice for RTTT “Common Assessment” RFP

For more information:

Center for Assessment
www.nciea.org

Brian Gong
[email protected]