
學科能力測驗非選擇題閱卷一致性之探討


The Rating Consistency of General Scholastic Ability Test

Ming-Chiu Chang College Entrance Examination Center

Abstract

The rating consistency of the General Scholastic Ability Test (GSAT) has long been a matter of public concern. However, for a large-scale, high-stakes exam such as the GSAT, research on the rating consistency of its non-multiple-choice items is still insufficient. The purpose of this study is to verify the rating consistency of the GSAT using two approaches: generalizability theory and rating consensus. The results show that the GSAT, as a large-scale, high-stakes exam, has high rating consistency. This paper not only presents evidence of the good rating consistency of the GSAT but also provides a reference for monitoring and improving the rating consistency of high-stakes non-multiple-choice items.

Keywords: General Scholastic Ability Test, Rating Consistency

Ming-Chiu Chang, Staff Member, College Entrance Examination Center


2011

Generalizability theory (G theory; Brennan, 2001)

2004


consistency stability

split-half

Kuder-Richardson Cronbach α

Pearson

Spearman

Kendall coefficient of concordance Brown Glasswell Harland 2004

The New Standards Project, CRESST Vermont

ETS


40%~60%

80%~100% 0.7~0.8

Novak Herman Gearhart 1996

Brown et al., 2004; Novak et al., 1996

Generalizability Theory G

Brennan, 2001

G

observed score X, true score T, undifferentiated random error E: $X = T + E$

universe score

ANOVA

Brennan, 2001


facet

admissible condition of measurement

Brennan, 2001; Shavelson & Webb, 1991

Brennan, 2001

Brennan, 2001

1

p i r p objects

of measurement i r

n m

p I R 2

Brennan, 2001; Shavelson & Webb, 1991


fixed facet random facet

G

Shavelson & Webb, 1991

25 25 25

Shavelson & Webb, 1991

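generalizability coefficient:

$$E\hat{\rho}^2 = \frac{\hat{\sigma}^2_{p}}{\hat{\sigma}^2_{p} + \hat{\sigma}^2_{\delta}}$$

index of dependability:

$$\hat{\Phi} = \frac{\hat{\sigma}^2_{p}}{\hat{\sigma}^2_{p} + \hat{\sigma}^2_{\Delta}}$$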


2008

2008

1 99 100 /

1

99 21 12 231 150

100 18 12 222 143

1998


A B C

A+ A A- B+ B B- C+ C C-

0

2010

2012

4 0.5

1

2 99 100

2

9 18 27 >2 >5 >8

8 20 -- >2 >5 --


2000 3000

1

15-20

20

/

100


99 100

20000

2008 99 100

9

18 27

54 2

2009 8

2


2008 99 100

8

20 28

3

2 0

1. 2. 120 words

3


interchangeable

rating

Pearson

Brennan, 2001

p i r

p i r

p i r pirX


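$$
\begin{aligned}
X_{pir} = {} & \mu \\
& + (\mu_p - \mu) \\
& + (\mu_i - \mu) \\
& + (\mu_r - \mu) \\
& + (\mu_{pi} - \mu_p - \mu_i + \mu) \\
& + (\mu_{pr} - \mu_p - \mu_r + \mu) \\
& + (\mu_{ir} - \mu_i - \mu_r + \mu) \\
& + (X_{pir} - \mu_{pi} - \mu_{pr} - \mu_{ir} + \mu_p + \mu_i + \mu_r - \mu).
\end{aligned}
$$

Each effect other than $\mu$ has an expected value of 0, and the variance of $X_{pir}$ is

$$\sigma^2(X_{pir}) = \sigma^2_p + \sigma^2_i + \sigma^2_r + \sigma^2_{pi} + \sigma^2_{pr} + \sigma^2_{ir} + \sigma^2_{pir,e},$$

where $\sigma^2_{pir,e}$ is the residual component.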

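Table 3. ANOVA for the p × i × r design

| Source | Sum of squares | Degrees of freedom | Mean square | Expected mean square |
|---|---|---|---|---|
| p | $SS_p$ | $n_p - 1$ | $MS_p$ | $\sigma^2_{pir,e} + n_r\sigma^2_{pi} + n_i\sigma^2_{pr} + n_i n_r\sigma^2_{p}$ |
| i | $SS_i$ | $n_i - 1$ | $MS_i$ | $\sigma^2_{pir,e} + n_r\sigma^2_{pi} + n_p\sigma^2_{ir} + n_p n_r\sigma^2_{i}$ |
| r | $SS_r$ | $n_r - 1$ | $MS_r$ | $\sigma^2_{pir,e} + n_i\sigma^2_{pr} + n_p\sigma^2_{ir} + n_p n_i\sigma^2_{r}$ |
| p × i | $SS_{pi}$ | $(n_p - 1)(n_i - 1)$ | $MS_{pi}$ | $\sigma^2_{pir,e} + n_r\sigma^2_{pi}$ |
| p × r | $SS_{pr}$ | $(n_p - 1)(n_r - 1)$ | $MS_{pr}$ | $\sigma^2_{pir,e} + n_i\sigma^2_{pr}$ |
| i × r | $SS_{ir}$ | $(n_i - 1)(n_r - 1)$ | $MS_{ir}$ | $\sigma^2_{pir,e} + n_p\sigma^2_{ir}$ |
| p × i × r, e | $SS_{pir,e}$ | $(n_p - 1)(n_i - 1)(n_r - 1)$ | $MS_{pir,e}$ | $\sigma^2_{pir,e}$ |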
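The expected mean squares for the p × i × r design can be solved for the variance components one at a time. The following is a minimal Python sketch of that calculation, not the GENOVA procedure used in the study. The function name `variance_components`, the nested-list input layout, and the choice to set negative estimates to 0 are all assumptions made for illustration.

```python
from itertools import product


def variance_components(x):
    """Estimate variance components for a fully crossed p x i x r design.

    x[p][i][r] is the score of person p on item i from rater r
    (hypothetical layout; one score per cell).
    """
    n_p, n_i, n_r = len(x), len(x[0]), len(x[0][0])
    cells = list(product(range(n_p), range(n_i), range(n_r)))
    grand = sum(x[p][i][r] for p, i, r in cells) / len(cells)

    def mean_over(keep):
        """Mean score for every combination of the kept facet positions."""
        sums, counts = {}, {}
        for idx in cells:
            key = tuple(idx[k] for k in keep)
            sums[key] = sums.get(key, 0.0) + x[idx[0]][idx[1]][idx[2]]
            counts[key] = counts.get(key, 0) + 1
        return {key: sums[key] / counts[key] for key in sums}

    m_p, m_i, m_r = mean_over([0]), mean_over([1]), mean_over([2])
    m_pi, m_pr, m_ir = mean_over([0, 1]), mean_over([0, 2]), mean_over([1, 2])

    # Sums of squares: accumulate each squared effect over every cell.
    names = ("p", "i", "r", "pi", "pr", "ir", "pir,e")
    ss = {name: 0.0 for name in names}
    for p, i, r in cells:
        e_p = m_p[(p,)] - grand
        e_i = m_i[(i,)] - grand
        e_r = m_r[(r,)] - grand
        e_pi = m_pi[(p, i)] - m_p[(p,)] - m_i[(i,)] + grand
        e_pr = m_pr[(p, r)] - m_p[(p,)] - m_r[(r,)] + grand
        e_ir = m_ir[(i, r)] - m_i[(i,)] - m_r[(r,)] + grand
        e_res = x[p][i][r] - grand - e_p - e_i - e_r - e_pi - e_pr - e_ir
        for name, e in zip(names, (e_p, e_i, e_r, e_pi, e_pr, e_ir, e_res)):
            ss[name] += e * e

    df = {
        "p": n_p - 1, "i": n_i - 1, "r": n_r - 1,
        "pi": (n_p - 1) * (n_i - 1),
        "pr": (n_p - 1) * (n_r - 1),
        "ir": (n_i - 1) * (n_r - 1),
        "pir,e": (n_p - 1) * (n_i - 1) * (n_r - 1),
    }
    ms = {name: ss[name] / df[name] for name in names}

    # Solve the expected-mean-square equations for each component.
    est = {
        "pir,e": ms["pir,e"],
        "pi": (ms["pi"] - ms["pir,e"]) / n_r,
        "pr": (ms["pr"] - ms["pir,e"]) / n_i,
        "ir": (ms["ir"] - ms["pir,e"]) / n_p,
        "p": (ms["p"] - ms["pi"] - ms["pr"] + ms["pir,e"]) / (n_i * n_r),
        "i": (ms["i"] - ms["pi"] - ms["ir"] + ms["pir,e"]) / (n_p * n_r),
        "r": (ms["r"] - ms["pr"] - ms["ir"] + ms["pir,e"]) / (n_p * n_i),
    }
    # Assumption: negative estimates are reported as 0.
    return {name: max(value, 0.0) for name, value in est.items()}


# Made-up scores for 3 persons x 2 items x 2 raters (illustration only).
demo = [
    [[4, 5], [6, 6]],
    [[2, 3], [5, 4]],
    [[7, 7], [8, 9]],
]
for source, value in variance_components(demo).items():
    print(source, round(value, 4))
```

The design needs at least two persons, two items, and two raters, otherwise some degrees of freedom are zero and the mean squares are undefined.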


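For the p × I × R design, the G coefficient $E\hat{\rho}^2$ and the D coefficient $\hat{\Phi}$ are obtained from the estimated variance components and the sample sizes $n_i$ and $n_r$:

$$\hat{\sigma}^2_{\delta} = \hat{\sigma}^2_{pI} + \hat{\sigma}^2_{pR} + \hat{\sigma}^2_{pIR} = \frac{\hat{\sigma}^2_{pi}}{n_i} + \frac{\hat{\sigma}^2_{pr}}{n_r} + \frac{\hat{\sigma}^2_{pir,e}}{n_i n_r},$$

$$E\hat{\rho}^2 = \frac{\hat{\sigma}^2_{p}}{\hat{\sigma}^2_{p} + \hat{\sigma}^2_{\delta}} = \frac{\hat{\sigma}^2_{p}}{\hat{\sigma}^2_{p} + \dfrac{\hat{\sigma}^2_{pi}}{n_i} + \dfrac{\hat{\sigma}^2_{pr}}{n_r} + \dfrac{\hat{\sigma}^2_{pir,e}}{n_i n_r}},$$

$$\hat{\sigma}^2_{\Delta} = \hat{\sigma}^2_{I} + \hat{\sigma}^2_{R} + \hat{\sigma}^2_{pI} + \hat{\sigma}^2_{pR} + \hat{\sigma}^2_{IR} + \hat{\sigma}^2_{pIR} = \frac{\hat{\sigma}^2_{i}}{n_i} + \frac{\hat{\sigma}^2_{r}}{n_r} + \frac{\hat{\sigma}^2_{pi}}{n_i} + \frac{\hat{\sigma}^2_{pr}}{n_r} + \frac{\hat{\sigma}^2_{ir}}{n_i n_r} + \frac{\hat{\sigma}^2_{pir,e}}{n_i n_r},$$

$$\hat{\Phi} = \frac{\hat{\sigma}^2_{p}}{\hat{\sigma}^2_{p} + \hat{\sigma}^2_{\Delta}} = \frac{\hat{\sigma}^2_{p}}{\hat{\sigma}^2_{p} + \dfrac{\hat{\sigma}^2_{i}}{n_i} + \dfrac{\hat{\sigma}^2_{r}}{n_r} + \dfrac{\hat{\sigma}^2_{pi}}{n_i} + \dfrac{\hat{\sigma}^2_{pr}}{n_r} + \dfrac{\hat{\sigma}^2_{ir}}{n_i n_r} + \dfrac{\hat{\sigma}^2_{pir,e}}{n_i n_r}}.$$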

2010


10 12

10 12 4

14 16

14 16 10 10

2 4 2 25

GENOVA (Crick & Brennan, 1983)

4 99 100

Brown 2004 40% 60% Brown

9

0 28

Brown


4 99

50.3% 25.4%

8 20 100

99

90%

Table 4

Year 99

26.6%  32.9%  33.6%  50.3%  25.4%
61.7%  58.7%  61.0%  48.4%  71.0%
11.7%   8.4%   5.4%   1.3%   3.5%

Year 100

27.8%  19.3%  11.7%  46.7%  23.3%
63.4%  74.1%  85.0%  51.3%  72.8%
 8.8%   6.6%   3.4%   2.0%   4.0%

Pearson

r 5 100

r=0.689 0.7

Brown 2004 0.7 0.8

0.9


Table 5. Pearson's r

Year 99:   .746  .846  .778  .901  .847
Year 100:  .689  .809  .706  .916  .881
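Rating consensus between two raters is summarized in this section by agreement percentages and by Pearson's r. The sketch below shows one way such figures could be computed for a single item. The function name `rating_consensus`, the three categories (exact, within tolerance, beyond tolerance), the `tolerance` parameter, and the sample scores are all hypothetical; the study's own category definitions and thresholds are not reproduced here.

```python
from math import sqrt


def rating_consensus(first, second, tolerance=1):
    """Summarize agreement between two raters who scored the same papers.

    first, second: equal-length lists of scores (hypothetical input layout).
    tolerance: largest score gap still counted as "adjacent" (an assumption).
    """
    if len(first) != len(second) or not first:
        raise ValueError("both raters must score the same non-empty set")
    n = len(first)
    gaps = [abs(a - b) for a, b in zip(first, second)]

    # Share of papers in each agreement category; the three shares sum to 1.
    exact = sum(g == 0 for g in gaps) / n
    adjacent = sum(0 < g <= tolerance for g in gaps) / n
    discrepant = sum(g > tolerance for g in gaps) / n

    # Pearson's r from its definition (undefined if either rater never varies).
    mean_a, mean_b = sum(first) / n, sum(second) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(first, second))
    var_a = sum((a - mean_a) ** 2 for a in first)
    var_b = sum((b - mean_b) ** 2 for b in second)
    pearson = cov / sqrt(var_a * var_b)

    return {"exact": exact, "adjacent": adjacent,
            "discrepant": discrepant, "pearson": pearson}


# Made-up scores from two raters on six papers (illustration only).
rater_1 = [5, 6, 7, 4, 8, 3]
rater_2 = [5, 7, 7, 6, 8, 2]
summary = rating_consensus(rater_1, rater_2, tolerance=1)
print(summary)
```

With these made-up scores, three of the six papers receive identical scores, two differ by one point, and one differs by two points, so the exact, adjacent, and discrepant shares are 1/2, 1/3, and 1/6.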

6

100

99

7 8

p i r


6

99 100

5.18 5.23 1.48 1.48

.58 .58 .16 .16

7.66 7.31 3.27 3.67

.43 .41 .18 .20

11.66 12.28 4.37 4.12

.43 .45 .16 .15

3.09 4.16 2.10 2.37

.39 .52 .26 .30

6.89 8.03 4.55 4.66

.34 .40 .23 .23

7 p i r

p

p i r 0

2

i r 0 2

p i 28%


8

70%

2 0

i r 0

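Table 7. Variance component estimates for the p × i × r design

| Source | Year 99: variance component | df | % of total | Year 100: variance component | df | % of total |
|---|---|---|---|---|---|---|
| p | .0158 | 19999 | 39.74% | .0140 | 19999 | 32.09% |
| i | .0051 | 2 | 12.85% | .0082 | 2 | 18.70% |
| r | .0000 | 1 | 0% | .0000 | 1 | 0% |
| p × i | .0112 | 39998 | 28.07% | .0129 | 39998 | 29.42% |
| p × r | .0015 | 19999 | 3.78% | .0018 | 19999 | 4.08% |
| i × r | .0000 | 2 | 0% | .0000 | 2 | 0% |
| p × i × r | .0062 | 39998 | 15.56% | .0069 | 39998 | 15.71% |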

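Table 8. Variance component estimates for the p × i × r design

| Source | Year 99: variance component | df | % of total | Year 100: variance component | df | % of total |
|---|---|---|---|---|---|---|
| p | .0492 | 19999 | 75.23% | .0594 | 19999 | 72.79% |
| i | .0007 | 1 | 1.08% | .0070 | 1 | 8.58% |
| r | .0000 | 1 | 0% | .0000 | 1 | 0% |
| p × i | .0070 | 19999 | 10.70% | .0089 | 19999 | 10.91% |
| p × r | .0022 | 19999 | 3.36% | .0012 | 19999 | 1.47% |
| i × r | .0000 | 1 | 0% | .0000 | 1 | 0% |
| p × i × r | .0063 | 19999 | 9.63% | .0051 | 19999 | 6.25% |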
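The variance components in Table 8 can be plugged into the G and D coefficient formulas for the p × I × R design. The sketch below does this in Python, assuming two items and two raters (the item and rater degrees of freedom in the table are both 1). The function name `d_study` and the dictionary keys are hypothetical; the numbers are copied from the year 99 and year 100 columns of Table 8.

```python
def d_study(vc, n_i, n_r):
    """Return (G coefficient, D coefficient) for a p x I x R D study.

    vc: variance components keyed by source (hypothetical key names).
    n_i, n_r: number of items and raters assumed in the D study.
    """
    # Relative error variance: only interactions involving persons.
    relative_error = (vc["pi"] / n_i + vc["pr"] / n_r
                      + vc["pir,e"] / (n_i * n_r))
    # Absolute error variance: adds the item, rater, and item-by-rater terms.
    absolute_error = (relative_error + vc["i"] / n_i + vc["r"] / n_r
                      + vc["ir"] / (n_i * n_r))
    g_coefficient = vc["p"] / (vc["p"] + relative_error)
    d_coefficient = vc["p"] / (vc["p"] + absolute_error)
    return g_coefficient, d_coefficient


# Variance components copied from Table 8 (years 99 and 100).
table8_year99 = {"p": 0.0492, "i": 0.0007, "r": 0.0, "pi": 0.0070,
                 "pr": 0.0022, "ir": 0.0, "pir,e": 0.0063}
table8_year100 = {"p": 0.0594, "i": 0.0070, "r": 0.0, "pi": 0.0089,
                  "pr": 0.0012, "ir": 0.0, "pir,e": 0.0051}

g99, phi99 = d_study(table8_year99, n_i=2, n_r=2)
g100, phi100 = d_study(table8_year100, n_i=2, n_r=2)
print(f"year 99:  G = {g99:.2f}, D = {phi99:.2f}")    # G = 0.89, D = 0.88
print(f"year 100: G = {g100:.2f}, D = {phi100:.2f}")  # G = 0.90, D = 0.86
```

Under these assumptions the G coefficients come out at about 0.89 and 0.90, in line with the values quoted for years 99 and 100. Changing `n_i` or `n_r` shows how the coefficients would move with more or fewer items or raters.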

4 5


2 3 99 0.71

100 0.69

.023-.050

99 3

5 100 3 6

Shavelson Webb 1991 0.80

Figure 4. Year 99, p × I × R design. [Line chart of G and D coefficients; series G-2, G-3, D-2, D-3; horizontal axis 3 to 6; vertical axis 0.60 to 0.85.]

Figure 5. Year 100, p × I × R design. [Line chart of G and D coefficients; series G-2, G-3, D-2, D-3; horizontal axis 3 to 6; vertical axis 0.60 to 0.85.]


6 7

99

0.89 100 0.90

2 1

Shavelson Webb 1991 .80

Figure 6. Year 99, p × I × R design. [Line chart of G and D coefficients; series G-1, G-2, D-1, D-2; horizontal axis 1 to 3; vertical axis 0.70 to 0.95.]

Figure 7. Year 100, p × I × R design. [Line chart of G and D coefficients; series G-1, G-2, D-1, D-2; horizontal axis 1 to 3; vertical axis 0.70 to 0.95.]


10% p i 30%

p i 10%

9 99 .839

100 .866 100

.575

2008

2008


9

99

.540 -- .431 .519

100 .478 -- .418 .575

99 .839 --

100 .866 --

p i r

0

2 1 0.80

p i r

p i

3 6

0.80

p i 28% 10%


2 150

item response theory Rasch

2010


2008 9

2011

2012

2010

2010 NAER

-

2008

4 161-186

2004

337 368

Brennan, R. L. (2001). Generalizability theory. New York: Springer.

Brown, G. T. L., Glasswell, K., & Harland, D. (2004). Accuracy in the scoring of writing: Studies of reliability and validity using a New Zealand writing assessment system. Assessing Writing, 9, 105-121.

Crick, J. E., & Brennan, R. L. (1983). Manual for GENOVA: A generalized analysis of variance system (American College Testing Technical Bulletin No. 43). Iowa City, IA: ACT, Inc.

Novak, J. R., Herman, J. L., & Gearhart, M. (1996). Establishing validity for performance-based assessments: An illustration for collections of student writing. The Journal of Educational Research, 89(4), 220-233.

Shavelson, R. J., & Webb, N. M. (1991). Generalizability theory: A primer. Newbury Park: SAGE.


A A+ A A-

B B+ B B-

C C+ C C-

40%

1.

2. 3. 4.

5.

1.

2. 3. 4.

5.

1.

2. 3.

4.

5.

20%

1.

2.

3. 4. 5.

1. 2. 3. 4. 5.

1. 2. 3. 4. 5.

20%

1. 2. 3. 4. 5.

1. 2. 3. 4. 5.

1. 2. 3. 4. 5.

20%

1. 2. 3. 4. 5.

1.

2. 3. 4. 5.

1. 2. 3. 4. 5.

2012


5-4

3

2-1

0

5-4

3

2-1

0

5-4

3

2-1

0

2

1

0

2012