

ColegauCymru Uned 7 Cae Gwyrdd Greenmeadow Springs Tongwynlais, Caerdydd CF15 7AB Ff: 029 2052 2500 E: [email protected] W: www.colegaucymru.ac.uk

CollegesWales Unit 7 Cae Gwyrdd Greenmeadow Springs Tongwynlais, Cardiff CF15 7AB T: 029 2052 2500 E: [email protected] W: www.collegeswales.ac.uk

Teacher-led assessment in post-14 education & training:

A summary of the evidence

March 2016

Commissioned by

ColegauCymru / CollegesWales Page 2 of 71 Teacher-Led Assessment in post-14 education & training: A literature review March 2016

Abstract

This literature review was commissioned to provide an in-depth understanding of the use of teacher-led assessment in post-14 qualifications internationally. The literature surveyed from the past twenty years is voluminous and complex, covering the many different uses of assessment by teachers and others, in many countries, and across all phases of education and training. It proved difficult to isolate the later stages of education from the earlier, and formative assessment (for learning) from summative assessment (of learning).

A clear pattern emerges of a gradual move over the last twenty years towards fully integrated systems of teaching and learning, assessment, evaluation and accountability. That integration allows countries to manage the interactions and tensions between those elements and the subsequent policy trade-offs required. That pattern of development has taken place over many years, in countries that are seen to be performing well, with consistent strategy and policies. Prominent among them are Finland, Australia and, perhaps less obviously and more recently, Scotland.

It is not straightforward simply to adopt aspects of other countries’ systems. However, there is a clear pattern of preferences and strategic direction shared by a number of high performing countries, which suggests lessons to be learned.

Introduction

This project was commissioned by Qualifications Wales to contribute to the evaluation and development of assessment policy in Wales. The focus of the paper is on assessment in schools and colleges post-14, in particular on teacher-led assessment (TLA) and its potential contribution to summative assessment for national qualifications. Countries with good practice have been identified by the characteristics of their systems and their reputation reported in the literature, and also in part by their relative performance, especially as indicated by PISA results.
Although PISA is a pre-16 test it is nevertheless seen as evidence of an effective education system as a whole, and data are readily available for participating countries.

In the context of its contribution to national qualifications, TLA has often been subject to claims that it ‘lacks rigour’, and recent changes in England post-14 are assigning greater importance to external, terminal examinations. Professor Donaldson’s Review of the Curriculum in Wales points to a possible future direction for education, recommending that teacher assessment remains the “main vehicle for assessment before qualifications, and the frequency of external, standardised testing be limited in view of its impact on the curriculum and teaching”. With these issues in mind, it is an opportune time to be encouraging good quality TLA, exploring its benefits and challenges and seeking to learn from the experiences of others.

Background

Assessment at one level is simply the process of making judgements about the quality and extent of learning. However, the literature is extensive and complex due to the large number of purposes to which assessment is put, the ways in which those judgements are made and by whom, and how they are used and interpreted.

The term ‘teacher-led assessment’ (TLA) is used throughout this document to refer to any assessment activity that is carried out by a teacher or other practitioner who is also involved in supporting candidates’ learning, and that forms a component of a qualification awarded by a recognised awarding body. Given that interpretation of TLA, summative assessment, or the assessment of learning (AoL), becomes the prime focus. It is used to summarise what pupils know, understand or can do in order to report achievement and progress, usually in the context of achieving qualifications.

Although formative assessment, often described as assessment for learning (AfL), is of secondary interest in this remit, it is very clear from the literature that formative and summative assessment interact in important, complex and subtle ways. Teachers’ informal observations or notes on learners’ work can and should contribute to their summative judgements. Developing teachers’ skills in formative assessment is essential and will certainly help them contribute to summative assessment. Summative assessment, when used in high-stakes contexts to evaluate and rank learners, teachers, departments, institutions, local authorities and indeed nations, inevitably affects the design of systems, the way teachers and learners behave within them and their use of formative assessment. Finally, the role of formative assessment cannot be ignored in the improvement of learning; it is perhaps the crucial element.
Due to the volume of work available and the repetition within it of common themes and arguments, this literature search has been limited to work published after 1996.

Purpose

To provide a broad and in-depth understanding of research studies into teacher-led assessment in qualifications, to inform a wider thematic review conducted by Qualifications Wales.


Objectives

To undertake and compile a desk-based literature review of research about teacher-led assessment.

To produce a report for Qualifications Wales.

Lines of Inquiry

The perceived position and role of teacher-led assessment within formal qualifications in the UK, including examples from other parts of the world that effectively utilise teacher-led assessment, with particular reference to Scotland.

Changes in perception over time.

Benefits to be derived from the use of teacher-led assessment.

Risks and challenges associated with TLA.

Examples of good practice that emerge from the research studies.

Suggestions for the development or propagation of good practice or for improvement of existing practice, with examples.

Broad findings in the research related to the fitness for purpose of the assessment instruments used in UK qualifications. Are there other notable examples from other countries?

Implications for awarding bodies in terms of assessment design and quality assurance.

Implications for regulators, from an assessment development and qualification regulation perspective, that emerge from the research studies.

Where possible these are reported on in individual sections. Some are integrated with others and some are covered in the main body of the text.


Sources

The databases interrogated as the basis for this desk research include:

The British Education Index (through EBSCO)
Google Scholar
The OECD Educational database
Social Science Research Network (SSRN)
The UCL Institute of Education Library
ProQuest Educational Journals

Other sources include the City and Guilds, Oxford University, Scottish Government, Welsh Assembly Government, OECD reports on the educational systems of member countries, government and awarding body web sites and the web sites of some major national media organisations.

Major bodies of work include:

The Assessment Reform Group (ARG), whose aim has been to ensure that assessment policy and practice at all levels takes account of relevant research evidence. This was a voluntary group of researchers brought together in 1989, funded by the Nuffield Foundation and dissolved in 2010. Its work is acknowledged by the Scottish government as being a major influence.

The project Assessment Systems for the Future, funded by the Nuffield Foundation. The project was set up by the Assessment Reform Group in September 2003 to consider evidence from research and practice about the summative assessment of school pupils, and to propose ways in which such assessment can benefit their education. There are a number of seminal papers in the work of that group, including three pieces of meta-research that have had a major effect on thinking in the profession and in policy development in many countries.

The Teaching and Learning Research Programme (TLRP), the largest ESRC programme, which includes research on all aspects of education from pre-school to adult and work-based learning (WBL).

The Scottish Qualifications Authority (SQA), which has commissioned a number of working papers containing a useful summary of its policy development in the last 20 years and the source of some of its thinking. The literature review it commissioned in 2006 from Rob Van Krienken is especially useful as a summary.

The OECD Education Working Papers and OECD Indicators, for comparative international information.

QCA papers, in particular Stanley et al 2009, a wide ranging literature and international survey.

Cambridge Assessment Research Reports, in particular Gill and Benton (2013).

The NFER, in particular ‘Developing a National Monitoring System’, 2009.

Methodology

i. Submission against remit, selection
ii. Internal discussion of remit and identification of scope and conceptual issues
iii. Initial meeting of ColegauCymru with DfES and Qualifications Wales
iv. Initial literature scan by key words and phrases: teacher-led assessment; teacher led assessment; teacher assessment; teachers’ assessment; formative; summative; continuous; terminal; internal; external; NEA
v. Wide reading and selection of relevant articles and journals; reading and secondary ranking and selection
vi. Production of emergent literature summary and report narrative, internally shared
vii. Interim meeting to discuss progress and emerging issues for clarity, time scales, budgets etc.
viii. Identification of good practice internationally and in awarding bodies
ix. Production and review of initial draft
x. Final draft paper, reviewed and edited
xi. Final draft to QW


An overview of the literature

This section provides a summary of the literature of assessment over the last twenty years. That body of work is extensive and complex, and covers all ages, all phases and the systems of many different countries. The focus of much of the literature is on school systems, in particular the junior and lower secondary phase.

Assessment is a complex area: a fundamental part of high quality learning, of central relevance to all ages, phases and systems, and highly sensitive for the press and policy makers. Because of its complexity, the rather variable quality of the literature and political sensitivities, no obviously best way of doing things emerges, and certainly no silver bullets for policy. However, there are dominant directions and pointers in the literature, and a clear pattern of development in some high performing countries.

Assessment must first and foremost promote high quality learning for all. High quality learning is more important than assessment, so any assessment practice which interferes with it should be changed.

A simplistic, ideal-type analysis often poses two extremes of formative and summative assessment.

Formative Assessment: For learning only, teacher-led, continuous, informal, personal to the learner, low stakes. This is fundamentally important to the quality of learning but has been given little attention by policy makers until recently, and some commentators have complained that policy makers pay lip service to it rather than showing real commitment.

“Statements of policy (in England) ... have emphasised the importance of formative assessment ... however the available resources have been concentrated on the tests ... external testing of teachers and schools to promote competition through league tables had a higher priority”. (Black 2001)

Black goes on to say that it is no wonder that formative assessment is “seriously in need of development”.

Summative Assessment: For reporting and progression only, external, terminal, formal, public, high stakes. Not fundamental to learning itself, and very high stakes when used as the basis for teacher and institutional evaluation.


This view is too simplistic and conflates a number of dimensions, including the purpose, timing, locus, and use of the resulting data. That complexity is one of the sources of confusion in the literature and debate. Formative and summative assessment are seen as dichotomous in most of the earlier literature but in reality they are intimately linked; arguably they are the same thing being used for different purposes. Seeing them in that way helps to clarify how teacher-led assessment (TLA) can be an integral part of summative assessment, how the formative element can contribute to the summative, and vice versa.

There seems to be value in maintaining the distinction between formative and summative purposes of assessment while seeking synergy in relation to (their use). (Harlen 2005)

That in turn highlights the fact that high quality teacher contributions to summative assessment must be accompanied by high quality formative assessment skills. Those development projects which have focussed on improving TLA have shown that the skills required for both formative and summative assessment can be significantly improved by appropriate CPD. These skills include:

designing effective learning and assessment tasks which:

o validly represent the specifications;

o are realistic, engaging, practicable;

o encourage problem solving and innovation;

o elicit evidence about the achievement and the standards being assessed;

o indicate changes to be made to improve understanding and application;

organising group and individual work;

dialogue with learners about the task, and their engagement and development;

the use of learner involvement and peer to peer practice to deepen understanding and commitment;

assessing individual contributions to group work;

thorough understanding of the standards set;

reliable judgement of standards and achievement;

aggregation of the individual to the general;

contextualising judgements in a broader knowledge of learners and their circumstances.

(Atkin JM 2007, Black and Wiliam 2001, Harlen W 2005, McMahon and Jones, 2015)


Clearly the development of both formative and summative skills needs to go hand in hand. Approached that way, through a clear curriculum, delivery and assessment model, one can see the emergence of a powerful vision for effective and high quality teaching and learning (HQTL):

There is also a vision of teaching ... characterised partly as one in which teachers and students work together to gauge levels of attainment and also collaborate to bridge the gap between current levels of understanding and those that are expected. (Atkin JM 2007)

TLA in the context of formative assessment within teaching and learning is not seen as problematical; here the limited emphasis in the literature has been on improving its quality, although that in itself is not straightforward. In recent years attention has been drawn to the absolute importance of teacher feedback to learner achievement. In his extensive statistical meta-analysis of what works to improve learning, Hattie states with considerable authority:

the most powerful single moderator that enhances achievement is (high quality teacher) feedback (Hattie, J 2009).

Traditionally, in most countries TLA in summative assessment has been seen as too unreliable to be allowed to dominate the assessment system. That is less the case in the lower years of schooling, and in the many general and vocationally specific qualifications where it is much more common.

International trends are difficult to track and articulate, and even in the UK there has been no long trend in one particular direction. Different governments in England, with different political values, have had what appear to be fundamentally different views of the purpose of assessment, and policy has changed accordingly, especially since 1991. Having said that, in most countries there has been a shift in the last thirty years towards using more TLA in summative contexts and it is now used extensively in many countries.

Whatever the long-term trend, summative assessment in most countries has become a mixture of TLA and external testing in varying degrees. Even in Finland and Queensland, where there is almost complete dependence on TLA, there are separate university entrance tests for some subjects (Finland) and separate school leaving examinations used to moderate TLA (Queensland). The debate is therefore centred on the most appropriate mix of TLA and external testing, the uses to which that mix is put, and whether or not the advantages of both can be gained whilst mitigating their shortcomings.


Any assessment system needs curriculum content to be agreed and clearly articulated, partly because that improves the reliability of both formative and summative assessment, and also because the national consensus on priorities needs to be explicit and understood.

There is no doubt that in recent years, in the context of globalisation and economic competitiveness, comparative international tests such as PISA and TIMSS have created a lot of interest among governments, especially in those countries performing less well. There is now clear recognition that achievement must be for everyone. There is more emphasis on learning for the future, debate about the relative importance of skills and content, and about the nature of the high quality learning needed for a high-skills economy.

The debate on the outcomes needed, and the move to specifying content in terms of outcomes with clear standards in criterion based systems, has helped to bring clarity and to guide assessment systems. Even so, if the desired outcomes are clear but the assessment regime itself is not optimal then learning will also be less than desired.

The optimal mix

All assessment needs to be:

Valid (it assesses those things it is supposed to assess, and all of them)

Reliable (accurate, consistent)

Supportive of high quality teaching and learning (HQTL)

Practicable (do-able and affordable, in the context of its value)

Most of the literature places more emphasis on validity and reliability than on the other two criteria above, and reliability has almost always been assumed to be the higher order priority. The mix of TLA and external tests used is determined to a large extent by the perceived and real strengths of each against these criteria. External tests have been seen to be less valid but more reliable than TLA. Thus the debate has sometimes been presented as a crude choice between validity and reliability. In the past, emphasis on reliability through external testing has dominated systems, possibly due to the higher profile of reporting of outcomes for accountability purposes mentioned above, and to relatively low cost. This debate is far more complex than a crude choice of one or the other; there are serious issues with both, including the following.

External tests

The view that external tests are more reliable and therefore preferable to TLA has been challenged in the literature on a number of grounds, including:


o common examples where tests have been found to be marked inaccurately;

o evidence of low reliability of TLA being taken from systems where little support and development have been provided;

o recent growth in challenges to marking accuracy through formal appeals of test results;

o a number of high profile issues involving apparent political influence on the use of tests;

o the view, supported by evidence, that when support is provided TLA can be improved and the issues around its use can be significantly reduced;

o some attempts to make tests more reliable, such as tightly controlled scope, multiple-choice, the highly detailed specification of standards, tick-box assessment and so on, can be shown to reduce both their validity and the quality of learning;

o the view that the lack of validity in tests is more significant than often thought, with tests sometimes amounting to a very narrow sample of the desired outcomes.

Testing has also been shown to have serious detrimental effects on learning and the learning experience in some circumstances, especially with the high stakes use of data.

TLA

TLA is potentially more supportive of high quality teaching and learning (HQTL) and done well can have dramatic effects on those, but the use to which assessment data is put can have a strong effect on its use in the classroom and in terminal tests, especially in a high stakes context.

Effective QA processes such as external moderation and the socialisation of teachers into the standards set can address issues around reliability, but they are expensive and slow to take effect. There are also issues around cost, teacher time and potential trade union resistance.

TLA is often seen to be most open to distorting influences. The literature does not bear that out. Both TLA and external examinations are subject to distortion, especially in high stakes contexts. High stakes distort all kinds of assessment, together with the learning process and outcomes that accompany it.

The following observation, from just one wide-ranging review of the literature, highlights the evidence and the choices to be made:


The findings of (this) review by no means constitute a ringing endorsement of teachers’ assessment; there was evidence of low reliability and bias in teachers’ judgements made in certain circumstances. However, this has to be considered against the low validity and lower than generally assumed reliability of external tests*. (Harlen, W, 2005)

*Together with, it should be pointed out, other negative and unplanned consequences from the use of examinations in high stakes contexts outlined below.

Rather than trying to determine which of the two should be dominant, a strong strand in the literature suggests that it is more useful to ask whether or not the two can be combined to get the best of both worlds. Black develops that theme through the assertion that if we can achieve validity in summative testing through the development and use of effective TLA we can approach that ideal:

Validity in summative assessments requires two revolutions. One is ... developing the skills and procedures used in the year on year summative assessments by teachers and their schools ... the other is to persuade (policy makers) that their reliance on short formal tests is based on ignorance and is deeply harmful to students. (Black 2013)

The required revolutions can be achieved and have been achieved in some state systems. (Black 2013)

The countries selected as examples of good and emerging good practice in this report all rely heavily on TLA, albeit in different proportions. Their good practice and high performance cannot, however, be attributed to that alone. Ensuring high quality teaching, learning and assessment requires consistent, long-term policy with a commitment to constant development, supported by substantial investment in teacher skills, QA arrangements and raising public awareness.

Countries which have pursued those policies consistently over time include Finland, Australia, New Zealand, Canada, Sweden and Scotland, amongst others. The performance of their systems appears to show significant benefits from the approach outlined, although care needs to be exercised in assuming causality. Gill and Benton, in a study of the relationship between high PISA performance and assessment systems, conclude as follows:

“Overall this report highlights the difficulties involved in relating performance in international tests to specific aspects of educational systems. While it is tempting to examine the characteristics of education systems in high performing jurisdictions and hope that translating the systems from these countries will also lead to improved performance, such an approach ignores two crucial issues: whether candidates in one country have anything in common with candidates in our own country, and whether the highlighted aspects of the educational system are unique to high performing countries... It is only when we take account of all of the data and adjust for other influential factors that we can get a true picture of the influence of a system level variable; albeit an inconclusive one.” (Gill and Benton 2013)

Finland is particularly successful in economic terms for a small nation, and in terms of its PISA performance. To what extent that is a result of its educational policy, as opposed to the nature of the country itself, its historical homogeneity and its strong commitment to social justice and high levels of equality, remains a moot point; it is likely to be a mixture of all of those factors, and a two-way process.

Australia has a diverse approach but all its systems share these characteristics, and exhibit a balance of controls that minimise the effects of high stakes.

Scotland has also developed over the last twenty years a coherent approach to HQTL and assessment, exemplified by the more recent Curriculum for Excellence, which includes the extensive use of TLA with positive results.

What the literature suggests is not so much copying the systems of successful countries as identifying the common themes and direction of travel and seeking to learn from them. The main themes and direction of travel that emerge from both academic work and case studies of high-performing countries include the following.

Firstly, put HQTL at the forefront and view assessment simply as a means to that end. Given that, there is an argument to do the following:

Articulate and sustain a long-term and clear policy direction alongside clarity of purpose and priorities.

Specify the purpose of education and national priorities clearly and express agreed content as learning objectives with clear specifications and standards. That will help both formative and summative assessment and articulate national priorities.

Develop in policy makers a good awareness of the multiple purposes of assessment (supporting learning, reporting results, and accountability for learners, teachers, institutions, local authorities and others), and of the need to combine and use those in ways which are fit for purpose, mutually reinforcing and with minimal unexpected consequences.

Maximise TLA in summative assessment whilst not relying on it entirely. At the same time address its potential weaknesses through long term CPD to strengthen teacher skills, quality assurance and collaborative working.


Balance professional autonomy with effective standardisation and moderation.

Minimise the high stakes component in the way that assessment data are used for evaluation (whilst accepting the need for reporting and accountability), through the use of alternatives or supplementary information such as: standardisation tests; national sampling; self assessment; value added data; learner perceptions.

Use secondary data such as value added, national benchmarking, local benchmarking, and clustering institutions to support self-assessment and external evaluation, and inform development.

Develop a high trust and supportive culture (a ‘no blame culture’) whilst still ensuring effective accountability. This is another area that requires long term CPD.

Invest enough and commit to long-term development in pursuit of those ends, perhaps in close partnership with other nations with similar values, such as Finland and Scotland. Do not expect rapid change or early success.

Include teachers in development and trust teachers’ professionalism.

Include learners in their own assessment.

Invest in low cost and flexible solutions.


Lines of Inquiry: summaries of findings

The perceived position and role of teacher-led assessment within formal qualifications in the UK, including examples from other parts of the world that effectively utilise teacher-led assessment, with particular reference to Scotland. Changes in perception over time.

Much of this line of inquiry is covered in the section on good practice in other countries, including Scotland, Finland, New South Wales, Queensland and South Korea.

Changes in perception over time are difficult to identify and codify as there are few discernible patterns either internationally or within individual countries. The strongest pattern is to be seen in those countries with good practice listed above; they have moved to integrated systems of assessment and evaluation over time, through iterative, evidence-based development.

In the post-war period, certainly in Western Europe, it was considered inappropriate for governments to be involved in education policy to any significant extent; memories of pre-war authoritarian states and their misuse of education policy were fresh, and education policy and practice were delegated to local authorities and institutions. That began to change in the 1960s in the pursuit of higher standards, greater social justice and national consistency.

In the UK the development of the National Curriculum in the 1980s introduced a period of centrally determined and highly specified content for schools, with national key stage tests in England, Wales and Northern Ireland. GCEs, later GCSEs and A Levels, were the dominant school examinations and were at first largely externally and terminally tested and norm referenced, with little teacher input. Scotland has always had a greater degree of autonomy than Wales or Northern Ireland, not least in education policy.
The growing involvement of central governments has highlighted the need for good governance which separates setting a strategic vision and triggering and steering education reform, from regulation and implementation. The elements of good governance and regulation are commented on from page 52 below.


BENEFITS TO BE DERIVED FROM THE USE OF TEACHER LED ASSESSMENT

The literature argues consistently that the only practical vision for deep and meaningful learning (referred to as HQTL in this report) is one where formative and summative assessment are intimately linked, where TLA dominates but is not exclusive. There is clear recognition that this will be difficult and slow to achieve but also clear evidence that it can be achieved in the right circumstances.

Dependability is used by some in the literature to combine both validity and reliability. That use emphasises that those two properties of assessment are closely linked in practice; neither is perfect, and there are clear trade-offs in policy. For example, if a test is inexpensive and reliable because it constrains and controls what is tested, it may not only be less valid as a test of the desired learning but may also damage that learning in the process, and will thus have low dependability. Policy should be aiming to achieve the best of both.

In terms of Validity

TLA is seen as more valid than external testing as teachers can assess a wider range of (often fundamentally important) achievement and learning outcomes than formal tests and examinations, including thinking skills, practical abilities, creativity and personal values.

TLA is better at developing and using more authentic (and therefore useful and motivating) assessments due to the time available and the greater flexibility in approach.

Teachers have access to the full range of information from each learner over the whole of their development and can contextualise their judgements within that knowledge.

Teachers’ assessment can provide information about learning processes as well as outcomes.

Freedom from test anxiety and from practice in test-taking can mean that assessment by teachers gives a more valid indication of pupils’ achievement.

Reliability

With appropriate training and moderation teachers’ assessment can reach very satisfactory levels of measured reliability, and standardisation can control it further.

There are many examples in the UK, in other countries and in further and higher education of teachers making crucial summative assessments of pupils’ performance reliably.


Impact

The process of improving TLA can bring about significant improvements in teaching skills and in students’ learning.

When teachers are gathering evidence from pupils’ on-going work, information can be used formatively, to help learning, as well as for summative purposes.

Teachers have greater freedom to pursue learning goals in ways that suit their pupils and individualise their approaches.

Freedom from testing can reduce the effects of teaching to the test, narrowing the curriculum and damaging the morale and self-esteem of learners with lower confidence and academic ability.

There is significantly greater potential for positive impacts on teaching and learning.

The necessary accompanying standardisation and moderation procedures provide valuable professional development for teachers.

Pupils can and should share in the process through self-assessment and derive a sense of progress towards “learning goals” as distinct from “performance goals”, which can damage their perception of HQTL.

Practicability

Although TLA is relatively expensive compared to external testing, if it is seen as a central part of improving HQTL it can be argued to be worthwhile.

TLA has been shown to improve the quality of teaching and learning; countries with a large element of TLA tend to perform well in international tests.

It is within the control and development reach of institutions and support agencies and some costs can be seen as internal to them.

Financial resources can be released at the school level by reducing the number of commercial tests and products purchased.


RISKS AND CHALLENGES ASSOCIATED WITH TLA

In terms of Validity

The validity of teachers’ assessment depends on the learning activities and opportunities that an institution provides, so it is important for awarding or regulatory bodies to provide support for learning through high-quality resources.

Bureaucratic moderation procedures for quality assurance could constrain the operation of teachers’ summative assessment so that only “safe” and routine approaches are used.

Reliability

Teachers’ assessment is often perceived as being, and indeed can be, unreliable in terms of consistency between teachers, over time and when compared with some tests, but this risk can be minimised through CPD and standardisation and moderation.

TLA can be subject to bias in terms of gender, race and behaviour although this does not emerge as a major problem in the literature and can be minimised by CPD or by control devices such as blind marking (of external tests, and in CPD exercises to identify issues).

More importantly, TLA tends to be dominated by any external tests in use, and it is certainly subject to distortion in high-stakes applications.

Some external tests or tasks may still be needed to supplement teachers’ judgments, moderate results, control progression, or convince the public of the system’s robustness.

Impact

Public confidence in the system may be low due to teachers’ assessment being perceived as inferior to external tests, particularly for children aged 11 and over.

However, concern appears to be largely located in the press and some policy makers; there is little evidence of serious concern from employers or parents at those times, in those systems and countries that use TLA extensively.

Teachers can spend more time teaching rather than preparing for and marking tests, and learning time can be increased significantly (some estimates say by at least two weeks per year) by using classroom work rather than tests to assess progress.


Practicability

Responsibility for summative assessment can increase the workload for schools and teachers.

Teachers can find that the process of moderating their judgments is time-consuming.

Long term training in the interpretation and use of assessment criteria, making judgements and including learners is needed.

There are examples of the successful combination of internally and externally based assessments for awarding certificates at school leaving, including in Australia and Scotland, in most NVQs and many others.


Reference to good practice and issues/recommendations from the academic literature, Scotland, Finland, Australia and others

1. Scotland as good practice

Scotland has long had a considerable degree of autonomy within the UK in education policy and other areas of government and the law. It is not because of its international performance that Scotland has been identified as having good practice in this report. Although Scotland could claim to be the strongest UK nation as measured by the 2012 PISA tests, the difference is not stark. It scored 4 points above England in Maths and 6 above in Reading, but 3 below in Science. Many within Scottish education play down the significance of the PISA test: it does not directly cover learners over 15 and it tests very specific skills, paying no attention to the concept of deeper learning and understanding. The belief is that deeper learning - truly understanding a subject rather than just giving learners facts or preparing for tests - equips them better for adult life: the world of work or later study. It is not possible to separate the development of the curriculum and HQTL from the development of the assessment and evaluation system; they have gone hand in hand. The OECD have recently stated that:

(Scotland) now has the opportunity to lead the world in developing an integrated assessment and evaluation framework. OECD 2015, p.13.

By that they mean an integrated system which emphasises and develops assessment principally as a crucial part of HQTL, but also as a contribution to the evaluation, and therefore accountability, of teachers, leaders and managers, institutions, local authorities and the system itself. They refer to all of the bodies with oversight duties and, to a large extent, the state Inspectorate for teaching institutions and local authorities. As the OECD elsewhere identifies:

“Those involved need to understand how one use may impact on other uses in all parts of the framework. Higher-stakes assessment does not have to be at odds with formative assessment, but international research shows that ‘there is a risk that pressures for summative scores may undermine effective formative assessment practices in the classroom … Such tensions between formative and summative assessment need to be recognised and addressed.’” (OECD, 2013: 215)

Their recommendation in this section is:


“Develop an integrating framework for assessment and evaluation that encompasses all system levels”

It is important to have a coherent and carefully designed framework in order to maximise the quality of the information, to ensure that particular evidence sources are fit for the intended purpose, and to minimise unintended consequences such as reducing rather than promoting teachers’ assessment capacities. ... This means integrating processes and systems for learner assessment, teacher and leader appraisal, school evaluation together with local authority and national activities and policies. These should be driven by norms of collective responsibility and mutual accountability through processes of genuine inquiry. (OECD 2015, p. 123)

Recent changes

Scotland has adopted a long term development strategy over the last 20 years, with “Curriculum for Excellence” (CfE) introduced in 2010. Prior to that, other development work was paving the way; the Assessment is for Learning (AiFL) project sought to develop a coherent assessment system, with attention to assessment for both formative and summative purposes. That was described in 2003 as a quiet revolution in Scottish education by the then Education Minister. All this has resulted in very significant change over that time, with a growing element of TLA. Education Scotland describes the aim of Curriculum for Excellence as:

to achieve a transformation in education in Scotland by providing a coherent, more flexible and enriched curriculum from 3 to 18. The curriculum includes the totality of experiences which are planned for children and young people through their education, wherever they are being educated. The purpose of the curriculum is encapsulated in the four capacities - to enable each child or young person to be a successful learner, a confident individual, a responsible citizen and an effective contributor.

The extent to which developments in Scotland follow the strong consensus in the literature (to which the SQA are significant contributors) is marked; if the literature is pointing in the right direction at all, those developments should make a difference for the good. The clear, emerging international consensus in the literature and the current system in Scotland share the same major characteristics:


1) high quality learning comes first, with an emphasis on breadth, deep learning and skills;

2) long term development with a consistent policy strategy, well informed by research;

3) an emphasis on assessment for learning more than assessment of learning;

4) respect for the professional autonomy of teachers and investment to build it;

5) investment in the formative skills of teachers as an integral part of HQTL;

6) open and developing systems of moderation and QA through partnership between government, institutions, local authorities and the inspectorate;

7) standardisation and development through collaboration within and across institutions;

8) the provision of national resources in teaching and assessment materials;

9) assessment models fit for purpose across different phases;

10) an emphasis on QA, self assessment and survey reporting for accountability and development;

11) a flexible approach to age and level; for example the freedom to skip Nationals and go direct to Highers in some subjects; and,

12) minimised use of assessment data for teacher and institutional performance through the use of sampled testing, moderation, and a survey approach to national monitoring, which lowers the stakes and limits teaching to the test.

Finally, although this is not addressed in the literature, it is clear that developments in Scotland have been accompanied by a consistent and well integrated narrative to explain policy to the press and the public and focus on central rather than peripheral issues.

The Scottish Government’s strategic vision for assessment and the details of its approach are included in Appendix 1.

The Senior Phase in Scotland

The Senior Phase in Scotland is the equivalent of post-14 provision in Wales and covers years S4 (National 4 and 5 qualifications), S5 and S6 (Higher and Advanced Higher qualifications), although the correspondence between years and qualifications is currently being relaxed. The examination board responsible for Scottish qualifications is the Scottish Qualifications Authority (SQA). National 4 and 5 qualifications are the equivalent of GCSEs at Levels 1 and 2. National 4s are 100% teacher-led assessment.


National 5s contain a formal external examination. They are designed to move away from recall towards more understanding and application. Typically fewer are taken than the old Standard Grades; a reduction from 8 to 6 is usual, and that has been controversial to some extent. The time saved is freed up for activities such as work experience and community work. N5s can be skipped if learners and teachers feel that moving directly to Highers is more appropriate.

Highers and Advanced Highers are the Scottish equivalents of A-Levels. Highers are considered the same as AS levels; Advanced Highers are equivalent to, or some believe slightly harder than, A2 Levels. In the last two years new Highers have been introduced to:

update content and reflect the focus on knowledge and skills. The aim is to provide a smooth learner journey from 3 to 18 and, as the curriculum and learning and teaching approaches have changed, so the qualifications needed to change too. "The new Highers have been developed to provide smooth progression from the new National 5 qualifications and will provide good progression into the new Advanced Highers. The new qualifications use a broader range of approaches to assessment to better capture candidates’ ability to apply their knowledge and skills. For example, the new science Highers now include an assignment as well as an examination where candidates research a topic of interest to them and write up a report to outline their findings. The examination and the assignment are both marked by SQA and contribute to the final grade the candidate receives. The skills being assessed in the assignment are vital for progression to HE and employment within the STEM area. We all want to give Scotland's young people the best possible chance when leaving school or college - whether they are planning on progressing to college, university, training or employment. (Angela Constance, Scottish Education Secretary, ITV Scotland Web site, Aug 2015)

The requirements for entry into most Scottish universities are Highers; however, for entry into English universities, applicants are expected to present Advanced Highers. Oxford University has, for example, published its policy that successful candidates will be expected to get 2 A’s and a B at Advanced Higher. Advanced Highers usually require learners to have passed the Higher in that subject at grade A or B (exceptionally a C) before they can sit the Advanced Higher. Learners can take up a subject at Higher level even though they have not done Standard Grade/Intermediate at that level, dependent upon the school and the teacher of the given subject. Requirements vary between schools.

Assessment

In order to qualify to sit the final exam, a pass is required at the National Assessment Bank (NAB) end-of-unit tests, of which there are usually 3 in each subject. Some subjects also have a non-examination assessment (NEA) percentage mark. Grading is based on standard boundaries that vary depending on the subject. They are subject to change each year depending on the results obtained by pupils that year and the difficulty of the paper, but rarely change by more than 5% up or down. The current boundaries are:

A - 70%
B - 60%
C - 50%
D - 45%
N/A - <45%
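As a minimal sketch of how the boundaries listed above turn a percentage mark into a grade, the Python below uses the figures from this report only. The names `GRADE_BOUNDARIES`, `grade_for` and `shift_boundaries` are hypothetical, and the shift helper is simply an illustration of the "up or down" yearly adjustment mentioned in the text, not a description of the SQA's actual procedure.

```python
# Minimum percentage needed for each grade, taken from the list above.
# Anything below the lowest boundary is reported as "N/A".
GRADE_BOUNDARIES = [("A", 70), ("B", 60), ("C", 50), ("D", 45)]


def grade_for(percent, boundaries=GRADE_BOUNDARIES):
    """Return the highest grade whose minimum mark is met by `percent`."""
    for grade, minimum in boundaries:
        if percent >= minimum:
            return grade
    return "N/A"


def shift_boundaries(boundaries, delta):
    """Return a copy of the boundaries moved up (+) or down (-) by `delta` points.

    Hypothetical helper: a uniform shift is an assumption made for illustration.
    """
    return [(grade, minimum + delta) for grade, minimum in boundaries]


if __name__ == "__main__":
    # A mark of 68% is a B under the listed boundaries, but would become an A
    # if every boundary were lowered by 3 points in a harder year.
    print(grade_for(68))
    print(grade_for(68, shift_boundaries(GRADE_BOUNDARIES, -3)))
```

The second call illustrates the situation reported later in this section, where a paper proved harder than intended and a lower absolute mark earned a higher grade once boundaries were reduced.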

Advanced Highers

Most subjects in Advanced Higher require a significant proportion of TLA/NEA such as portfolios, internal tests or project work. For example:

Advanced Higher English: 70/30
40% written dissertation (personal study/comparison on literature)
30% either Creative Writing folio OR a textual analysis essay (externally assessed)
30% a critical essay written during the final exam

Advanced Higher Mathematics/Applied Mathematics: 100%
100% external examination

Advanced Higher Modern Languages: 60/40
15% based on Folio work
25% speaking examination conducted by visiting examiner
60% examination

Advanced Higher Physics, Biology & Chemistry: 20/80
20% based on a project
80% external examination
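The weightings above can be read as a simple weighted average of component marks. The sketch below assumes exactly that, using the Advanced Higher Modern Languages weightings from this report; the report does not say how the SQA actually aggregates components, so the function `overall_percent` and the example scores are hypothetical.

```python
# Component weightings (percent of the final mark) for Advanced Higher
# Modern Languages, as listed above.
MODERN_LANGUAGES_WEIGHTS = {"folio": 15, "speaking": 25, "examination": 60}


def overall_percent(component_scores, weights):
    """Combine component percentages into one mark by weighted average.

    Assumption for illustration: each component is scored out of 100 and
    the weights are percentages that must sum to 100.
    """
    if sum(weights.values()) != 100:
        raise ValueError("weights must sum to 100")
    if set(component_scores) != set(weights):
        raise ValueError("scores and weights must cover the same components")
    return sum(component_scores[name] * weight
               for name, weight in weights.items()) / 100


if __name__ == "__main__":
    # Invented scores for one candidate, purely to show the arithmetic.
    scores = {"folio": 80, "speaking": 60, "examination": 70}
    print(overall_percent(scores, MODERN_LANGUAGES_WEIGHTS))
```

With these invented scores the folio contributes 12 points, the speaking examination 15 and the written examination 42, which shows how heavily the external examination dominates the final mark even in a subject with folio work.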


An important feature of good practice in Scotland is its survey approach to national monitoring which reduces the high stakes element of assessment and is therefore more fit for purpose.

This offers the potential to provide information on pupil achievement on what matters across the curriculum; and, because individual teachers and schools are not identified, teachers are not threatened by the survey and therefore are not tempted to narrow the curriculum or teach to the test. The decision in Scotland to stop collecting national assessment results for all pupils in all schools was taken because policy-makers recognised that teachers were teaching to the tests and that, although it appeared that results were improving, it was more likely to be that teachers were getting better at rehearsing children for the tests. (Mansell J, 2009)

Public understanding and acceptance

There is little doubt that the radical changes in education in Scotland in recent years have enjoyed broad support across the political spectrum, and that is a sign of good practice in itself. What follows is taken from significant national media websites in August 2015, after the first results of the new Highers were released. This is not serious evaluation, of course, but it does show that the same recurrent concerns are expressed in Scotland as elsewhere in the UK, while there has been no serious outcry, no diversion from policy and no comment at all to be found on TLA. To the extent that comment in the national press is a guide, it shows that there are predictable concerns but broad understanding and acceptance of recent changes.

Peter Macmahon, ITV Political Editor (Aug 2015) wrote:

According to the Scottish Qualifications Authority (SQA) the new exam involves "assessment of different skills and knowledge, for example, deeper learning and higher order thinking skills". Make of that what you will. But for the pupils who sat the new maths exam the differences were not theoretical but real. Many complained it was much harder than the old Higher maths. Following complaints from pupils and teachers the SQA admitted "the assessment proved more demanding than intended and therefore the grade boundaries were reduced".

In other words, for a lower absolute mark compared to the old exam, pupils will in some cases get a higher grade. As the SQA puts it: “This ensured that candidates received the grades they deserved”.


All this is perfectly normal, the SQA and ministers say. But today's results still show a difference in pass rates for the old and new maths Higher. 10,854 entries for existing maths Higher - 73.1% got A to C grades and 10,220 entries for new maths Higher - 70.8% got A to C grades.

There has been no outrage over this, but it would be fair to say eyebrows have been raised. Education secretary Angela Constance, meanwhile, has promised that the introduction of the new exams will continue to be monitored and any lessons which need to be learned will be. There are always teething troubles in any new system, ministers say, but overall CfE is working well for those who matter most - Scotland's young people.

Alongside concern about Maths pass rates falling was wider concern about ‘falling standards’ as pass rates rose overall. For example Josh Halliday (Guardian August 2015) reported as follows:

“A record number of Scottish teenagers have passed their Higher exams, prompting calls for SNP ministers to investigate whether the qualifications have been made easier. Overall, there were 156,000 passes – a 5.5% increase on 2014 – after the new-look exams were introduced this year as part of an overhaul of Scottish education called Curriculum for Excellence. Liz Smith, the Scottish Conservative young people spokeswoman, said the biggest challenge in education was the “significant attainment gap between pupils from poorer and wealthier backgrounds”. She added: “The Scottish government has made great play in recent months about exam marking becoming ever more rigorous, yet, in English, at a time when there are concerns about literacy skills amongst school leavers, we learn that the English Higher pass rate has increased hugely.”

Simon Johnson, Scottish Political Editor, Daily Telegraph Aug 2015 reported:

SNP ministers are under pressure to investigate whether Scotland’s Higher exam has been made easier after record-breaking results showed a greater pass rate among pupils who sat a new version of the qualification. Children in some schools sat more than 100,000 of the revamped Highers, which have been introduced this year, with 79.1 per cent of the exams being graded at levels A to C. But pupils in other schools sat 92,555 of the old qualifications because their teachers were not ready to introduce the new version. They recorded an overall pass rate of 76.7 per cent, down slightly on last year’s figure of 77.1 per cent. The total passes for Higher English spiralled by an astonishing 17.7 per cent compared to last year, when all pupils sat the old version, while the figure for modern languages increased by 15.2 per cent. The Scottish Government said that the results for the new and old exams were not directly comparable as different “cohorts” of pupils sat each. But the record results – which were expected to trigger a scramble for university places – were achieved despite research showing literacy standards have fallen at all years measured in Scotland’s primary and secondary schools. A Scottish Government spokesman said: “As Scotland’s chief examining officer has made clear, comparing pass rates between the new and the existing Highers is extremely complex. Each Higher will have been studied by a different group of learners and candidates will have come to Higher study through a variety of routes.” “All of SQA’s processes, involving thousands of teachers in the setting and marking of exams, are designed to ensure that standards are consistent from year to year.” The Educational Institute of Scotland, the country’s largest teaching union, praised its members for delivering the results in the face of an “excessive” workload and “perceived over-assessment”.

The SQA website contains video links which show a strong common message from Education Scotland and SQA on TLA, HQTL, strong QA, close collaboration and institutional development.

Independent Evaluation of recent changes in Scotland

In 2015 the Scottish Government commissioned the OECD to review its on-going development of education policy, practice and leadership. In the section summarising Assessment and Evaluation, the report makes the following points.

1. With the introduction of CfE, assessment became a part of the curriculum, essentially part of, and not separate from, learning and teaching, in sharp contrast to the previous system. The formative assessment emphasis, with its range of methods to collect information, was designed to support learning.

2. The developmental emphasis of teacher appraisal and school self-evaluation come together into a coherent whole.

3. Light sampling of learners in literacy and numeracy for national monitoring purposes kept the focus on assessing progress for improvement purposes rather than for strong accountability.

4. The support for learning, and trust towards teachers are very positive aspects of the current system.

However, they do identify a number of key concerns, principally a lack of a clear view and practice surrounding the role of assessment within the CfE framework:

1. The effective use of assessment information to support children’s progress in their learning, and development of the curriculum. “Our interviews during the school visits suggested a wide range of assessment practices, with some schools and teachers having difficulty in prioritising assessment tasks”.

2. This lack of clarity has created the risk that CfE “comes down to the examinations”. This in turn too often leads to excessive paperwork, a tick-­box approach, and the blurring of the close connection between assessment and improvement.

3. This has been acknowledged in the “Tackling Bureaucracy” drive but suggests that it has been far from resolved.

4. Recently, the Scottish Government has outlined a National Improvement Framework, which was still at proposal stage at the time of the OECD review, with “the potential to provide a robust evidence base in ways that enhance rather than detract from the breadth and depth of the CfE”.

Given Scotland’s previous bold moves in constructing its assessment frameworks on the best available research evidence at the time, it now has the opportunity to lead the world in developing an integrated assessment and evaluation framework. We believe that it will be fundamental to maintain the dual focus - both on the formative function, while improving evidence on learner outcomes and progression. (OECD 2015 p.13, emphasis added).

2. Finland as good practice

Finland is held up by many commentators as a beacon of excellence in education. In the PISA results for 2012, the three-yearly assessments of the knowledge and skills of 15-year-olds worldwide, Finland was ahead of every other European country in literacy and science and third behind the Netherlands and Estonia in maths. The OECD identifies Finland as a high performer internationally. The country has developed its system in the last 30 years as a partnership between the centre and the regions, from a centralised system with external tests to a local system where teachers develop their own curriculum within clear national standards, and TLA is used and trusted almost universally.

Children start school in the year they turn seven, although many attend voluntary pre-school. They are in school for ten years, starting with basic education and moving on to upper secondary at 16-17. General upper secondary education is for students planning to go to university, while vocational education is more geared towards training young people in the skills needed by Finnish employers but also qualifies them for university entry. There is no perceived difference in status between the academic and the vocational routes and it is possible to transfer readily from one to the other. That freedom is underpinned by a national entitlement to free education and training up to degree level, with no time limits (ColegauCymru/CollegesWales Study Visit 2015).

Various reasons are given for Finland’s record of achievement in education, including:

long term development and high investment;

clear statement of national standards coupled with high independence of teachers;

the high status of teaching qualified at Masters degree level - ten per cent of those who apply to teacher education programmes get in;

a homogeneous society;

high levels of faith in the state system - there are no private schools;

a willingness to fund and support learners as a national entitlement;

the lack of status difference between academic and vocational;

the close involvement of employers in assessing the performance of learners in vocational programmes, and in updating educational practice in their fields.

The purpose of assessment in Finnish schools is only to improve learning. In its guidance, the Finnish National Board of Education (FNBE) insists that assessment is supportive, used only to improve learning; it is about guiding and encouraging. Formative assessment over summative is emphasised at every stage and teachers have the freedom, within the national/local curriculum frameworks, to decide whether a student is learning and progressing well. The formative assessment becomes summative (albeit still teacher-led), designed to show how well a pupil has met the learning targets within nationally-set criteria.


Self-assessment, and developing children’s skills so that they understand how to learn, is central to the assessment process and to education generally: “it is the task of assessment to help the pupil form a realistic image of his/her development” (FNEB). Learners receive verbal feedback, with teachers guiding pupils to understand how, as well as what, they are learning. A final verbal assessment explains how the teacher believes the student has met the objectives in each subject: progress made, work skills developed and behaviour. Each year, they may also be awarded a grade between 1 and 8 and the expectation is that grades will be used in all core subjects by the eighth school year (Year 10 equivalent), if not before. When the student leaves basic education at 16, they receive a final overall assessment. Teachers record their judgements in an end-of-year report. Pupils and parents are told in advance what the criteria are for assessment and, if they choose, can receive an explanation of how the criteria have been applied.

There are no awarding bodies as the state, with local authorities, determines standards and teachers assess. There are no examinations until the end of upper secondary education, when students take a national matriculation test at 18. Matriculation is made up of a minimum of four tests: the ‘mother tongue’ test plus three others chosen from a list including a foreign language (often Swedish, the second language of Finland), maths, science and humanities subjects. The tests are initially marked by local teachers but the marks are reviewed by members of the national Matriculation Examination Board, who have the final say. In spring 2014, around 30,000 students took the exam. The universities operate their own entrance tests for some subjects.
There is no high stakes reporting of results and there are no league tables in Finland, but teachers evaluate their own judgments of how their pupils have done and they have access to national performance data which they use to refine and improve their practice. Schools’ results are not ranked by the government for publication. Schools and colleges use self-assessment. According to Pasi Sahlberg, a former teacher and policy adviser to the Finnish government, teachers make good use of the time liberated from tests:

“If you talk to teachers in Finland, they will say that the absence of standardised tests means they can really focus on helping all of the students to learn,” he says. “They don’t need to worry about, for example, preparing children for SATs or national assessments, so they have more time to really focus on improving children’s learning and understanding.”

He adds that no tests means no league tables for the media to “pounce on” and no “toxic” competition, which allows collaboration and cooperation to flourish:


The aim of policymakers like me when I used to work [with government] was to protect our system, schools and teachers from this unhealthy competition which often comes when the data has to be made public. (Sahlberg P, 2015)

3. Australia as good practice

According to the 2012 PISA results, Australia ranked equal 10th in reading, equal 8th in science and equal 17th in mathematics. On each indicator, Australia performs well above the OECD average and is considered to be a “high-quality, high-equity” education system. The average performance of students in the Australian Capital Territory, New South Wales, Queensland and Western Australia was at a higher level than the OECD average in each of mathematical, scientific and reading literacy. Even so, the general trend in PISA results in Australia is down, and there are significant differences in practice and performance across the country, including substantial inequalities for students from deprived backgrounds, indigenous students and those living in remote parts of Australia.

The literature is generally very positive about the Australian model. The following is a conclusion drawn from a wide review of the literature done for the SQA:

Australia’s assessment programme necessarily incorporates criteria-referenced forms of assessment and teacher professionalism in judging student achievement. These norms in turn require explicit, transparent and public performance goals and criteria. Finally, the system is supported by a detailed and comprehensive system of external moderation. Student samples are judged against stated criteria by external moderators to assure reliability. Thus, the Australian programme appears to have solved some of the noted issues involved in using teacher-based assessment for high-stakes decisions including problems with objectivity, reliability and transparency. (Van den Bergh, Mortelmans, Spooren, Van Petegem, Gijbels and Vanthournout, 2006).

Two of the six states have been chosen to represent the Australian system: New South Wales and Queensland. [The description which follows is taken largely from Stanley G et al, Review Of Teacher Assessment: Evidence Of What Works Best And Issues For Development, Oxford University Centre for Educational Assessment, QCA (2009).]

The New South Wales (NSW) system


NSW has the largest education system in Australia and has two major points of certification: at the end of Year 10 for the School Certificate (SC), and at the end of Year 12 for the Higher School Certificate (HSC). Each year over 80,000 students attempt the SC and roughly 65,000 attempt the HSC. The NSW Board of Studies is responsible for both the underlying curriculum and the conduct of these two credentials. Prior to 1998, the NSW system of assessment was norm referenced for both reporting points, with the marks being standardised to fixed percentages. In 1998, the SC moved to a standards referenced system and in 2001 the HSC followed suit. The two systems of assessment differ in how they operate, the SC being low stakes for students and the HSC very high stakes. The system can be characterised by a balance between testing and TLA, with the latter being carefully moderated through statistical comparison.

In NSW the state curriculum is organised into six ‘Stages’. In the junior years of school, student progress is reported on a five-grade scale from A to E and no external moderation takes place. To help teachers assign grades to their students for the purpose of reporting to parents, work samples are provided by the Board of Studies for each grade level. At this junior level, there are only the general descriptors, given meaning through the provision of work samples. For Stage 4 and under, the emphasis is on diagnostic testing to improve learning. For a number of years there has been state-wide testing which has played a mainly diagnostic role. External testing has been introduced in Literacy and Numeracy in Years 3, 5, 7 and 9 across all Australian state systems by the Federal Government. To date such testing has not been used as a strong school accountability measure and has therefore functioned as relatively low stakes. Nevertheless, as school accountability regimes become more performance oriented there are signs of this changing to higher stakes, at least for teachers and schools.

School Certificate

External measures

At the School Certificate, student results are a mixture of teacher assessment and external measures. Compulsory external tests are held in the subject areas of English Literacy, Mathematics, Science, History and Geography. The entire Year 10 cohort (over 80,000 students) sits these tests. The testing covers a range of item types, including multiple-choice, short answer and extended response. Apart from the multiple-choice items (which are computer-marked), all responses are written with pen and paper in answer booklets, which are marked externally in various marking centres. Standards-referenced achievement on these tests is reported in six bands, from Band 1 (the lowest) to Band 6 (the highest). An additional compulsory test, Computing Studies, is taken online with a choice of sessions over three days, the students using the computer labs; as its items are objective, they are computer-marked. This external testing results in three indices of achievement:

(i) a scaled mark (/100);
(ii) an achievement band for the test (from 1 to 6);
(iii) verbal descriptions of a typical student’s performance for each achievement band.

Internal measures

In addition to the external measures there are school assessments that cover each course more broadly than the external tests. Generally, the school produces a student rank order in each subject, based on a combination of classroom tests, assignments, presentations and so on. Once these school groups are identified, the school principal is informed of the results and asked to investigate within the school whether the grading pattern is justified. On many occasions they resubmit an amended pattern of grades. However, there is no compulsion to alter their grading pattern.

Higher School Certificate (HSC): external measures and moderation

The HSC is a high stakes system based on public examinations, where the results are ultimately used for tertiary entrance selection. The Board of Studies’ indices of achievement are reported on the HSC Record of Achievement. In addition, the Universities Admission Centre (UAC) is given access to these student results and conducts a further scaling which creates a single common scale across all subjects. From this, UAC produces a general achievement ranking, the Universities Admissions Index (UAI), which is used to select which students may be admitted into particular university faculties if demand exceeds the number of places available (which frequently occurs).

Schools also submit assessments for each student in each course. These assessments count 50% toward the final result in each course in the tertiary entrance scaling; they are high stakes measures. The Board of Studies issues assessment parameters that set down the components of the assessments and their weightings. It also constrains the number of assessment tasks that can be used, so that students are not placed under too much pressure. As a result of their internal assessment programme, schools produce an assessment mark out of 100, in which the rank order and gaps between the students’ marks are considered to be important information.

It has long been accepted, certainly from before the 20-year self-imposed limit for this literature review, that within a school the teachers can rank their students quite accurately (for example, Elley and Livingstone, 1972). However, it is not accepted that they can accurately place their school group in relation to other schools, and there is evidence that the schools that most inflated their assessments were the lower achieving schools. For such a high stakes measure, public confidence requires a moderation method where the moderated assessment distribution for a school subject group closely reflects the distribution of their examination marks. The Board of Studies determined some fundamental principles that have received general acceptance.

First, the rank order of the assessment is determined by the school.

Second, for a school subject group, the highest moderated assessment should reflect the highest examination mark.

Third, the mean of the moderated assessments should reflect the mean of the examination marks.
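These principles can be illustrated with a simple linear rescaling. What follows is a minimal sketch under stated assumptions, not the Board of Studies’ actual procedure: the function name `moderate_assessments`, the choice of a linear map and the sample marks are all hypothetical. A linear map with a positive slope keeps the school’s rank order, and its two free parameters are fixed by requiring the top moderated mark to equal the top examination mark and the two means to agree.

```python
from statistics import mean


def moderate_assessments(school_marks, exam_marks):
    """Rescale one school subject group's internal marks (illustrative only).

    Assumed linear form: moderated = exam_mean + slope * (mark - school_mean).
    The slope is chosen so that the highest school mark maps to the highest
    examination mark; the mean of the result then equals the exam mean.
    """
    if not school_marks or not exam_marks:
        raise ValueError("both mark lists must be non-empty")

    school_mean, school_top = mean(school_marks), max(school_marks)
    exam_mean, exam_top = mean(exam_marks), max(exam_marks)

    # If every school mark is identical there is no spread to rescale,
    # so each student simply receives the examination mean.
    if school_top == school_mean:
        return [exam_mean for _ in school_marks]

    slope = (exam_top - exam_mean) / (school_top - school_mean)
    return [exam_mean + slope * (mark - school_mean) for mark in school_marks]


# Hypothetical group of three students: generous internal marks,
# lower examination marks.
moderated = moderate_assessments([90, 80, 70], [80, 60, 55])
print(moderated)  # [80.0, 65.0, 50.0]
```

In this example the school’s ordering of its students is unchanged, the top moderated mark matches the top examination mark (80) and the moderated mean matches the examination mean (65). The sketch makes no attempt to handle atypical examination performances or very small groups.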

The Queensland system

The Queensland system contrasts with the more conservative approach taken by NSW. The system has its origins in the Radford Report of 1970, which recommended the abolition of all public examinations. Since then further changes have taken place, notably arising out of the White Paper published in 2002, Education and Training Reforms for the Future (ETRF). Assessment in Queensland is continuous, internal and criterion-referenced, and is carried out by teachers. It is heavily moderated and standardised. The strong element of TLA allows schools to align their assessment with the learning and teaching and so help to ensure that assessment:

meets the students’ needs and teachers’ requirements;

reflects the local context;

takes place when the students are ready;

supports the learning process.

The Queensland Curriculum, Assessment and Reporting (QCAR) Framework

The last four years of secondary education are Years 9-12. The school-based system has gradually evolved over forty years. The focus is on improving the effectiveness of classroom assessment in three areas:

the capacity of teachers to make informed judgments about student work against the standards embodied in the curriculum;

the capacity of teachers to obtain information about student learning to inform future teaching; and

the feedback given to students about their learning.

The framework comprises five components which support the teaching and learning processes:

1) A set of ‘Essential Learnings’ in each subject describing what the learner should know and be able to do at various points, with assessable elements. For example, in Year 9 Science there are four assessable elements: knowledge and understanding, investigating, communicating and reflecting.

2) A standards-based approach for the Essential Learnings which define levels of achievement and articulate the type of performance that is required to achieve grades from Grade A to E.

3) An Online Assessment Bank which provides quality assessment tasks, materials and resources across all subject areas: “A Guide to making Judgments, Teacher Guidelines and sample responses”. Wider classroom and professional resources are also supplied. The bank also provides an online forum for informal teacher collaboration and discussion about assessments to contribute to a shared understanding of standards.

4) Queensland Comparable Assessment Tasks (QCATs). These are authentic (in the sense of being realistic), performance-based assessments that provide teachers and parents with information about student learning, and are held in Year 4, Year 6 and late Year 9. As they are not used for measuring teacher or school effectiveness, the QCATs are seen as relatively low stakes. Their primary function is for diagnostic purposes in the classroom and to assist teachers in grading to a common state-wide standard.

5) Guidelines for the reporting of achievement.

The Queensland Certificate of Education (QCE) and the Moderation of Senior Assessments

In Year 12 the QCE is awarded by the QSA to those who have successfully completed the senior phase of schooling. The QCE itself shows the subjects undertaken in school and the level achieved, plus other certification arising from traineeships or apprenticeships, training programmes, university subjects and workplace learning. It also confirms that the student meets literacy and numeracy requirements, and records the Queensland Core Skills Test grade.

Internal and external moderation

The QSA provides external moderation for Queensland schools covering aspects of validation and approval and external verification. QSA’s external moderation is aimed at:

ensuring that students are assessed to the same standards across the state;;

checking that the assessments match the demand and requirements of the syllabus;;

demonstrating the transparency of, and accountability in, the system.

School assessments are reported on a five-point scale as follows: Very High Achievement; High Achievement; Sound Achievement; Limited Achievement; and Very Limited Achievement. Within each school, a school moderator is appointed to oversee the entire school programme. The schools also have in place an internal moderation system with nominated teachers responsible for internally moderating assessment decisions. This helps to give rigour to the internal assessment process. Apart from the in-school moderation, an external consensus or social moderation takes place.

Using Moderated Assessments: the Overall Position (OP)

The OP is a ranking given to a student, based on the student’s average score across the best five subjects (the scores averaged being moderated school assessments). OPs are intended for use in tertiary selection and hence are very high stakes. They are presented in 25 bands, from OP1 (the highest) to OP25. Although internal assessments are moderated and reported on a five-point scale, the QSA regards these categories as too broad, and the fine-grained school assessments called Subject Achievement Indicators (SAIs) are used. These are norm referenced: marks range from 400 (the highest performer in that subject in the school) to 200 (the lowest performer in that subject in the school). This scale refers only to the school; such marks are not comparable across schools. To gain comparability the QSA uses the external Queensland Core Skills (QCS) test.

The external moderation process

QSA’s external moderation process appears to be extremely rigorous and demanding. It involves:


approving the work programmes and study plans of schools

monitoring at Year 11 the schools’ implementation of syllabuses in Authority subjects

verifying at Year 12 the schools’ judgments about student achievement in Authority subjects

confirming judgments made by schools on exit levels of achievement in Authority subjects

random sampling of students’ exit folios in Authority subjects to assess comparability after exit levels of achievement have been awarded

QSA is supported in these moderation duties by District Review Panels and State Review Panels. District Review Panels are the first point of contact for a school, and are made up of experienced teachers. Their role is to:

maintain standards through monitoring, verification and random sampling;

check work programmes, and review folios of student work;

make recommendations to the appropriate State Review Panel

State Review Panels oversee the work of the District Review Panels and include practising teachers from schools and universities. The State Review Panels:

consider work programmes, and approve these or give advice through District Review Panels

oversee the work of District Review Panels, through sampling, to ensure that advice to schools from District Review Panels is consistent across Queensland

resolve issues and negotiate agreements between schools and District Review Panels

QSA trains and develops the people who undertake these functions.

Credentialling

To gain their credentials, provisional moderators in schools need to demonstrate that they have undertaken training, been professional and made high quality contributions in their work.

External assessment: the Queensland Core Skills Test (QCS Test)

The QCS Test is the common test offered state-wide to all Year 12 students. It is set, assessed and marked by the QSA, and was developed to complement Queensland’s system of school-based assessment through standardisation, to compare and scale TLA. It contributes information to the Tertiary Entrance Statement, which is compulsory for those students going on to further learning at college or university, and optional for other students. The QCS Test is designed to test the 49 Common Curriculum Elements (CCEs) that are the ‘threads of the Queensland senior curriculum’. It comprises four papers: a writing task, a short response paper and two multiple choice papers marked electronically.

Markers are recruited from advertisements on the website and in the press. The QSA looks for markers who can demonstrate an ‘understanding of assessment based on criteria and standards’. It is assumed that schools will be keen to ensure that their staff apply for positions as markers.

Independent evaluation of Queensland’s standardisation and moderation

In 2011 an independent Comparability Review of the QSA procedures concluded that:

Queensland has created an internationally respected model of assessment and the policy leaders in Queensland should be proud of their success. (Marion, Peck and Raymond 2011)

In 2012 an independent review of the QCS test concluded:

“the test itself is of high quality due to its design criteria, the care and expertise of those involved and the disciplined procedures used in its development. The administrative procedures are well established and run smoothly and effectively. The marking processes are thorough, carefully applied and well monitored.

The review team is of the view that overall the QCS Test continues to perform well the functions for which it was designed and introduced. The recommendations in the report are designed to enhance and further improve the test instrument and its administration, and extend its viability and effectiveness as a scaling instrument well into the future.” (McLeod Davidson et al 2012)

Issues with a largely TLA-based system

There is no doubt that internal assessment allows Queensland schools to align their teaching with assessment to ensure that assessment supports the learning process. However, to demonstrate that the system is robust, the QSA has developed a demanding range of controls and checks. These may be time consuming and perhaps inflexible but they also provide a framework for teacher involvement and collegiality.


It is clear that the process of gaining approval for work programmes requires considerable work on the part of the schools, the QSA and the panels convened by the QSA to review them. The workload from internal assessment in schools has also to be borne in mind. As there is no external assessment, all the assessment workload falls on the staff in the school (as well as the students, of course). The need to offer an external test when there has been internal assessment means that there will always be two systems in operation: one supporting internal assessment with external moderation at Years 11 and 12, and the other supporting the QCS Test. This will create workload issues for the QSA as well as the schools. Despite all the efforts to balance TLA with internal and external moderation systems and supplementary tests, the heavy reliance on TLA still raises concerns, albeit very carefully expressed:

It might be thought that this degree of externality would be sufficient to allow young Queenslanders to demonstrate their fitness to enter, for example, HE. Anecdotal evidence, however, tends to indicate that there might be a perception in Australia in general that those emerging from the Queensland internal assessment system will always be seen as less well qualified than those who have been through an external assessment system in other states. (Galloway A 2008)

Finland and Queensland: the common messages

What follows is a summary of the findings of a report commissioned by the SQA (Galloway A 2008), which reviewed the assessment systems of Queensland and Finland to identify issues arising out of operating flexible internal assessment systems. The main findings were as follows.

Assessment

A significant benefit of internal assessment is that it makes it possible for learning and teaching to be aligned with assessment.

Teachers who work within an internal assessment system feel a sense of ownership for the system.

Processes

Internal assessment processes need to be rigorous to ensure that assessments maintain credibility.


Operating these processes can be costly in terms of time, personnel and other resources.

Some systems appear flexible, but the need to implement rigorous processes to support them can end up reducing that flexibility considerably.

Quality assurance

Despite all efforts to ensure rigour, internal assessment can reduce the credibility of qualifications (this is seen as a perception, not an evidence-based conclusion), so people emerging from a system of internal assessment in school still need to undertake an external assessment on exit.

This external assessment is needed to demonstrate to all users (students, employers and tertiary education) that the students have met the same standards as other students who have been through a system of external assessment (in other states in Australia).

The arrangements for the external exit assessment are likely to be costly.

Other systems of review or evaluation might be needed to demonstrate the rigour of an internal assessment system.

(Galloway A 2008)


4. South Korea as good practice?

South Korea is also a high performer in PISA tests: 5th in Science and 7th in Maths and Reading. Its system makes extensive use of TLA, and the high stakes of external tests are minimised by the use of lottery entrance to high school and limited external reporting, up to the end of high school and university entrance.

It has a system of diagnostic assessments called the National Assessment of Educational Achievement (NAEA). Each year, achievement tests in two subjects are administered to all students in each of grades six, nine and ten. These tests serve a purely informational purpose and are not reported at the level of the individual student.

Students in South Korea’s school systems that are designated “equalization areas” are admitted to senior high school based on a lottery system, and so do not face a high-stakes examination until the end of senior high school. (These are large urban areas in which private tutoring is rife and is seen as the antithesis of equality.) Students in other parts of South Korea are required to take a school-administered entrance examination as part of the admissions process to senior high school. However, they are also assessed on the basis of their junior high school performance, so the test is less high-stakes than was previously the case.

Students are also regularly assessed by their teachers at all levels, and they receive “Student School Records” or “Student Activity Records” which provide detailed information about their academic performance. These records include information on academic achievement by subject, attendance, extra-curricular and service involvement, special accomplishments, conduct and moral development, physical development, details of awards and anecdotal performance descriptions. These records are increasingly used as measures of student performance for admission at both the senior secondary and university level, in order to alleviate the examination-based pressure felt by many South Korean students.

Following senior high school, students who want to continue to university must take a College Scholastic Ability Test (CSAT), which has a major impact on their higher education prospects. Leading up to this test, most South Korean students will engage in some form of directed study outside of school, ranging from classes at hagwons, or cram schools, to private tutoring sessions. There is a culture of “examination hell” in South Korea, and students often feel large amounts of pressure about their performance on the CSAT.


Broad findings in the research related to the fitness of purpose of the assessment instruments used in UK qualifications, and the impact of High Stakes reporting

The results of assessment can be used in a wide variety of ways, including:

Internal uses:

Informing learners of their progress, strengths and weaknesses (formative)

School tracking of progress of individuals, teachers and departments

Informing parents

Informing teachers in the next class or stage

External uses:

Certification by an external body

Selection for employment or HE

Reporting for accountability (teachers, departments, institutions, local authorities, nations).

Whilst all of these are very important, they can also become high stakes depending on the use to which the data is put.

It is when information is used for important decisions, not just for the learners but for institutions and their staff, that they become high stakes and potentially harmful, and arguably not fit for purpose (Harlen 2005).

Some of those uses include:

Teachers (reputation, job advancement and job security);

Departments (reputation, resources, job advancement and security);

Institutions (reputation, market position, security of senior managers, institutional existence);

Local authorities (reputation, security of senior managers and future existence).

In a few countries, England for example, since 2008 the future existence of a school is threatened by low performance, measured by examination results (OECD comparative table, 2015). The literature is clear that such high stakes use is inappropriate as it distorts practice and in doing so damages HQTL.


Just a short reflection on this list hints at the potential for distortion of results, standards and learning. In some studies it has been demonstrated that even when TLA is central and standards are clear and appropriate, high stakes are likely to corrupt the process (Klenowski V. 2013). This distortion can occur in TLA, in the learning process, in the content of learning programmes and in behaviour towards learners. It does that in many ways through offering incentives or pressure for:

teaching to the test/repetitive training in key assessed elements/selective teaching of dominant content;

influencing teacher judgement of standards;

pre-selection for examination entrance;

focussing resources on marginal learners;

pressurising and lowering learner motivation;

excluding high risk learners;

covert selection;

and may also lower the motivation of learners (especially the weaker).

Finally it can affect the design of learning programmes by:

narrowing the content of learning to that which can be easily measured;

excluding important skills such as problem solving, critical thinking and team work.

If it is accepted that raising standards of achievement through HQTL is paramount, and that effective teacher action is by far the most important element in improving standards over time, it follows that a move towards extensive, but not exclusive, use of TLA makes much sense; that is the view supported by the literature. Accompanying that with robust moderation and standardisation is essential.


Implications for awarding bodies in terms of assessment design and quality assurance

Design of programmes and assessment practice

Most major awarding bodies (ABs) are now commercial organisations operating in a competitive market; some are profit-making, and many are self-financing. That in itself may give cause for concern and raise issues for regulators, particularly in high stakes contexts where institutions have an incentive to choose ABs and subjects by pass rate and ABs have an incentive to maximise market share. However, the ABs are also subject to regulatory control and to the recognition of their programmes for government funding, so policy makers need to consider those incentives and how to minimise their effects. Finland has no awarding bodies.

More autonomy from central control has been given to general vocational education and training than to general education, and for that reason we can see assessment policy that has developed in response to need and with greater freedom from central control. The extra autonomy may be a reflection of its lower profile in the minds of politicians and the press, and it is certainly a reflection of the historical origins and diversity of these qualifications. It is also the case that a wide variety of assessment models and methods has been used, if only to reflect the diversity of the subjects and industries being served. Whatever the reason, there has been great variety and many interesting and revealing practices in general vocational qualifications (GVQs) in the UK.

For the sake of practicality this section focuses on Edexcel BTEC qualifications, the largest and most successful body of GVQs in the UK. BTEC qualifications began in the late 1970s, and for 25 years before 2012 the specifications were largely unchanged and quite radical in design, amongst other things being 100% TLA. They were certainly successful, and had the confidence of employers and most HE institutions.

Although there have been instances of poor and controversial practice in BTEC qualifications over the years, they do not give rise to serious concern about the general approach and the good practice contained in them. The good practice in GVQs outlined below is that described in the literature, and includes the following characteristics:


a strong emphasis on TLA with significant institutional autonomy;

central control alongside centre autonomy;

a clear QA system with moderation and standardisation, through the appointment, training and support of Internal Verifiers (IVs – internal assessment specialists) and External Verifiers (EVs) (and, it must be said, the devolvement of much of that cost to institutions);

specifications which are outcomes based with clear standards;

the provision of detailed and clear guidance on teaching and assessment for centres and for teachers;

the early identification and use of ‘cross modular/unit’ learning and assessment for deeper learning and greater realism;

an emphasis on realistic learning through industry based examples and employer involvement;

the encouragement of learner involvement in practical activities and their assessment;

the use of ICT-based formative and summative assessments and their attendant flexibility.

What follows is largely taken from the BTEC website: “BTEC Centre Guide to Assessment: Entry Level to Level 3” March 2016. As such it is a marketing exercise and should be viewed with suitable caution, but it is an accurate description of current practice. The stated aims of the new specifications in 2012 are to:

ensure high quality and rigorous standards

conform to quality criteria for non-GCSE qualifications

be fit for purpose for learners, pre- or post-16, in schools and in colleges

Assessment until 2012 was 100% TLA. When the new specifications were introduced ‘following policy developments’ and the ‘Review of Vocational Education – The Wolf Report’ (March 2011), an element of NEA was introduced. According to Edexcel, the Wolf Report ‘recommended that all high-­value vocational qualifications should contain an element of external assessment’. It did not and neither did the government’s response to it, but this may be a reflection of Edexcel wishing to follow the emerging policy direction in England at the time. Assessment now remains at least 75% TLA, largely through portfolio assessment of centre devised tasks, and a maximum of 25% externally set and marked assessments.


The majority of BTEC units are assessed through internal assessment, which means that you can deliver the programme in a way that suits your learners and relates to local need.

It is interesting to note that a common selling point of BTEC qualifications, evident not just from BTEC themselves but also from school and college websites, has been the extensive use of TLA. Responsibility for validity, reliability and fitness for purpose of TLA is clearly placed within institutions (backed up by a QA system) that:

ensure each assessment is fit for purpose, valid, (and) will deliver reliable assessment outcomes across Assessors

The assessment is placed within a structure in which appropriate content is specified in detail, the learning process and its intended outcomes are clear and located within the specific vocational sector, and links to the broader curriculum are specified. Four key principles are stated:

1. Standards: a common core and external assessment

An essential core of knowledge and applied skills is specified. Assessment includes up to 25% external assessment appropriate to the sector, to provide independent evidence of learning and progression alongside the predominantly portfolio-based assessment.

2. Quality: a robust quality-assurance model

A quality-assurance model to ensure robust support for learners, centres and assessors.

each learner’s work is independently scrutinised through external assessment

every BTEC Assessor takes part in a sampling and quality review in each cycle

every BTEC centre visited annually for review and support of QA processes:

We believe this combination of rigour, dialogue and support will underpin the validity of the teacher-­led assessment and the learner-­centric approach that lie at the heart of BTEC learning.


3. Breadth and progression: a range of options building on the core

All BTEC qualifications contain:

The essential core: developed in consultation, for a broad understanding and knowledge of the sector.

Optional units for a closer focus on a vocational area, supporting progression to level 3 vocational or academic course or into apprenticeship.

Opportunities to develop skills in English and mathematics are indicated in the units in naturally occurring contexts. The skills have been mapped against GCSE (including functional elements) English and mathematics subject content areas.

4. Recognising achievement: opportunity to achieve at level 1

As some learners may fail to achieve a full Pass at Level 2, the opportunity to gain a level 1 qualification is included.

Guidance provided for centres: internal assessment and QA from Entry to Level 3

Detailed guidance is available on the following:

The programme team and their roles and responsibilities, including: Programme Leader; Lead Internal Verifier; Internal Verifier; Assessor.

Course documentation, including: Staff handbook; Planning; Planning assessment; Conflict of interest; Planning internal verification; Assignment planning process; The Learner handbook; Planning units, learning strategies and external links; Assessment strategies; Peer and self assessment; Group work; Authenticity and authentication; Plagiarism; Observation records and witness statements; Assignment design; Assignment planning; Assignment briefs; Internal verification of assignment briefs; Designing assignments for retakes; Assessment and grading; Providing feedback to learners; Submission of evidence; Meeting deadlines; Opportunities for resubmission; Retakes; Marking spelling, punctuation and grammar; Learners moving onto a larger qualification; Grading; Internal verification of assessment decisions; Learner appeals; Recognition of prior learning; Functional skills.

Assessment tracking and recording; Retention of learner evidence and assessment records.

External tests in BTEC

There are 3 different types of external assessment, designed to ensure that learners are assessed in a way appropriate for the sector: paper-based set tasks, on-demand/on-screen tests and written timetabled exams. Not all BTEC external assessments are examinations in the traditional sense; e.g. in Sport Level 2 it is an onscreen test that is completed online under controlled examination conditions. The onscreen test is available ‘on demand’, which provides flexibility for centres in relation to completing the assessment. Additional support is available in the form of completed answer papers including Distinction-level answers, with commentary from the Lead Examiner, available on-line.


Suggestions for the development or propagation of good practice or for improvement of existing practice

Good practice and sensible recommendations for managing change occur throughout the literature, and in many countries. It would be possible to cherry pick these to contribute to a strategy and policy determined by the conditions and the politics of the country. An alternative might be to emulate the practice of one country, modified to suit local conditions and building on achievements to date. One difficulty with either of those approaches is that in most countries change is rapid, and in many it does not follow a straight path; it is erratic to the extent that governments change, and change their minds. It is also difficult to find countries with long-term direction accompanied by robust independent review. There are countries with long-term, sustained systems that are well embedded, and they are performing well. However, as mentioned above, care is needed in assuming causality and, given the differences between countries, direct comparisons are difficult. The OECD, in its evaluation of developments in Scotland, stresses that solutions found in one country cannot wisely be simply transplanted into another; nonetheless it bases its recommendations on international comparisons (OECD 2015).

It is the impression of the author that developments in Wales are now a long way along the path which emerges from the literature; there is certainly much good practice in Wales to disseminate and develop further. The recent introduction of the new Welsh Baccalaureate exemplifies that: its development hinged on extensive consultation with professionals in both the preparatory and implementation phases; it is skills-based alongside a core of subject-based national qualifications; the ‘World Skills’ element includes skills that are clearly important but traditionally difficult to assess, such as creativity, critical thinking and problem solving, and their development can be integrated in the national qualifications; and assessment of those skills depends heavily on TLA, supported by centrally provided resources and long term CPD. Its success and the way it is perceived will be interesting and important.

In looking for effective implementation there are two obvious countries on which to focus: Finland and Scotland. Both are broadly comparable with Wales in terms of size, bilingualism and a more than usually homogeneous social make-up. Both have clear long-term strategies; both have bigger, influential neighbours. Finland is particularly stable, although recent economic and political pressure may cause change. Finland is performing exceptionally well internationally. Scotland also shares a border with England and has a close historical relationship with that country, albeit not so close as Wales. Closer investigation of working with either or both would be wise.

Recommendations for policy and change management

Black and Wiliam (2001) make the following general points:

Base policy on research (there is enough known already to start);

Improve formative assessment even though that is slow; there is no ‘quick fix’;

Develop slowly through a sustained programme of CPD and support;

and they identify a ‘four point scheme’ for development:

1) Teachers need living examples from teachers, to derive conviction and confidence: start with small groups of connected institutions working together.

2) Dissemination – ‘let teachers find their own way’.

3) Reduce obstacles such as: excessive curriculum content; the misuse of external tests; a lack of understanding and skills – develop formative and summative skills together.

4) Research – fill the gaps we have in our knowledge.

The main plank of our argument is that standards are raised only by changes which are put into effect by teachers and pupils in the classroom. (Black and Wiliam, 2001 p.13)

Harlen et al (2005) in a wide review of 30 studies of TLA investigated the extent of and the problems inherent in using TLA for summative purposes. Their overall conclusions, amongst a wealth of detail about the beneficial effects of the studies on the learners involved and the difficulties faced can be seen in the following:

The findings of (this) review by no means constitute a ringing endorsement of teachers’ assessment; there was evidence of low reliability and bias in teachers’ judgements made in certain circumstances. However, this has to be considered against the low validity and lower than generally assumed reliability of external tests. (Harlen W 2005, p.245)

The following were identified as policy and practice implications.

Policy

When deciding methods and combinations of methods for summative assessment, bear in mind the shortcomings of external examinations and national tests.


The intended purpose casts light on the methods chosen; in particular whether the assessment is to be used for internal or external purposes.

Emphasise learning goals not assessment aims.

• don’t judge teachers’ assessment by how well it agrees with test scores

• identify detailed criteria linked to learning goals, not assessment tasks

Address the known shortcomings of teachers’ assessment, such as bias.

Use (collaborative) moderation to develop teachers’ understanding of learning goals and related assessment criteria.

Practice

Clarity about learning goals is needed for dependable assessment by teachers.

School assessment procedures should protect against bias.

Protect time for planning by teachers.

Develop an ‘assessment culture’ in which good formative assessment improves summative assessment.

In Mansell et al 2009 an important theme is the extent to which reporting for accountability can damage teaching and learning through teaching to the test, the choice of ‘easy’ subjects etc. When commenting on using assessment for reporting on institutional performance they identify a number of caveats:

check for validity in terms of the purpose;

bear in mind the inherent unreliability of external tests; and,

publish with suitable ‘health warnings’ on the limits of the data: their ‘snap-shot’ nature, and factors outside the institutions’ control such as deprivation.

They cite good practice such as the use of value added data, and the provision of national attainment data for comparative purposes, such as in Scotland. They identify a number of ‘pressing challenges’:

1. CPD: “Too many teachers still fail to see the connection between accurate and regular assessment and good teaching”;

2. Pressures on time;

3. The tick-box culture;

4. The need to be able to scale up development projects; and,

5. The lack of attention to assessment in initial teacher training.

Looney et al 2011 identify their implications for policy as follows.


1. Learn from the bottom up (top-­down approaches tend to come with little guidance).

2. Promote teacher professionalism through:

a. investing;

b. sustaining long term development;

c. and suitable teacher appraisal schemes (others are critical of the distorting nature of the use of learner attainment for this high stakes purpose).

3. Develop more cost-­effective approaches to assessment and be aware of cost-­benefit ratios. They cite the expensive but effective use of ‘human raters’ and ratings panels in this context and the use of ICT based assessment. There may be parallels with the use of IVs in GVQs, in which most of the cost is hived off to the institutions. Awarding bodies are now beginning to exploit the lower cost of these systems and their inherent flexibility for timing, repetition, availability on demand etc.

Harlen et al identify, in an ARG publication, a useful set of standards for national policy makers which fit very well with the broad direction of the literature (Harlen et al 2008).

Suggested standards for Assessment

A. Generally

1. Assessment to support learning is at the heart of government programmes for raising standards of achievement.

2. Initial teacher education and professional development courses ensure that teachers have the skills to use assessment to support learning.

3. School inspection frameworks give prominence to the use of assessment to support learning.

4. Schools are encouraged to evaluate and develop their formative use of assessment.

B. Formative Use of Assessment

1. Policies require schools and local advisers to show how all assessment is being used to help students’ learning.

2. Introduction of new practices in assessment is accompanied by changes in teacher education and evaluation criteria necessary for their sustainability.

3. Schools are accountable for using formative and summative assessment to maximize the achievement of goals.

4. National standards of students’ achievement are reported as a range of qualitative and quantitative data from surveys of representative samples.

C. Summative Use of Assessment

1. Moderated assessment by teachers is used to report students’ performance throughout the compulsory years of school.

2. Moderation of teachers’ judgments is required to ensure common interpretation of criteria within and across schools.

3. Regulations ensure that arrangements for the summative use of assessment are compatible with the practice of using assessment to help learning.

4. Targets for school improvement are based on a range of indicators and are agreed through a process combining external evaluation and internal self-evaluation.

(Harlen W et al 2008)

Implications for regulators from an assessment development and qualification regulation perspective that emerge from the research studies, together with: Where next for Wales?

Regulation is not a major focus in the literature of TLA, but a considerable number of themes are evident.

1. Firstly, if it is accepted that the primary aim is HQTL, then an integrated model follows, one to integrate teaching and learning, assessment, evaluation and accountability. A strategy for regulation is one part of that wider strategy and model.

2. For an integrated model to develop there is a need for a clear strategic vision from the political centre, with appropriate independence and distance for regulatory bodies, inspectorates and institutions.

3. Regulatory bodies need to work closely with those responsible for teaching and learning in institutions and local authorities, with awarding bodies and inspectorates to ensure effective integration.

4. Whichever mixture of TLA and external testing is chosen, there is a clear preference in the literature for TLA to be dominant. That offers the possibility of well developed TLA supporting good learning, with testing being used to standardise, and to provide public reassurance. The literature is clear that using test results in high stakes applications will damage learning and is to be avoided.


5. Regulators have a clear duty to ensure effective QC and QA through moderation. An issue to be resolved is the extent to which that is best done:

internally (e.g. by using trained IVs, as with vocational and general vocational qualifications such as BTEC, or ‘Raters’ as in Queensland), or:

locally, through local authority and/or FE networks, or:

externally, through EVs, national sampling of work, statistical moderation,

or of course some combination of each.

6. Regulators also have a duty to monitor standards over time, between different institutions and regions. Sample testing emerges from the literature as the preferred way to do that. In Appendix 2, two pieces of work are summarised with their recommendations to indicate some of the issues for regulators in this area.

OECD recommendations for Scotland – governance of integrated systems

In evaluating and making recommendations on the governance of the Scottish integrated model, Curriculum for Excellence (CfE), the OECD makes the following key points. It is as well to bear in mind that this is an evaluation of developments which have followed a consistent strategic direction for more than 20 years.

Effective modern governance is a balancing act between accountability and trust, between innovation and risk-­avoidance, and between consensus building and making difficult choices.

The key role of the (political) centre is providing strategic vision and enabling feedback on how well the goals are being achieved.

Trust and consensus are invaluable and take significant time to establish but they need also to be tempered with professional accountability, system-­wide evidence and system leadership.

The importance of stakeholder capacity is crucial, especially in schools, communities, and the “middle” (mainly local authorities).

They continue to make the following recommendations for Scotland. It is for others to decide what is sensible and possible in the context of Wales, but the accompanying narrative identifies what seems to be their more general relevance.


1. Create a new narrative for the “Curriculum for Excellence”

A compelling narrative for the central strategy is needed which represents an act of political leadership. The narrative is to make the vision highly visible, clear and accessible, to restate longstanding aims, and build as far as possible on evidence about achievements to date. The narrative should focus on the primary importance of HQTL, the core matters of curriculum, assessment and pedagogy. It needs to be picked up and incorporated into the management of the system, and absorbed by the profession, schools, communities, parents, students, and the public at large.

2. Strengthen the professional leadership of CfE and the “middle”

The centre of gravity needs to shift towards schools, communities, networks of schools, and local authorities in a framework of professional leadership and collective responsibility. This means less emphasis on “running” the strategy centrally, through an implementation plan and consensus-­building towards professional leadership focused directly on the nature of teaching, learning and the curriculum in schools, networks and communities. This might best be realised through a forum aiming at growth, coherence and making connections, rather than a board managing the programme from the centre. The OECD at least in the context of Scotland in 2015 believe in reinforcing the middle (the local authorities), to support both the top and the bottom. If the LAs are given a more prominent role as part of a reinforced “middle”, together with the collegiate activity of schools, colleges, networks and communities, then their varied capacity and expertise will need to be addressed through processes of professional accountability. That process started in Scotland early through an audit of ‘readiness’ for the LAs and their schools. The OECD includes a number of examples of successful decentralisation. One is the Greater Manchester Challenge (GMC), which is close to Wales, and has been influential there to date.

“The GMC was a three-year project that galvanised ten local authorities to promote system-wide equity and improvement. Many strategies were used to achieve the goals of the project. Challenge Advisers worked with all the authorities to improve their practice and results. Leaders of successful schools worked with weaker schools across local authority boundaries to improve their leadership teams. The ten local authorities drove improvement together. Schools that excelled in particular areas trained teachers in other schools and local authorities. The authorities were able to overcome old rivalries in pursuit of the common good of students and the community.

By 2011, after three years, GMC schools were above the national average on all standardised test measures. Secondary schools in the most disadvantaged communities improved at three times the rate of the national average. The success of the GMC is inspiring other systems to adopt similar strategies. These include Wales and also now Scotland.” (OECD 2015)

3. Simplify and clarify core guidance, including in the definitions of what constitutes the “Curriculum for Excellence”

Given the ambition that CfE should be built into schools, colleges, local communities and networks of educators, it is important to reduce the bureaucracy that can stymie the bold collaboration and innovation on which CfE depends for its success. It is the system itself that needs to be clarified, rather than users being given coping strategies or more detailed roadmaps through complex documentation. That means making system-wide policy expectations and guidelines more easily accessible, including for inspections, and ensuring that national and local quality assurance processes are aligned and proportionate. As the OECD states, “[s]tudents, parents and local communities should be seen as the beneficiaries of the clarification and simplification process as much as the professionals in the education system”.

Finally, the OECD concludes:

As Scotland’s bold curriculum becomes truly excellent, its accountability and improvement processes should resemble high-­performing systems in Europe and North America such as Finland, Estonia, the Netherlands, and Alberta, Canada. Scotland might benefit from collaboration with Norway and Sweden who are also building stronger cooperation and collective responsibility among groups of municipalities, to share and monitor their different strategies of leading from the middle. Leading from the middle is not about lack of accountability. Rather, it combines a transparent evidence base about outcomes and performance with the lateral, professional accountability that is characteristic of high-­performing systems.


Appendix 1: Scotland’s Approach to the curriculum and assessment

Extracts from the Scottish Government’s website, March 2016. The highlights and emphases are theirs. The editing is ColegauCymru’s.

Assessment is one of three key strands of work in implementing Curriculum for Excellence. The other areas are curriculum guidance, which was published in the Spring, and the next generation of National Qualifications, details of which were announced in June 2009. This document sets out the Scottish Government’s strategic vision for assessment within Curriculum for Excellence. The main differences from the existing assessment arrangements are that:

o Assessment practices will follow and support the new curriculum. This will promote higher quality learning and teaching and give more autonomy and professional responsibility to teachers.

o Standards and expectations will be defined in a way that reflects the principles of Curriculum for Excellence. This will support greater breadth and depth of learning and place a greater focus on skills development (including higher order skills).

o A national system of quality assurance and moderation for 3-18 will be developed to support teachers in achieving greater consistency and confidence in their professional judgements.

o A National Assessment Resource will help teachers to achieve greater consistency and understanding in their professional judgements. There will also be a major focus on CPD to help teachers develop the skills required.

Overall, the strategic vision aims to create a better, fairer and more robust system that promotes quality of achievement throughout education.

It sets out the Scottish Government’s strategy on how to build on our existing strong foundations of effective approaches to assessment. In doing so, we will aim to maximise the impact of Curriculum for Excellence in raising standards of achievement for all learners. Scotland has many strengths in this area that have been developed through, for example, the Assessment is for Learning programme and National Qualifications. We are also building on our traditional strengths in learning and teaching. The findings of the Assessment Reform Group in the Analysis and Review of Innovations in Assessment (ARIA) and other international findings will also inform these developments.

The Framework for Assessment from 3 to 18 aims to create:

• a more effective assessment system which supports greater breadth and depth of learning and a greater focus on skills development
• through collaborative working, a better-connected assessment system with better links between pre-school, primary and secondary schools, colleges and other settings to promote smooth transitions in learning
• better understanding of effective assessment practice and sharing of standards and expectations as well as more consistent assessment
• more autonomy and professional responsibility for teachers.

Purposes of assessment

Information from assessment serves several important purposes:

• to support learning;
• to give assurance to parents and others about learners’ progress;
• to provide a summary of what learners have achieved, including through qualifications and awards, and to inform future improvements.

Principles of assessment

Above all, assessment needs to meet learners’ needs and enable all learners to achieve aspirational goals and maximise their potential.

Assessment practice should follow and reinforce the curriculum and promote high quality learning and teaching approaches.

Assessment needs to support learning by engaging learners and providing high quality feedback. Assessment has to be fair and inclusive and allow every learner to show what they have achieved and how well they are progressing. It is important that the information coming from assessment is able to show the breadth and depth of learning. Assessment also has to involve high quality interactions and motivate learners.

Assessment approaches should be proportionate and fit for purpose: different forms of assessment are appropriate at different stages and in different areas of learning.


We expect to see, and currently find, that pre-school practice is very focused on personal development and feedback with experiences built around the rapidly developing child while in addition at the senior phase we have more formal and structured assessment practices. Assessment which is used as the basis for awarding qualifications needs particular safeguards to ensure fairness to all candidates and give confidence to colleges, universities and employers.

Standards and expectations

Assessment approaches should help learners to show their progress through the levels and enable learners to demonstrate their achievements in a range of ways which are appropriate to learning. For learners to demonstrate that their progress is secure and that they have achieved a level, they will need opportunities to show that they:

• have achieved a breadth of learning across the experiences and outcomes for an aspect of the curriculum;
• can respond to the level of challenge set out in the experiences and outcomes, and are moving forward to more challenging learning in some aspects; and
• can apply what they have learned in new and unfamiliar situations.

Teachers can use these three aspects to decide when a learner has met agreed expectations and achieved a level.

Assessing progress

Teachers assess progress constantly. They get to know their learners well, build up a profile of their progress, strengths and needs and involve them in planning what they need to learn next. Teachers … have access to support and materials to help them with this task including through the National Assessment Resource. From time to time teachers … report on progress.

Literacy and Numeracy

Curriculum for Excellence emphasises literacy and numeracy skills and aims to develop, maintain and extend these skills.

National Literacy and National Numeracy qualifications are being developed at SCQF levels 3, 4 or 5. The qualifications will be awarded on the basis of a portfolio of a learner's work collected across a number of curriculum areas and a range of contexts of learning, life and work and will involve external marking by SQA. The qualifications will be flexible to meet the needs of all learners including adult learners in colleges and other settings.

Ensuring consistency

The practices for arriving at a shared understanding of standards and expectations involve teachers:

• working together from the guidance provided to plan learning, teaching and assessment
• building on existing standards and expectations
• drawing on exemplification
• engaging with colleagues to share and confirm expectations.

Scottish Government, education authorities and other partners will work together to build on local and national practices for quality assurance and moderation of assessment… to achieve consistency in standards and expectations and build trust and confidence in teachers’ judgements.

Teachers will also have access to a new national resource, the National Assessment Resource (NAR), which will help them as they make their judgements about progress. At school level, group/cluster level, and in colleges and other providers … teachers need to have opportunities to discuss and share expectations across the curriculum with a view to achieving consistency. Exemplification material will help to make standards and progression clearer and support reliable assessment.

External moderation will focus on the judgements teachers make and on moderation practices. Education authorities will have a key role in ensuring that schools have suitable arrangements in place to support teachers’ judgements and focus on any action required for improvement.

The Scottish Qualifications Authority (SQA) … will provide external quality assurance for National Qualifications to help achieve high quality and consistency in assessment judgements and quality assurance practices within schools, education authorities and in colleges and other providers.

Reporting

Parents and learners will receive:


• regular information about their children’s strengths, progress and achievements
• information about their children’s progress in achieving the Curriculum for Excellence levels in key areas of learning such as literacy and numeracy
• at points of transition, teachers will work with children and young people to sum up achievements through profiles … of progress within and through the curriculum levels as well as progress towards qualifications in the senior phase
• how well all learners and particular groups of learners are achieving;
• the performance of children and young people in the school in relation to expected levels at particular stages in key areas such as literacy and numeracy;
• how the school is applying national standards and expectations.

In relation to National Qualifications, SQA will report on learners’ achievements through the Scottish Qualifications Certificate (a summary of personal attainment).

Accountability

Schools and colleges should be able to provide … based upon self-evaluation and will (including) the nature, population and context of the school or college: an account of how successful children and young people are in their learning and of the establishment’s areas for improvement …

Monitoring standards over time

The Scottish Survey of Achievement will be adapted and fully aligned with Curriculum for Excellence and will focus on attainment in literacy and numeracy in schools. National standards of performance in National Qualifications can also be monitored over time through, for example, Standard Tables and Charts (STACS) analysis. Scotland is committed to active participation in international assessment surveys including the Progress in International Reading Literacy Study (PIRLS) in late primary, the Trends in International Mathematics and Science Study (TIMSS) in primary and secondary, and the Programme for International Student Assessment (PISA) at age 15. These surveys allow analysis of the performance of Scotland’s children and young people over time and in comparison to other countries.

Giving an account … at education authority and national levels

Benchmarking at education authority and national levels should:

• prompt reflection on practice
• be based on a broad range of valid and reliable information
• use tools and exemplification through nationally-provided assessment resource and moderation practices
• relate performance to that of young people with similar needs and backgrounds in other schools and authorities.

To enable schools to use benchmarking information, the Scottish Government will develop (reports) about learners’ performance at school level. The Scottish Government will not collate or publish aggregate information nationally.

Education authorities will provide assurance that schools in their area are consistently applying national standards and expectations. This will include an assurance that they are participating in both local and national moderation processes and that they are using these processes thoroughly.

As part of inspections, HMIE will report on the effectiveness of improvement through self-evaluation and make recommendations where practice needs to be improved … and will review the arrangements for moderation within that group of schools. This will support, promote and extend the quality and rigour of the moderation process and ensure regular national coverage.


Appendix 2: The development of a national monitoring system – sample testing and statistical moderation

A) Sample testing

This summary of an NFER paper is offered as an insight into the issues faced by countries which have used or still use national or international testing for monitoring and standardisation purposes. It has been edited for the sake of clarity and brevity.

In October 2008 Ed Balls MP, Secretary of State for Children, Schools and Families, announced the end of testing at Key Stage 3 and stated that in its place a system of ‘national-level sampling’ would be introduced. A paper by the NFER highlighted the key issues that should be addressed in the development phase of such a system. The paper provides case studies of tests formerly or currently in use:

• Assessment of Performance Unit (APU), used in England up until 1989
• National Assessment of Educational Progress (NAEP), currently used in the USA
• Scottish Survey of Achievement (SSA), currently in use in Scotland
• Trends in International Mathematics and Science Study (TIMSS), and
• Programme for International Student Assessment (PISA).

Key lessons identified

APU

There is a need for clarity of purpose and definition from the start of the development.

Achieving the desired sample proved problematic (it was initially 1.5% for the first maths survey), especially as the number of subjects and the size of the required sample increased.

The administration of the tests with only seven pupils in each school made them logistically difficult to manage.

A decision is needed from the start about the reporting that will be required, and therefore the background data that will need to be collected.

The monitoring teams were instructed to present only facts in the reports with no interpretation. This limited the usefulness of the information.

The method used for analysing the results of the survey was controversial and led to difficulties with reporting certain aspects. For a new survey this would need to be addressed early in the process.

NAEP

The process of development took much longer than expected and the purpose(s) changed and evolved. There are regular calls for further expansion of the tests to meet ever more purposes.


Measuring change in the system over a long period of time causes challenges, particularly with both keeping the same measure and keeping the measure relevant.

There is a disparity of survey results for those states with the ‘No Child Left Behind’ initiative (a high stakes initiative in some states linking institutional continuation with results).

The system is very complex which leads to misinterpretation of the results in the media and by the public.

There are issues with low participation rates among the older students and non-representative samples of students in some sub-groups.

The generally low stakes nature has been linked to lower than desired response rates and concerns that low motivation may affect reliability.

Even with this low stakes national monitoring system there is still the view that it has led to a narrowing of the curriculum.

A key aim of the NAEP tests is to report the performance of sub-groups of pupils: by gender, disability, ethnic background, parental education and so on. There have been some issues with collecting reliable evidence from these different sub-groups.

In the USA item response theory (IRT) is used to analyse the data from the tests, making the assumption that a unitary trait of ‘proficiency in the subject’ and a ‘national population’ of pupils are being assessed.
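To make the "unitary trait" assumption concrete, the following is a minimal sketch of the simplest IRT model, the one-parameter logistic (Rasch) model. It is an illustration only: the source does not say which IRT model NAEP uses, and the function name `p_correct` and the example values are assumptions. Each pupil is represented by a single proficiency value and each item by a single difficulty value on the same scale.

```python
from math import exp


def p_correct(proficiency, difficulty):
    """Rasch (one-parameter logistic) model.

    Returns the modelled probability that a pupil with the given
    proficiency answers an item of the given difficulty correctly.
    Both values sit on the same single (unitary) scale.
    """
    return 1.0 / (1.0 + exp(-(proficiency - difficulty)))


# Hypothetical values, for illustration only.
print(p_correct(0.0, 0.0))   # pupil exactly as able as the item is hard: 0.5
print(p_correct(1.0, 0.0))   # a more able pupil has a higher chance of success
print(p_correct(-1.0, 0.0))  # a less able pupil has a lower chance of success
```

Because one number stands for a pupil's whole proficiency in the subject, any aspect of performance that does not move with that single trait is not captured by the model.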

SSA

The paper and pencil tests used in each survey did not cover practical skills. Because of the cost and logistics involved these latter were addressed in a very much less formal, smaller-scale way. Field officers conducted and rated the practical assessments, which could lead to a high-cost system.

Items and tasks were ‘levelled’ (A to F) using professional judgement, on the basis of the 5-14 criterion-referenced progression framework, before being put into the national assessment bank, from which they could be drawn at any time for survey use.

Teachers’ level judgements were also collected for the pupils tested in the surveys, although these were not intended for use in system monitoring. Disparities were evident.

Those teachers involved in the programme appreciated the professional development experience. This can be seen as a useful additional benefit of the survey programme.

There have been some minor concerns about the low stakes nature of the assessment and how this might have affected test performance.

Confidence intervals have been reported alongside the attainment results, indicating the precision of the measures being made of population attainment.
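As an illustration of what such an interval conveys, the sketch below computes an approximate 95% confidence interval for the proportion of pupils reaching a level. It assumes a simple random sample and the normal approximation; the source does not describe the SSA's actual estimation method, which would need to reflect its survey design. The function name and the figures are hypothetical.

```python
from math import sqrt


def proportion_ci(successes, n, z=1.96):
    """Approximate confidence interval for a population proportion.

    Uses the normal approximation and assumes a simple random sample;
    z=1.96 corresponds to a 95% interval.
    """
    p = successes / n
    half_width = z * sqrt(p * (1 - p) / n)
    # Clamp to the valid range for a proportion.
    return max(0.0, p - half_width), min(1.0, p + half_width)


# Hypothetical survey: 600 of 1,000 sampled pupils reach the expected level.
low, high = proportion_ci(successes=600, n=1000)
print(f"{low:.3f} to {high:.3f}")  # prints "0.570 to 0.630"
```

With these invented figures the estimate of 60% carries a margin of about three percentage points either way, so a difference of one or two points between survey years could not be read as a real change.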

TIMSS


It frequently proves difficult in England to achieve the required sample of schools willing to participate in the tests. Incentives have recently been introduced.

The tests are paper and pencil only and therefore assess a limited proportion of the curriculum.

Trends over time as measured by the tests have questionable reliability. The assessment framework reflects the needs of all the participating countries, so does not assess the whole of the National Curriculum.

PISA

The assessment of application of knowledge and skills rather than curriculum content is an interesting feature of the PISA surveys.

As with other studies mentioned it is not always easy to get sufficient schools to participate. England failed to meet its target in 2003.

B) Statistical moderation of teacher assessments

The following summary is from a report commissioned by the Qualifications and Curriculum Authority (Wilmut and Tuson, 2005). It has been edited for brevity and clarity. The report is a review of statistical moderation of teacher assessments, undertaken for QCA to:

• investigate technical issues relating to the statistical moderation of assessment results, with a view to a model for statistical moderation of teacher assessment results based upon student performance in external examinations
• propose a single model of statistical moderation that would best suit the (QCA defined) notional assessment system.

They consider a GCSE in which all assessment is by teachers and assume that the assessment will be embedded in the curriculum. Their conclusions can be summarised:

The introduction of a subject-based test for comparative purposes would damage the credibility of the teacher assessment.

It would be possible to develop banks of tasks for embedding in the curriculum, to sample the knowledge and skills and act as a benchmark for moderation of the teacher’s own assessments by linear scaling of the mean or mean and standard deviation; however, that process would be cumbersome, costly and would work better with moderation by inspection or consensus.
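The linear scaling mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure set out by Wilmut and Tuson: the function name `scale_marks`, the choice of population standard deviation and the marks themselves are hypothetical. A group's teacher-assessed marks are shifted so that their mean matches the group's mean on the benchmark tasks and, optionally, stretched or compressed so that the spread matches too. Rank order within the group is preserved.

```python
from statistics import mean, pstdev


def scale_marks(teacher_marks, benchmark_marks, match_spread=True):
    """Linearly rescale one group's teacher marks against a benchmark.

    Mean-only scaling shifts every mark by the same amount.
    Mean-and-SD scaling also multiplies deviations from the mean by
    (benchmark SD / teacher SD).
    """
    t_mean = mean(teacher_marks)
    b_mean = mean(benchmark_marks)
    slope = 1.0
    if match_spread:
        t_sd = pstdev(teacher_marks)
        b_sd = pstdev(benchmark_marks)
        # A group with identical teacher marks has no spread to rescale.
        if t_sd > 0:
            slope = b_sd / t_sd
    return [b_mean + slope * (mark - t_mean) for mark in teacher_marks]


# Hypothetical marks for one teaching group of three students.
teacher = [60, 70, 80]    # teacher-assessed marks
benchmark = [50, 55, 60]  # the same group's marks on embedded benchmark tasks

print(scale_marks(teacher, benchmark))                      # mean and SD matched
print(scale_marks(teacher, benchmark, match_spread=False))  # mean only
```

With these invented marks, matching mean and standard deviation moves the teacher marks to roughly 50, 55 and 60, while mean-only scaling gives 45, 55 and 65. The adjustment is applied to the group as a whole, which is one reason the authors contrast it with moderation by inspection or consensus.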


“The only credible and wholly external calibrator would be a curriculum-based test that could be used for a range of subjects and reported in its own right. A test that was based on generic skills would have the considerable advantage of supporting the development of these skills without specifying the nature of their embedding within the subject in question. This is our preferred approach particularly since the test would be taken once by all students, irrespective of the subjects being taken.” (Wilmut and Tuson, 2005)

Such a test would be very similar if not identical to the Queensland Core Skills Test (QCST).


Appendix 3: Future research needs in teacher-led assessment post-14

Future research directions in the TLA area might include:

The readiness of local authority areas, their schools and other institutions for the desired change.

How differences between schools influence the practice and dependability of individual teachers – the importance of institutional culture.

The depth of understanding in key policy makers of the central issues in designing an integrated system.

How teachers go about assessment for different purposes, what evidence they use, how they interpret it, etc.

The reasons for teachers’ over-estimation of performance as compared with moderators’ judgements of the same performance: to find out, for instance, whether a wider, perhaps more valid, range of evidence is used by teachers, or whether criteria are differently interpreted.

The effectiveness of different approaches to improving the dependability of teachers’ summative assessment, including moderation procedures.

Ways of evaluating or supplementing the dependability of teachers’ assessment other than by correlation with test results. The relative strengths of: internal moderation through key individuals and training processes; local moderation through networks; external testing for national sampling; external standardisation through comparative standards; national ‘core skill’ tests; and so on.

The development or adoption of useable guidance for TLA.


REFERENCES

Atkin J M (2007) Swimming upstream; Relying on teachers’ summative assessments, Measurement: Interdisciplinary Research and Perspectives, 5:1, 54-57, DOI: 10.1080/1536636360701293592
Assessment Systems for the Future, Nuffield Foundation. A project set up by the Assessment Reform Group in September 2003, directed by Wyn Harlen and based at the Faculty of Education, University of Cambridge.
BTEC Centre Guide to Assessment: Entry Level to Level 3. Website, September 2016.
Black P and Wiliam D. Inside the Black Box. Raising Standards through Classroom Assessment. Kings College London School of Education 2001.
Black P. 2013. In D Corrigan, R Gunstone, & A Jones (Eds). Valuing assessment in science education; Pedagogy, curriculum, policy. (Pp 207-229) Amsterdam. Sage
Black P. 2014. Assessment and the aims of the curriculum. An explorers journey. Unesco IBE.
Black P, Harrison C, Hodgen J, Marshall B & Serret N (2011) Can teachers’ summative assessments produce dependable results and also enhance classroom learning? Assessment in Education: Principles, Policy & Practice, 18:4, 451-469, DOI: 10.1080/0969594X.2011.557020
“BTEC Centre Guide to Assessment: Entry Level to Level 3”. March 2016
Harlen, Wyn and James, Mary, 1997. Assessment and Learning: differences and relationships between Formative and Summative assessment. Assessment in Education Vol. 4 No. 3, 1997 SCRE.
Le Cordeur M. Constantly weighing the pig will not make it grow: do teachers teach assessment tests or the curriculum? Perspectives in Education 2-14:13(1). University of the Free State South Africa.
Report of the Wales Assessment Systems for the Future Conference Cardiff, May 27th 2006
Bennett, Randy Elliot; And Others. Influence of Behavior Perceptions and Gender on Teachers' Judgments of Students' Academic Skill. Journal of Educational Psychology, v85 n2 p347-56 Jun 1993
Brown, G. T. L., Hui, S. K. F., Yu, W. M., & Kennedy, K. J. Teachers’ conceptions of assessment in Chinese contexts: A tripartite model of accountability, improvement, and irrelevance. International Journal of Educational Research. 2011
Elley WB & Livingstone ID (1972) External Examinations and Internal Assessments. New Zealand Council for Educational Research
Galloway Ann 2008. The Assessment Systems of Finland and Queensland. SQA Report 1. Commissioned by the SQA 2008.
Gill, T., and Benton, T. (2013). Investigating the relationship between aspects of countries’ assessment systems and achievement on the Programme for International Student Assessment (PISA) tests. Cambridge Assessment Research Report. Cambridge, UK: Cambridge Assessment.
Hattie J. Visible Learning; a synthesis of over 800 meta-analyses relating to achievement (London; Routledge, 2009 Pg 16).
Harlen W (2005) Trusting teachers’ judgement: research evidence of the reliability and validity of teachers’ assessment used for summative purposes, Research Papers in Education, 20:3, 245-270, DOI: 10.1080/02671520500193744
Harlen W, Louise Hayward, Gordon Stobart. 2008 Changing Assessment Practice, Process, Principles and Standards (Assessment Reform Group). ISBN: 9780853899297
Galloway A 2008 QSA. The assessment systems of Queensland and Finland. Research report 1. 2009. Anne Galloway April 2008
Klenowski V 2013. Towards improved public understanding of judgement practice in standards referenced assessment: an Australian Perspective, Oxford Review of Education, 39:1, 36-51, DOI: 10.1080/03054985.2013.764759
Laukannen R 2008 Quoted in Michael le Cordeur Pg 146
Looney JW et al. Integrating Formative and Summative Assessment. Progress towards a seamless system? OECD Education Working Papers No. 58. OECD Publishing.
Mansell, W., James, M. & the Assessment Reform Group (2009) Assessment in schools. Fit for purpose? A Commentary by the Teaching and Learning Research Programme. London: Economic and Social Research Council, Teaching and Learning Research Programme.
Marcenaro-Gutierrez O & Anna Vignoles (2015) A comparison of teacher and test-based assessment for Spanish primary and secondary students, Educational Research, 57:1, 1-21, DOI: 10.1080/00131881.2014.983720.


Scott F M, Peck B, Raymond J. 2011. Year-to-year comparability of results in Queensland studies authority senior secondary courses that include school-based moderated assessments: an ACACA sponsored review.
McLeod T, Davidson M, et al. 2012. A review of the Queensland Core Skills (QCS) Test to ascertain the ongoing relevance of the test and the capability of the test to act as a statistical scaling device in the calculation of Overall Positions (OPs) and Field Positions (FPs) for tertiary selection. Enterprising Minds Pty Ltd
McMahon S and Jones I 2015. A comparative judgement approach to teacher assessment, Assessment in Education: Principles, Policy and Practice, 22:3, 368-389, DOI: 10.1080/0969594x.2014.978839
Nusche, Deborah et al. (2012) “Student assessment” in OECD Review of Evaluation and Assessment in Education: New Zealand, 2011. OECD Publishing
Nusche, Deborah et al. (2011) “Student assessment” in OECD Review of Evaluation and Assessment in Education: Sweden, 2011. OECD Publishing
Nusche, Deborah et al. (2011) “Student assessment” in OECD Review of Evaluation and Assessment in Education: Norway, 2011. OECD Publishing
National Center for Educational Excellence - Center on International Student Benchmarking, S Korea Overview, Canada Overview

NFER. Submission to Expert Group: Issues to Consider When Developing a National Monitoring System, from National Foundation for Educational Research. Version 2, December 2009.
Nightingale Julie. Focus on Finnish Assessment (Chartered Institute of Educational Assessors (CIEA), 2016)
Oxford University Centre for Educational Assessment. Review of Teacher Assessment: Evidence of What Works Best and Issues for Development. Gordon Stanley, Robert MacCann, John Gardner, Laura Reynolds and Imogen Wild. Commissioned by the QCA 2009.
OECD (2013), Synergies for Better Learning: An International Perspective on Evaluation and Assessment, OECD Reviews of Evaluation and Assessment in Education, OECD Publishing, Paris, http://dx.doi.org/10.1787/9789264190658-en
OECD (2015) Education at a Glance. OECD Indicators. Pg 486
OECD (2015) Improving Schools in Scotland, an OECD perspective. OECD Publishing.
Sahlberg P. Finnish Lessons. What can the world learn from the educational change in Finland. Series on School Reform 2015


Shewbridge, Claire et al. (2012) “Student assessment” in OECD Review of Evaluation and Assessment in Education: Luxembourg, 2012. OECD Publishing
Santiago, Paulo et al. (2011) “Student assessment” in OECD Review of Evaluation and Assessment in Education: Australia, 2011. OECD Publishing
Santiago, Paulo et al. (2012) “Student assessment” in OECD Review of Evaluation and Assessment in Education: Portugal, 2012. OECD Publishing
Wilmut John and Tuson Jennifer, 2005. Statistical moderation of teacher assessments. A report commissioned by the Qualifications and Curriculum Authority.
Education at a Glance 2015: OECD Indicators © OECD 2015. Table D6.1c. 2015 Upper Secondary Examinations