This is the first in a short series of posts on our school’s emerging principles of assessment, which are split into three categories – principles of assessment design; principles of ethics and fairness; and principles for improving reliability and validity. My hope in sharing these principles of assessment is to help others develop greater assessment literacy, and to gain constructive feedback on our work to help us improve and refine our model in the future.
In putting together these assessment principles and an accompanying CPD programme aimed at middle leaders, I have drawn heavily on a number of writers and speakers on assessment, notably Dylan Wiliam, Daniel Koretz, Daisy Christodoulou, Rob Coe and Stuart Kime. All of these have a great ability to convey difficult concepts (I only got a C grade in maths, after all) in a clear, accessible and, most importantly, practical way. I would very much recommend following up their work to deepen your understanding of what truly makes great assessment.
- Align assessments with the curriculum
In many respects, this first principle seems pretty obvious. I doubt many teachers deliberately set out to create and administer assessments that are not aligned with their curriculum. And yet, for a myriad of different reasons, this does seem to happen, with the result that students sit assessments that are not directly sampling the content and skills of the intended curriculum. In these cases the results achieved are largely redundant, because no useful inferences can be drawn from them. If the assessment is not assessing the things that were supposed to have been taught, it is almost certainly a waste of time – not only for the students sitting the test, but for the teachers marking it as well.
Several factors can affect the extent to which an assessment is aligned with the curriculum and are important considerations for those responsible for setting assessments. The first is the issue of accountability. Where accountability is unreasonably high and a culture of fear exists, those writing assessments might be tempted to narrow down the focus to cover the ‘most important’ or ‘most visible’ knowledge and skills that drive that accountability. In such cases, assessment ceases to provide any useful inferences about knowledge and understanding.
Assessment can also become detached from the curriculum when that curriculum is not delineated clearly enough from the outset. If there is not a coherent, well-sequenced articulation of the knowledge and skills that students are to learn, then any assessment will always be misaligned, however hard someone tries to make the purpose of the assessment valid. A clear, well structured and shared understanding of the intended curriculum is vital for the enacted curriculum to be successful, and for any assessment of individual and collective attainment to be purposeful.
A final explanation for the divorce of curriculum from assessment is the knowledge and understanding of the person writing the assessment in the first place. To write an assessment that can produce valid inferences requires a solid understanding of the curriculum aims, as well as the most valid and reliable means of assessing them. Speaking for myself, I know that I have got a lot better at writing assessments that are properly aligned with curriculum the more I have understood the links between the two and how to go about bridging them.
- Define the purpose of an assessment first
Depending on how you view it, there are essentially two main functions of assessment. The first, and probably most important, purpose is as a formative tool to support teaching and learning in the classroom. Examples might include a teacher setting a diagnostic test at the beginning of a new unit to find out what students already know so their teaching can be adapted accordingly. Formative assessment, or responsive teaching, is an integral part of teaching and learning and should be used to identify potential gaps in understanding or misconceptions that can be subsequently addressed.
The second main function of assessment is summative. Whereas examination bodies certify student achievement, in the school context the functions of summative assessment might include assigning students to different groupings based upon perceived attainment, providing inferences to support the reporting of progress home to parents, or the identification of areas of underperformance in need of further support. Dylan Wiliam separates out this accountability function from the summative process, calling it the ‘evaluative’ purpose.
Whether the assessment is designed to support summative or formative inferences is not really the point. What matters here is that the purpose or function of the assessment is made clear, and that the inferences the assessment is intended to produce are widely understood. In this sense, the function of the assessment determines its form. A class test intended to diagnose student understanding of recently taught material will likely look very different from a larger scale summative assessment designed to draw inferences about whether knowledge and skills have been learnt over a longer period of time. Form therefore follows function.
- Include items that test understanding across the construct continuum
Many of us think about assessment in the reductive terms of specific questions or units, as if performance on question 1 of Paper 2 were actually a thing worthy of study in and of itself. Assessment should be about approximating student competence in the constructs of the curriculum. A construct can be defined as the abstract conception of a trait or characteristic, such as mathematical or reading ability. Direct constructs are tangible physical traits like height and weight, which are measured using verifiable methods and stated units of measurement. Unfortunately for us teachers, most educational assessment assesses indirect constructs that cannot be directly measured by such easily understood units. Instead, they are estimated through questions that we think indicate competency, and that stand in for the thing that we cannot measure directly.
Within many indirect constructs, such as writing or reading ability, there is likely to be a continuum of possible achievement. So within the construct of reading, for instance, some students will be able to read with greater fluency and/or understanding than others. A good summative assessment therefore needs to differentiate between these differing levels of performance and, through the questions set, define what it means to be at the top, middle or bottom of that continuum. In this light, one of the functions of assessment has to be to estimate the position of learners on a continuum. We need to know this to evaluate the relative impact or efficacy of our curricula, and to understand how our students are progressing within them.
- Include items that reflect the types of construct knowledge
Some of the assessments we use do not adequately reflect the range of knowledge and skills of the subjects they are assessing. Perhaps the format of terminal examinations has had too much negative influence on the way we think about our subjects and design assessments for them. In my first few years of teaching, I experienced considerable cognitive dissonance between my understanding of English and the way that it was conceived of within the profession. I knew my own education was based on reading lots of books, and then lots more books about those books, but everything I was confronted with as a new teacher – schemes of work, the literacy strategy, the national curriculum, exam papers – led me to believe that I should really be thinking of English in terms of skills like inference, deduction and analysis.
English is certainly not alone here, with history, geography and religious studies all suffering from a similar identity crisis. This widespread misconception of what constitutes expertise and how that expertise is gained probably explains, at least in part, why so many schools have been unable to envisage a viable alternative to levels. Like me, many of the people responsible for creating something new have themselves been infected by errors from the past and have found it difficult to see clearly that one of the big problems with levels was the way they misrepresented the very nature of subjects. And if you don’t fully understand or appreciate what progression looks like in your subject, any assessment you design will be flawed.
Daisy Christodoulou’s Making Good Progress is a helpful corrective, in particular her deliberate practice model of skill acquisition, which is extremely useful in explaining the manner in which different types of declarative and procedural knowledge can go into perfecting a more complex overarching skill. Similarly, Michael Fordham’s many posts on substantive and disciplinary knowledge, and how these might be mapped on to a history progression model are both interesting and instructive. Kris Boulton’s series of posts (inspired by some of Michael’s previous thinking) are also well worth a look. They consider the extent to which different subjects contain more substantive or disciplinary knowledge, and are useful points of reference for those seeking to understand how best to conceive of their subject and, in turn, design assessments that assess the range of underlying forms of knowledge.
- Use the most appropriate format for the purpose of the assessment
The format of an assessment should be determined by its purpose. Typically, subjects are associated with certain formats. So, in English, essay tasks are quite common, whilst in maths and science, short exercises where there are right and wrong answers are more the norm. But as Dylan Wiliam suggests, although ‘it is common for different kinds of approaches to be associated with different subjects…there is no reason why this should be so.’ Wiliam draws a useful distinction between two modes of assessment: a marks for style approach (English, history, PE, Art, etc.), where students gain marks for how well they complete a task, and a degree of difficulty approach (maths, science), where students gain marks for how well they progress in a task. It is entirely possible for subjects like English to employ degree of difficulty assessment tasks, such as multiple choice questions, and for maths to set marks for style assessments, as this example of comparative judgement in maths clearly demonstrates.
In most cases, the purpose of assessment in the classroom will be formative and so designed to facilitate improvements to student learning. In such instances, where the final skill has not yet been perfected but is still very much a work in progress, it is unlikely that the optimal interim assessment format will be the same as the final assessment format. For example, a teacher who sets out to teach her students by the end of the year to construct well written, logical and well supported essays is unlikely to set essays every time she wants to infer her students’ progress towards that desired end goal. Instead, she will probably set short comprehension questions to check their understanding of the content that will go into the essay, or administer tests on their ability to deploy sequencing vocabulary effectively. In each of these cases, the assessment reflects the inferences about student understanding that the teacher is trying to draw, and does not confuse or conflate them with other things.
In the next post, I will outline our principles of assessment in relation to ethics and fairness. As I have repeatedly made clear, my intention is to help contribute towards a better understanding of assessment within the profession. I welcome anyone who wants to comment on our principles, or to critique anything that I have written, since this will help me to get a better understanding of assessment myself, and make sure the assessments that we ask our students to sit are as purposeful as possible.
Thanks for reading.