Assessment tasks for our students predominantly elicit extended, complex responses. Marking these involves making qualitative judgements and then, typically, assigning a numerical score. The scores are applied in a linear fashion, as incremental measurements. However, the increments are not equidistant from each other. For example, the implications of the interval between 48% and 51% differ significantly from those of the interval between 58% and 61%.
As such, there appear to be holes in this ‘tapestry of assessment marking’. This contentious presentation argues that the process of assessment moderation, in which teams aim for consensus on (a) matters of judgement, (b) what constitutes quality, and (c) how quality can be represented, is questionable. Such moderation is a retrospective approach: a narrow process of quality review (Gillis, 2020), rather than quality assurance. A concomitant consequence of consensus review by academic teams is high workload, as this activity is repeated for each subsequent assessment.
This is not to suggest that subjective judgements are unsubstantiated opinions: these can most assuredly be “soundly based, [and] consistently trustworthy” (Sadler, 2013, p. 14). Quality, though, is an abstract concept. Multiple criteria are involved in judging and reporting quality (sometimes in fixed sets, as in rubrics). For example, student work can be deemed outstanding … but for reasons not listed in the criteria.
Determining grades that reflect student levels of achievement relies on peer agreement regarding the quality of the assessment design (responsible as it is for the raw evidence of achievement produced by the student), and the associated marking criteria communicated to the students. Ipso facto, academic discussion around the meaning and significance of quality, and what is deemed to count as evidence, needs to come before the design of the assessment and the articulation of marking criteria and of teaching approaches. Sadler suggests that “Whereas moderation relevant for a single assessment task is repeated for subsequent tasks, the ultimate objective is the development of ‘calibrated’ academics” (2013, p. 17) – resulting in a situation where academics can produce grades without the need for third-party confirmation. Such an approach towards ‘calibrated academics’ has repercussions for the distribution of workload (particularly for casual staff), but also multiple benefits: a stronger, peer-reviewed curriculum approach, taught by academics confident not only in their own informed judgement but also in that of their colleagues; and, above all, transparency of academic standards for peers and students alike.
Gillis, S. (2020). Ensuring comparability of qualifications through moderation: implications for Australia’s VET sector. Journal of Vocational Education & Training, 1-23. https://doi.org/10.1080/13636820.2020.1860116
Sadler, D. R. (2013). Assuring academic achievement standards: from moderation to calibration. Assessment in Education: Principles, Policy & Practice, 20(1), 5-19. https://doi.org/10.1080/0969594X.2012.714742