Educational Measurement: Issues and Practice, Volume 31, Issue 1, pp. 2–13

A Framework for Evaluation and Use of Automated Scoring

David M. Williamson, Xiaoming Xi, and F. Jay Breyer

Educational Testing Service, Rosedale Road, Princeton, NJ 08541; [email protected]
First published: 22 March 2012

Abstract

This article provides a framework for the evaluation and use of automated scoring of constructed-response tasks, encompassing both the evaluation of automated scoring and guidelines for its implementation and maintenance in the context of constantly evolving technologies. Validity issues and challenges associated with automated scoring are discussed within the framework. The fit between the scoring capability and the assessment purpose, the agreement between human and automated scores, associations with independent measures, the generalizability of automated scores as implemented in operational practice across tasks and test forms, and the impact and consequences for the population and its subgroups are proffered as integral evidence supporting the use of automated scoring. Specific evaluation guidelines are provided for using automated scoring to complement human scoring on tests used for high-stakes purposes. These guidelines are intended to generalize to new automated scoring systems and to existing systems as they change over time.
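Among the sources of evidence named above, human-machine agreement is the most directly computable. As a minimal sketch (an illustration of common practice, not code drawn from the article), the Python below computes exact agreement and quadratic weighted kappa, a statistic widely used for ordinal score scales, between a human rater and an automated engine; the function name, the 1-6 score scale, and the data are assumptions.

import numpy as np
from sklearn.metrics import cohen_kappa_score

def agreement_report(human, machine):
    # Exact agreement and quadratic weighted kappa (QWK) for two score vectors.
    human, machine = np.asarray(human), np.asarray(machine)
    return {
        "exact_agreement": float(np.mean(human == machine)),  # proportion of identical scores
        "quadratic_weighted_kappa": cohen_kappa_score(human, machine, weights="quadratic"),
    }

# Hypothetical scores on a 1-6 essay scale.
human_scores = [3, 4, 4, 2, 5, 3, 6, 4, 3, 5]
machine_scores = [3, 4, 5, 2, 5, 3, 5, 4, 4, 5]
print(agreement_report(human_scores, machine_scores))

Computing the same report for two human raters provides a baseline, so human-machine agreement can be judged relative to human-human agreement rather than in isolation.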
