Publication Information
Authors:
- Owen Henkel
- Hannah Horne-Robinson
- Josh McGrane
Pages:
- 282-303
Keywords:
- Artificial Intelligence, Digital Education and Educational Technology, Automatic Speech Recognition, Educational assessment, Foundational Literacy
Abstract:
- This paper reports on a set of three recent experiments utilizing large-scale speech models to assess the oral reading fluency (ORF) of students in Ghana. While ORF is a well-established measure of foundational literacy, assessing it typically requires one-on-one sessions between a student and a trained rater, a process that is time-consuming and costly. Automating the assessment of ORF could support better literacy instruction, particularly in education contexts where formative assessment is uncommon due to large class sizes and limited resources. This research is among the first to examine the use of the most recent versions of large-scale speech models for ORF assessment in the Global South. We find that the best-performing model, Whisper V2, with no additional fine-tuning, produces transcriptions of Ghanaian students reading aloud with a Word Error Rate of 10.3. When these transcriptions are used to produce fully automated ORF scores, the scores closely align with those generated by expert human raters, with a correlation coefficient of 0.98. These results were achieved on a representative dataset (i.e., students with regional accents, recordings taken in actual classrooms), using a free and publicly available speech model with no additional fine-tuning. This model's strong performance on real-world classroom data, combined with its accessibility and simplified implementation, suggests potential for scaling ORF assessment in lower-resource, linguistically diverse educational contexts.