Predictive Student Modeling in Game-Based Learning Environments with Word Embedding Representations of Reflection

Publication Information


  • Michael Geden, North Carolina State University
  • Andrew Emerson, North Carolina State University
  • Dan Carpenter, North Carolina State University
  • Jonathan P. Rowe, North Carolina State University
  • Roger Azevedo, McGill University
  • James Lester, North Carolina State University


  • 1-23


  • Student modeling, Early prediction, Game-based learning environments, Self-regulated learning, Reflection


  • Game-based learning environments are designed to provide effective and engaging learning experiences for students. Predictive student models use trace data extracted from students’ in-game learning behaviors to unobtrusively generate early assessments of student knowledge and skills, equipping game-based learning environments with the capacity to anticipate student outcomes and proactively deliver adaptive scaffolding or notify instructors. Reflection is a key component of self-regulated learning and is critical to effective learning. However, there is currently limited work exploring the utility of reflection for inducing accurate predictive student models. This article presents a predictive student modeling framework that leverages natural language responses to in-game reflection prompts to predict student learning outcomes in CRYSTAL ISLAND, a game-based learning environment for middle school microbiology. With data from a pair of classroom studies involving 118 middle school students, we investigate the accuracy of early prediction models that utilize features extracted from student trace data combined with word embedding-based representations (i.e., GloVe, ELMo) of student reflection responses. We evaluate the accuracy of the predictive models over time using data from incremental segments of each student’s interaction with the game-based learning environment, and we compare against models that omit student reflection features. Results reveal that models encoding students’ natural language reflections with ELMo word embeddings yield significantly improved accuracy compared to other representations, with the greatest accuracy demonstrated by an ensemble of predictive models. We discuss the implications of these results for the design of game-based learning environments.
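
The feature construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a hypothetical toy lookup table in place of pretrained GloVe vectors (ELMo is contextual and would require a pretrained model rather than a lookup), mean pooling over words and over responses, and simple action counts as trace features. The names `TOY_EMBEDDINGS`, `embed_reflection`, `build_features`, `features_up_to`, and the event format are all invented for illustration.

```python
from typing import Dict, List

# Hypothetical 3-dimensional toy vectors standing in for pretrained word
# embeddings; real GloVe vectors have tens to hundreds of dimensions.
TOY_EMBEDDINGS: Dict[str, List[float]] = {
    "bacteria": [1.0, 0.0, 0.0],
    "virus": [0.0, 1.0, 0.0],
    "test": [0.0, 0.0, 1.0],
}
EMBEDDING_DIM = 3


def embed_reflection(text: str) -> List[float]:
    """Mean-pool the vectors of known words; return zeros if none are known."""
    vectors = [TOY_EMBEDDINGS[w] for w in text.lower().split() if w in TOY_EMBEDDINGS]
    if not vectors:
        return [0.0] * EMBEDDING_DIM
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def build_features(trace_counts: List[float], reflections: List[str]) -> List[float]:
    """Concatenate trace features with the average reflection embedding."""
    pooled = [embed_reflection(r) for r in reflections]
    if pooled:
        avg = [sum(col) / len(pooled) for col in zip(*pooled)]
    else:
        avg = [0.0] * EMBEDDING_DIM
    return list(trace_counts) + avg


def features_up_to(events: List[dict], cutoff_minute: float,
                   action_types: List[str]) -> List[float]:
    """Build a feature vector from only the events seen by `cutoff_minute`.

    Calling this with increasing cutoffs mimics evaluating early prediction
    on incremental segments of a student's interaction.
    """
    seen = [e for e in events if e["minute"] <= cutoff_minute]
    counts = [float(sum(1 for e in seen if e.get("action") == a)) for a in action_types]
    reflections = [e["reflection"] for e in seen if "reflection" in e]
    return build_features(counts, reflections)


# Hypothetical event log for one student (assumed format, not from the study).
events = [
    {"minute": 5, "action": "read_book"},
    {"minute": 12, "reflection": "I think bacteria"},
    {"minute": 20, "action": "run_test"},
    {"minute": 30, "reflection": "the test shows virus"},
]
ACTIONS = ["read_book", "run_test"]

early = features_up_to(events, 15, ACTIONS)
full = features_up_to(events, 30, ACTIONS)
print(early)
print(full)
```

Each resulting vector could then be passed to any standard classifier or regressor. Omitting the embedding portion of the vector gives the kind of trace-only baseline the abstract compares against.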