Time- and Learner-Dependent Hidden Markov Model for Writing Process Analysis Using Keystroke Log Data

Publication Information


  • Masaki Uto, The University of Electro-Communications
  • Yoshimitsu Miyazawa, The National Center for University Entrance Examinations
  • Yoshihiro Kato, Benesse Educational Research and Development Institute
  • Koji Nakajima, Benesse Educational Research and Development Institute
  • Hajime Kuwata, Benesse Educational Research and Development Institute


  • 271-298


  • Writing skills, Writing process, Keystroke log, Hidden Markov model, Markov chain Monte Carlo method


  • Teaching writing strategies based on writing processes has attracted wide attention as a method for developing writing skills. The writing process can generally be defined as a sequence of subtasks, such as planning, formulation, and revision, so instructor feedback is often based on the sequence patterns of those subtasks. To give such feedback, instructors need to analyze the sequence patterns of every learner, which becomes impractical as the number of learners increases. To resolve this problem, this study proposes a new machine-learning method that estimates sequence patterns from keystroke log data. Specifically, we propose an extension of the Gaussian hidden Markov model that incorporates parameters representing the temporal change in the subtask appearance distribution for each learner. Furthermore, we propose a collapsed Gibbs sampling algorithm as the parameter estimation method for the proposed model. We demonstrate the effectiveness of the proposed model by applying it to actual keystroke log datasets.
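
For readers unfamiliar with the baseline that the abstract extends, the following is a minimal sketch of a standard Gaussian hidden Markov model that labels a sequence of observations with hidden states. It is not the proposed time- and learner-dependent model, and it uses Viterbi decoding with fixed, hand-picked parameters rather than the collapsed Gibbs sampling described above. The single numeric feature, the three states (which could be read as planning, formulation, and revision), and all parameter values are assumptions made purely for illustration.

```python
import numpy as np


def viterbi_gaussian_hmm(obs, log_pi, log_A, means, stds):
    """Most likely hidden-state path for a 1-D Gaussian HMM (standard Viterbi)."""
    obs = np.asarray(obs, dtype=float)
    means = np.asarray(means, dtype=float)
    stds = np.asarray(stds, dtype=float)
    T, K = len(obs), len(means)

    # Log-density of every observation under every state's Gaussian, shape (T, K).
    log_b = (-0.5 * np.log(2 * np.pi * stds ** 2)
             - 0.5 * ((obs[:, None] - means) / stds) ** 2)

    delta = np.empty((T, K))            # best log-score of a path ending in each state
    back = np.zeros((T, K), dtype=int)  # best predecessor of each state at each step
    delta[0] = log_pi + log_b[0]
    for t in range(1, T):
        # scores[i, j]: best path ending in state i at t-1, then moving to state j.
        scores = delta[t - 1][:, None] + log_A
        back[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + log_b[t]

    # Trace the best path backwards from the best final state.
    path = np.empty(T, dtype=int)
    path[-1] = delta[-1].argmax()
    for t in range(T - 2, -1, -1):
        path[t] = back[t + 1, path[t + 1]]
    return path


# Hypothetical toy setup: 3 hidden states with well-separated emission means.
log_pi = np.log(np.full(3, 1 / 3))             # uniform initial distribution
log_A = np.log(np.array([[0.8, 0.1, 0.1],      # "sticky" transitions: states persist
                         [0.1, 0.8, 0.1],
                         [0.1, 0.1, 0.8]]))
means = np.array([0.0, 5.0, 10.0])
stds = np.array([1.0, 1.0, 1.0])

obs = [0.1, -0.2, 5.1, 4.9, 10.2, 9.8]         # made-up feature values, not real log data
path = viterbi_gaussian_hmm(obs, log_pi, log_A, means, stds)
print(path.tolist())  # [0, 0, 1, 1, 2, 2]
```

In this baseline the transition matrix is shared by all sequences and fixed over time. The abstract's proposal differs precisely on that point: it adds parameters that let the subtask appearance distribution change over time and vary by learner.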