Applying language models to automatically check students’ open-ended answers
Abstract
Automatic grading of short open-ended student answers simplifies teachers' work and allows for quick and effective assessment. The goal of this study is to compare methods for classifying short Russian-language answers by assessment category. The authors analyzed the application of neural network language models and machine learning methods. Each answer is evaluated against a reference answer and assigned to one of two classes (correct/incorrect) or one of three (correct/partially correct/incorrect). For the experiments, the authors collected four corpora of answers to questions from various disciplines and subject areas: a corpus of general questions on IT disciplines and higher mathematics, a corpus of questions on databases, a corpus of questions on history, and a corpus of questions on Qt development. In experiments on these texts, 11 pre-trained language models, 2 training methods, 2 methods of splitting the data into training and test sets, and 7 classifiers were compared in order to analyze various methods of vector representation and classification of Russian-language texts. Analysis of the binary classification results showed that no single model + classifier pair consistently outperforms the others across all corpora. BERT models combined with a centroid classifier, logistic regression, or a multilayer perceptron achieved an F-measure above 0.9. For ternary classification, the best combinations were the rugpt3m, MiniLM-L12, and rubert-tiny2 models paired with categorical boosting and a centroid classifier, with an F-measure of 0.58. Augmentation based on rules for recombining real data improved the F-measure to 0.96 for binary classification and to 0.91 for ternary classification. Error analysis revealed that the main difficulty is separating completely correct answers from partially correct ones. Based on the experimental results, a software system for conducting student assessments was developed and published.
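To make the centroid classifier and F-measure mentioned in the abstract concrete, the following is a minimal sketch in plain Python. It is not the authors' implementation: the three-dimensional vectors are hypothetical stand-ins for the output of a language model, the use of cosine similarity is an assumption, and the function names (`centroid_fit`, `centroid_predict`, `f_measure`) are invented for illustration. The abstract does not say how the student answer and the reference answer are combined into one vector, so that step is left out.

```python
import math
from collections import defaultdict


def centroid_fit(vectors, labels):
    """Average the vectors of each class into a single centroid per class."""
    sums, counts = {}, defaultdict(int)
    for vec, label in zip(vectors, labels):
        if label not in sums:
            sums[label] = [0.0] * len(vec)
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    return {label: [s / counts[label] for s in total]
            for label, total in sums.items()}


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def centroid_predict(centroids, vec):
    """Return the label whose centroid is most similar to vec (assumed metric)."""
    return max(centroids, key=lambda label: cosine(centroids[label], vec))


def f_measure(y_true, y_pred, positive):
    """F1 score for one class treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: each vector stands in for an embedded answer.
train_vectors = [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1],
                 [0.1, 0.9, 0.2], [0.0, 0.8, 0.3]]
train_labels = ["correct", "correct", "incorrect", "incorrect"]

test_vectors = [[0.7, 0.3, 0.0], [0.2, 0.7, 0.1], [0.4, 0.7, 0.0]]
test_labels = ["correct", "incorrect", "correct"]

centroids = centroid_fit(train_vectors, train_labels)
predictions = [centroid_predict(centroids, v) for v in test_vectors]
score = f_measure(test_labels, predictions, positive="correct")
print(predictions, round(score, 3))
```

The third test answer is deliberately placed nearer the "incorrect" centroid although it is labelled correct, so the sketch produces one error and an F-measure below 1. The ternary setting would use the same code with a third label and an averaged per-class F-measure.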
Edition
Proceedings of the Institute for System Programming, vol. 38, issue 3, part 2, 2026, pp. 197-214
ISSN 2220-6426 (Online), ISSN 2079-8156 (Print).
DOI: 10.15514/ISPRAS-2026-38(3)-30
Full text of the paper in pdf (in Russian)