Subword-level grammatical error correction: a universal approach

Khabutdinov I.A. (NRU MIPT, Dolgoprudny, Moscow Region, Russia)
Grabovoy A.V. (NRU MIPT, Dolgoprudny, Moscow Region, Russia; ICS RAS, Moscow, Russia)
Chekhovich Yu.V. (ICS RAS, Moscow, Russia)
Kildyakov A.S. (ICS RAS, Moscow, Russia)
Ivakhnenko A.A. (ICS RAS, Moscow, Russia)

Abstract

In this study, we propose a fully automatic methodology for data generation, correction-rule vocabulary construction, and sequence tagging model training that targets grammatical error correction (GEC). Our approach operates at the SentencePiece subword level and uses four basic transformations (keep, append, replace, and delete) that apply across languages, which eliminates the need for grammar-specific operations. Because the ground-truth corrections and edit scripts are generated with the Levenshtein algorithm, the dataset generation process is identical for every language. We applied our method to the sequence tagging model GECToR and achieved comparable quality for English, with F0.5 scores of 62.4 on the CoNLL-2014 test set and 61.9 on the BEA-2019 test set, without manual rule design or manual annotation of error spans or types. The results indicate that subword-level universal edits can provide a practical alternative to grammar-specific operations while requiring only parallel correction data.
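The tagging idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the subword pieces below are written by hand rather than produced by a SentencePiece model, the function names `levenshtein_ops`, `edit_tags`, and `apply_tags` are hypothetical, and the tag spelling (`$KEEP`, `$DELETE`, `$APPEND_x`, `$REPLACE_x`, plus a leading `$START` token) is assumed to follow the GECToR convention. The sketch aligns a source and a corrected token sequence with a Levenshtein backtrace and attaches the resulting edits to source positions.

```python
from typing import List, Tuple

KEEP, DELETE = "$KEEP", "$DELETE"
START = "$START"  # dummy first token so that sentence-initial insertions have an anchor


def levenshtein_ops(src: List[str], tgt: List[str]) -> List[Tuple[str, int, int]]:
    """Return one minimal edit script as (op, src_index, tgt_index) tuples."""
    n, m = len(src), len(tgt)
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = i
    for j in range(1, m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if src[i - 1] == tgt[j - 1] else 1
            dist[i][j] = min(dist[i - 1][j] + 1,         # delete src[i-1]
                             dist[i][j - 1] + 1,         # insert tgt[j-1]
                             dist[i - 1][j - 1] + cost)  # keep or replace
    # Walk back from the bottom-right corner to recover the operations.
    ops = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and src[i - 1] == tgt[j - 1] and dist[i][j] == dist[i - 1][j - 1]:
            ops.append(("keep", i - 1, j - 1))
            i, j = i - 1, j - 1
        elif i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + 1:
            ops.append(("replace", i - 1, j - 1))
            i, j = i - 1, j - 1
        elif i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            ops.append(("delete", i - 1, j))
            i -= 1
        else:
            # Insertion placed after the first i source tokens.
            ops.append(("insert", i, j - 1))
            j -= 1
    ops.reverse()
    return ops


def edit_tags(src: List[str], tgt: List[str]) -> Tuple[List[str], List[List[str]]]:
    """Attach edits to source positions; position 0 is the $START token."""
    tokens = [START] + src
    tags: List[List[str]] = [[] for _ in tokens]
    tags[0].append(KEEP)
    for op, i, j in levenshtein_ops(src, tgt):
        if op == "keep":
            tags[i + 1].append(KEEP)
        elif op == "delete":
            tags[i + 1].append(DELETE)
        elif op == "replace":
            tags[i + 1].append("$REPLACE_" + tgt[j])
        else:  # insert: append after the token that precedes the insertion point
            tags[i].append("$APPEND_" + tgt[j])
    return tokens, tags


def apply_tags(tokens: List[str], tags: List[List[str]]) -> List[str]:
    """Replay the tags to rebuild the corrected sequence (sanity check)."""
    out: List[str] = []
    for pos, (tok, tok_tags) in enumerate(zip(tokens, tags)):
        for tag in tok_tags:
            if tag == KEEP:
                if pos > 0:  # the $START token itself is never emitted
                    out.append(tok)
            elif tag.startswith("$REPLACE_"):
                out.append(tag[len("$REPLACE_"):])
            elif tag.startswith("$APPEND_"):
                out.append(tag[len("$APPEND_"):])
            # $DELETE emits nothing
    return out


# Hand-written, hypothetical subword segmentation of "She go to school" -> "She goes to school".
source = ["▁She", "▁go", "▁to", "▁school"]
target = ["▁She", "▁go", "es", "▁to", "▁school"]
tokens, tags = edit_tags(source, target)
for tok, tok_tags in zip(tokens, tags):
    print(tok, tok_tags)
```

In this example the agreement error is expressed as a single `$APPEND_es` edit on the piece `▁go`, with no verb-specific rule involved. The sketch keeps a list of edits per position for readability; a tagger that predicts exactly one tag per token would need to spread multiple edits over several correction passes.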

Keywords

grammatical error correction; neural language processing; transformers; machine learning.

Edition

Proceedings of the Institute for System Programming, vol. 38, issue 3, part 1, 2026, pp. 187-196

ISSN 2220-6426 (Online), ISSN 2079-8156 (Print).

DOI: 10.15514/ISPRAS-2026-38(3)-11

For citation

Khabutdinov I.A., Grabovoy A.V., Chekhovich Yu.V., Kildyakov A.S., Ivakhnenko A.A. Subword-level grammatical error correction: a universal approach. Proceedings of the Institute for System Programming, vol. 38, issue 3, part 1, 2026, pp. 187-196. DOI: 10.15514/ISPRAS-2026-38(3)-11.

