An Approach to LLM-based Code Completion for Python

Volkov M.V. (SPbU, St Petersburg, Russia)
Bozhnyuk A.S. (SPbU, St Petersburg, Russia)
Vasina D.V. (ITMO University, St Petersburg, Russia)
Vasilyev V.A. (DSU, Dubna, Moscow Region, Russia)
Tropin N.V. (SPbU, St Petersburg, Russia)
Nikitin M.B. (ITMO University, St Petersburg, Russia)
Koznov D.V. (SPbU, St Petersburg, Russia)

Abstract

Code completion is a critical feature of integrated development environments (IDEs) that boosts developer productivity by suggesting relevant code snippets. Large Language Models (LLMs) are a promising way to implement this functionality. However, existing LLMs have difficulty handling project-wide context. This paper introduces a novel approach to Repository-Level Code Completion for Python that uses the IDE Code Model and Retrieval-Augmented Generation (RAG). The method constructs an LLM prompt from two components: the In-File Context, which integrates type information and relevant snippets from the current file, and the Repository Context, which retrieves pertinent code from the broader project via specialized retrievers. Experiments on the RepoEval-Updated benchmark with two LLMs (CodeGemma-2b and Qwen-2.5-Coder-14b) showed significant improvements: Exact Match (EM) improved by up to 30% for CodeGemma-2b and 7% for Qwen-2.5-Coder-14b compared with the state-of-the-art approaches No-RAG, Shifted-RAG, and GraphCoder. Ablation studies confirmed the critical role of the In-File Context and the advantage of combining retrievers, especially for API-Level tasks. These findings demonstrate the value of incorporating Python-specific code semantics into LLM-based completion systems. The method’s generalizability across LLM sizes suggests broad applicability.
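The two-component prompt and the Exact Match metric mentioned in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify the prompt layout, the retrievers, or how EM normalizes text, so everything below is an assumption made for illustration: the names `Snippet`, `build_prompt`, and `exact_match` are hypothetical, the ordering (repository snippets, then in-file context, then the code prefix) is one plausible arrangement, and comparing whitespace-stripped strings is only one common way to compute EM.

```python
from dataclasses import dataclass


@dataclass
class Snippet:
    """A retrieved piece of code with its origin path (hypothetical type)."""
    path: str
    code: str


def _as_comment(text: str) -> str:
    """Prefix every line with '# ' so context cannot be mistaken for live code."""
    return "\n".join("# " + line for line in text.splitlines())


def build_prompt(in_file_context: str, repo_snippets: list[Snippet], prefix: str) -> str:
    """Assemble a completion prompt (assumed layout, not the paper's format).

    Order: Repository Context, then In-File Context, then the code
    immediately preceding the cursor.
    """
    parts = []
    for snippet in repo_snippets:
        parts.append(f"# From {snippet.path}:\n{_as_comment(snippet.code)}")
    if in_file_context:
        parts.append(_as_comment(in_file_context))
    parts.append(prefix)
    return "\n".join(parts)


def exact_match(prediction: str, reference: str) -> bool:
    """Assumed EM definition: equality after stripping surrounding whitespace."""
    return prediction.strip() == reference.strip()


prompt = build_prompt(
    in_file_context="x: int",
    repo_snippets=[Snippet("util.py", "def f():\n    return 1")],
    prefix="y = f(",
)
```

In this sketch the retrieved material is commented out so that the model sees it as reference text, while the final line of the prompt is the code to be completed.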

Keywords

Repository-Level Code Completion; Large Language Model (LLM); Retrieval Augmented Generation (RAG); Integrated Development Environment (IDE); Code Model.

Edition

Proceedings of the Institute for System Programming, vol. 38, issue 3, part 1, 2026, pp. 153-170

ISSN 2220-6426 (Online), ISSN 2079-8156 (Print).

DOI: 10.15514/ISPRAS-2026-38(3)-9

For citation

Volkov M.V., Bozhnyuk A.S., Vasina D.V., Vasilyev V.A., Tropin N.V., Nikitin M.B., Koznov D.V. An Approach to LLM-based Code Completion for Python. Proceedings of the Institute for System Programming, vol. 38, issue 3, part 1, 2026, pp. 153-170. DOI: 10.15514/ISPRAS-2026-38(3)-9.
