An Approach to LLM-based Code Completion for Python
Abstract
Code completion is a critical feature of integrated development environments (IDEs) that boosts developer productivity by suggesting relevant code snippets. Large Language Models (LLMs) are a promising way to implement this functionality. However, existing LLMs have difficulty handling project-wide context. This paper introduces a novel approach to repository-level code completion for Python that uses the IDE Code Model and retrieval-augmented generation (RAG). The method constructs an LLM prompt from two components: the In-File Context, which integrates type information and relevant snippets from the current file, and the Repository Context, which retrieves pertinent code from the broader project via specialized retrievers. Experiments on the RepoEval-Updated benchmark with two LLMs (CodeGemma-2b and Qwen-2.5-Coder-14b) showed significant improvements: Exact Match (EM) improved by up to 30% for CodeGemma-2b and by up to 7% for Qwen-2.5-Coder-14b compared with the state-of-the-art approaches No-RAG, Shifted-RAG, and GraphCoder. Ablation studies confirmed the critical role of the In-File Context and the advantage of combining retrievers, especially for API-level tasks. These findings demonstrate the value of incorporating Python-specific code semantics into LLM-based completion systems. The method’s generalizability across LLM sizes suggests broad applicability.
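To make the two-part prompt concrete, the following is a minimal sketch and not the paper's implementation. It assumes a toy retriever that ranks repository snippets by identifier overlap (Jaccard similarity) as a stand-in for the specialized retrievers, and it passes the In-File Context as a plain string rather than deriving it from the IDE Code Model. All names (`Snippet`, `retrieve`, `build_prompt`, `exact_match`) and the sample repository contents are hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass
class Snippet:
    path: str  # file the snippet was taken from
    code: str  # snippet text


def identifiers(text: str) -> set[str]:
    """Collect identifier-like tokens from a piece of code."""
    return set(re.findall(r"[A-Za-z_]\w*", text))


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def retrieve(query: str, snippets: list[Snippet], top_k: int = 1) -> list[Snippet]:
    """Toy retriever (an assumption): rank snippets by identifier overlap."""
    q = identifiers(query)
    ranked = sorted(
        snippets, key=lambda s: jaccard(q, identifiers(s.code)), reverse=True
    )
    return ranked[:top_k]


def as_comment(text: str) -> str:
    """Turn context text into Python comments so the prompt stays valid code."""
    return "\n".join(f"# {line}" for line in text.splitlines())


def build_prompt(repo_snippets: list[Snippet], in_file_context: str, prefix: str) -> str:
    """Assemble Repository Context, then In-File Context, then the code prefix."""
    repo_part = "\n".join(
        f"# From {s.path}:\n{as_comment(s.code)}" for s in repo_snippets
    )
    return "\n".join([repo_part, as_comment(in_file_context), prefix])


def exact_match(prediction: str, reference: str) -> bool:
    """Exact Match on a single completion, ignoring surrounding whitespace."""
    return prediction.strip() == reference.strip()


# Hypothetical repository and completion point.
repo = [
    Snippet("utils/io.py", "def load_config(path):\n    return parse(read(path))"),
    Snippet("models/user.py", "class User:\n    def full_name(self):\n        return self.first + self.last"),
]
prefix = "cfg = load_config("
hits = retrieve(prefix, repo)
prompt = build_prompt(hits, "config_path: str", prefix)
```

Placing the retrieved code and type hints in comments ahead of the unfinished line is one plausible layout; the actual prompt format, retrievers, and ordering used in the paper may differ.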
Edition
Proceedings of the Institute for System Programming, vol. 38, issue 3, part 1, 2026, pp. 153-170
ISSN 2220-6426 (Online), ISSN 2079-8156 (Print).
DOI: 10.15514/ISPRAS-2026-38(3)-9