Large Language Model-Powered Agent for C to Rust Code Translation
Large Language Model-Powered Agent for C to Rust Code Translation
The C programming language has been foundational in building system-level software. However, its manual memory management model frequently leads to memory safety issues. In response, a modern system programming language, Rust, has emerged as a memory-safe alternative. Moreover, automating the C-to-Rust translation empowered by the rapid advancements of the generative capabilities of LLMs is gaining growing interest for large volumes of legacy C code. Despite some success, existing LLM-based approaches have constrained the role of LLMs to static prompt-response behavior and have not explored their agentic problem-solving capability. Applying the LLM agentic capability for the C-to-Rust translation introduces distinct challenges, as this task differs from the traditional LLM agent applications, such as math or commonsense QA domains. First, the scarcity of parallel C-to-Rust datasets hinders the retrieval of suitable code translation exemplars for in-context learning. Second, unlike math or commonsense QA, the intermediate steps required for C-to-Rust are not well-defined. Third, it remains unclear how to organize and cascade these intermediate steps to construct a correct translation trajectory. To address these challenges in the C-to-Rust translation, we propose a novel intermediate step, the Virtual Fuzzing-based equivalence Test (VFT), and an agentic planning framework, the LLM-powered Agent for C-to-Rust code translation (LAC2R). The VFT guides LLMs to identify input arguments that induce divergent behaviors between an original C function and its Rust counterpart and to generate informative diagnoses to refine the unsafe Rust code. LAC2R uses the MCTS to systematically organize the LLM-induced intermediate steps for correct translation. We experimentally demonstrated that LAC2R effectively conducts C-to-Rust translation on large-scale, real-world benchmarks.
HoHyun Sim、Hyeonjoong Cho、Yeonghyeon Go、Zhoulai Fu、Ali Shokri、Binoy Ravindran
计算技术、计算机技术
HoHyun Sim,Hyeonjoong Cho,Yeonghyeon Go,Zhoulai Fu,Ali Shokri,Binoy Ravindran.Large Language Model-Powered Agent for C to Rust Code Translation[EB/OL].(2025-05-20)[2025-06-05].https://arxiv.org/abs/2505.15858.点此复制
评论