evalSmarT: An LLM-Based Framework for Evaluating Smart Contract Generated Comments
Smart contract comment generation has gained traction as a means to improve code comprehension and maintainability in blockchain systems. However, evaluating the quality of generated comments remains a challenge. Traditional metrics such as BLEU and ROUGE fail to capture domain-specific nuances, while human evaluation is costly and unscalable. In this paper, we present evalSmarT, a modular and extensible framework that leverages large language models (LLMs) as evaluators. The system supports over 400 evaluator configurations by combining approximately 40 LLMs with 10 prompting strategies. We demonstrate its application in benchmarking comment generation tools and selecting the most informative outputs. Our results show that prompt design significantly impacts alignment with human judgment, and that LLM-based evaluation offers a scalable and semantically rich alternative to existing methods.
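The evaluator space described above is combinatorial: each pairing of an LLM with a prompting strategy yields one evaluator configuration. A minimal sketch of how such a space could be enumerated follows; the model and strategy names are placeholders, not identifiers from the framework itself.

```python
from itertools import product

# Placeholder identifiers standing in for the ~40 LLMs and 10 prompting
# strategies mentioned in the abstract.
models = [f"llm_{i}" for i in range(40)]
strategies = [f"prompt_{j}" for j in range(10)]

# Each (model, strategy) pair defines one evaluator configuration.
configurations = list(product(models, strategies))
print(len(configurations))  # 400 configurations
```

This Cartesian-product structure is what allows the framework to scale to hundreds of evaluators without hand-writing each one.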
Fatou Ndiaye Mbodji
Computing Technology, Computer Technology
Fatou Ndiaye Mbodji. evalSmarT: An LLM-Based Framework for Evaluating Smart Contract Generated Comments [EB/OL]. (2025-07-28) [2025-08-10]. https://arxiv.org/abs/2507.20774.