|国家预印本平台
首页|Build It Clean: Large-Scale Detection of Code Smells in Build Scripts

Build It Clean: Large-Scale Detection of Code Smells in Build Scripts

Build It Clean: Large-Scale Detection of Code Smells in Build Scripts

来源:Arxiv_logoArxiv
英文摘要

Build scripts are files that automate the process of compiling source code, managing dependencies, running tests, and packaging software into deployable artifacts. These scripts are ubiquitous in modern software development pipelines for streamlining testing and delivery. While developing build scripts, practitioners may inadvertently introduce code smells. Code smells are recurring patterns of poor coding practices that may lead to build failures or increase risk and technical debt. The goal of this study is to aid practitioners in avoiding code smells in build scripts through an empirical study of build scripts and issues on GitHub. We employed a mixed-methods approach, combining qualitative and quantitative analysis. We conducted a qualitative analysis of 2000 build-script-related GitHub issues. Next, we developed a static analysis tool, Sniffer, to identify code smells in 5882 build scripts of Maven, Gradle, CMake, and Make files, collected from 4877 open-source GitHub repositories. We identified 13 code smell categories, with a total of 10,895 smell occurrences, where 3184 were in Maven, 1214 in Gradle, 337 in CMake, and 6160 in Makefiles. Our analysis revealed that Insecure URLs were the most prevalent code smell in Maven build scripts, while Hardcoded Paths/URLs were commonly observed in both Gradle and CMake scripts. Wildcard Usage emerged as the most frequent smell in Makefiles. The co-occurrence analysis revealed strong associations between specific smell pairs of Hardcoded Paths/URLs with Duplicates, and Inconsistent Dependency Management with Empty or Incomplete Tags, indicating potential underlying issues in the build script structure and maintenance practices. Based on our findings, we recommend strategies to mitigate the existence of code smells in build scripts to improve the efficiency, reliability, and maintainability of software projects.

Mahzabin Tamanna、Yash Chandrani、Matthew Burrows、Brandon Wroblewski、Laurie Williams、Dominik Wermke

计算技术、计算机技术

Mahzabin Tamanna,Yash Chandrani,Matthew Burrows,Brandon Wroblewski,Laurie Williams,Dominik Wermke.Build It Clean: Large-Scale Detection of Code Smells in Build Scripts[EB/OL].(2025-06-22)[2025-07-19].https://arxiv.org/abs/2506.17948.点此复制

评论