GitHub Proxy Server: A tool for supporting massive data collection on GitHub

Source: arXiv
Abstract

GitHub is the most popular social coding platform and is widely used by developers and organizations around the world to host their open-source projects. The platform also provides a web API that allows developers to collect information from the public repositories it hosts. However, collecting massive amounts of data from GitHub can be very challenging because of its rate restrictions and abuse detection mechanisms. In this work, we present a tool, called GitHub Proxy Server, that abstracts these complexities away and is independent of operating system and programming language. We show that, using the proposed tool, it is possible to improve the performance of GitHub mining tasks without introducing additional complexity.
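
The abstract does not describe how the tool handles GitHub's restrictions internally. As an illustration only, one common way to work within per-token rate limits is to rotate requests across a pool of access tokens, choosing whichever token has the most quota left. The sketch below assumes this technique; the `TokenPool` and `TokenState` names and the starting quota of 5000 are hypothetical and are not taken from the paper. It relies only on the `X-RateLimit-Remaining` and `X-RateLimit-Reset` response headers that the GitHub REST API returns.

```python
import time
from dataclasses import dataclass


@dataclass
class TokenState:
    """Quota bookkeeping for one access token (hypothetical structure)."""
    token: str
    remaining: int = 5000   # assumed starting quota, corrected by update()
    reset_at: float = 0.0   # Unix time at which the quota window resets


class TokenPool:
    """Illustrative token rotation; not the authors' implementation."""

    def __init__(self, tokens):
        self._states = {t: TokenState(t) for t in tokens}

    def update(self, token, headers):
        # Record the quota reported by the rate-limit headers of a response.
        # `headers` is treated as a plain dict with exact-case keys.
        state = self._states[token]
        state.remaining = int(headers.get("X-RateLimit-Remaining", state.remaining))
        state.reset_at = float(headers.get("X-RateLimit-Reset", state.reset_at))

    def pick(self, now=None):
        # A token is usable if it has quota left or its window has reset.
        now = time.time() if now is None else now
        usable = [s for s in self._states.values()
                  if s.remaining > 0 or now >= s.reset_at]
        if not usable:
            return None  # every token is exhausted; the caller should wait
        return max(usable, key=lambda s: s.remaining).token
```

A proxy built this way would call `pick()` before forwarding each request and `update()` after each response, so clients never need to track quotas themselves.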

Hudson Silva Borges, Marco Tulio Valente

10.1145/3555228.3555276

Computing technology, computer technology

Hudson Silva Borges, Marco Tulio Valente. GitHub Proxy Server: A tool for supporting massive data collection on GitHub [EB/OL]. (2025-05-23) [2025-06-14]. https://arxiv.org/abs/2505.18305.
