Exploring Efficient Learning of Small BERT Networks with LoRA and DoRA

Source: arXiv
Abstract

While Large Language Models (LLMs) have revolutionized artificial intelligence, fine-tuning LLMs is extraordinarily computationally expensive, preventing smaller businesses and research teams with limited GPU resources from engaging with new research. Hu et al. and Liu et al. introduce Low-Rank Adaptation (LoRA) and Weight-Decomposed Low-Rank Adaptation (DoRA) as highly efficient and performant solutions to the computational challenges of LLM fine-tuning, demonstrating large speedups and memory savings for models such as GPT-3 and RoBERTa. We expand upon the original LoRA and DoRA papers by benchmarking the efficiency and performance of LoRA and DoRA when applied to a much smaller scale of language model: our case study here is the compact minBERT model. Our findings reveal that optimal custom configurations of LoRA and DoRA, coupled with Automatic Mixed Precision (AMP), significantly enhance training efficiency without compromising performance. Furthermore, although the parameterization of minBERT is significantly smaller than that of GPT-3, our results validate the observation that gradient updates to language models are inherently low-rank even in small model space: rank-1 decompositions yield negligible performance deficits. Finally, aided by our highly efficient minBERT implementation, we investigate numerous architectures, custom loss functions, and hyperparameters to ultimately train an optimal ensembled multitask minBERT model that simultaneously performs sentiment analysis, paraphrase detection, and similarity scoring.
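
The abstract names the adapter mechanism but does not spell it out, so a minimal PyTorch sketch of a LoRA-style linear layer follows. This is an illustration under assumptions, not the authors' minBERT code: the class name LoRALinear and the defaults for the rank r and scaling alpha are hypothetical choices. Setting r=1 corresponds to the rank-1 decompositions whose negligible performance deficit the abstract reports; DoRA additionally decomposes the frozen weight into a magnitude and a direction before applying the low-rank update (not shown here).

# Minimal sketch (assumed, not from the paper) of a LoRA-style adapter around a
# frozen nn.Linear: the pretrained weight W stays fixed and only the low-rank
# update (alpha/r) * B @ A is trained.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, r: int = 1, alpha: float = 8.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False                                     # freeze pretrained weights
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)  # r x d_in
        self.B = nn.Parameter(torch.zeros(base.out_features, r))        # d_out x r, zero-init so the update starts at 0
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # W x + (alpha/r) * B (A x)
        return self.base(x) + self.scale * ((x @ self.A.T) @ self.B.T)

# Usage example: wrap a BERT-sized projection with a rank-1 adapter.
layer = LoRALinear(nn.Linear(768, 768), r=1)
out = layer(torch.randn(4, 128, 768))  # trainable params: 2 * 768 instead of 768 * 768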

Daniel Frees, Aditri Bhagirath, Moritz Bolling

Subjects: Computing Technology; Computer Technology

Daniel Frees, Aditri Bhagirath, Moritz Bolling. Exploring Efficient Learning of Small BERT Networks with LoRA and DoRA [EB/OL]. (2025-08-25) [2025-09-06]. https://arxiv.org/abs/2508.17586.
