
Langformers: Unified NLP Pipelines for Language Models

Source: arXiv
Abstract

Transformer-based language models have revolutionized the field of natural language processing (NLP). However, using these models often involves navigating multiple frameworks and tools, as well as writing repetitive boilerplate code. This complexity can discourage non-programmers and beginners, and even slow down prototyping for experienced developers. To address these challenges, we introduce Langformers, an open-source Python library designed to streamline NLP pipelines through a unified, factory-based interface for large language model (LLM) and masked language model (MLM) tasks. Langformers integrates conversational AI, MLM pretraining, text classification, sentence embedding/reranking, data labelling, semantic search, and knowledge distillation into a cohesive API, supporting popular platforms such as Hugging Face and Ollama. Key innovations include: (1) task-specific factories that abstract training, inference, and deployment complexities; (2) built-in memory and streaming for conversational agents; and (3) lightweight, modular design that prioritizes ease of use. Documentation: https://langformers.com
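
The abstract describes a factory-based interface in which each NLP task is created through a single entry point. The sketch below illustrates how such a factory might be used for the conversational-AI task; the module, function, and parameter names (langformers.tasks, create_generator, run, the Ollama model tag) are assumptions inferred from the abstract and the project documentation at https://langformers.com, not a verified reproduction of the library's API.

# Minimal sketch of the factory-based usage pattern described in the abstract.
# NOTE: function and parameter names are assumptions drawn from the abstract
# and project documentation, not a verified copy of the Langformers API.
from langformers import tasks

# A task-specific factory is meant to hide model loading, conversation memory,
# and streaming details. Here an Ollama-served model is assumed as the backend.
generator = tasks.create_generator(provider="ollama", model_name="llama3.1:8b")

# Launch the conversational agent locally (assumed host/port parameters).
generator.run(host="0.0.0.0", port=8000)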

Rabindra Lamsal, Maria Rodriguez Read, Shanika Karunasekera

Subjects: Computing Technology, Computer Technology

Rabindra Lamsal, Maria Rodriguez Read, Shanika Karunasekera. Langformers: Unified NLP Pipelines for Language Models [EB/OL]. (2025-04-12) [2025-05-08]. https://arxiv.org/abs/2504.09170
