National Preprint Platform
An Automated Element Extraction Method Based on Element Semantics

Abstract

With the widespread adoption of Robotic Process Automation (RPA), element extraction, a core step in automated workflow design, directly affects the efficiency and accuracy of process execution. Traditional element extraction methods rely on manual configuration or rule-based template matching; they struggle with complex, changing interface elements and offer a low degree of automation. This paper proposes an automated element extraction method based on element semantics. It combines object detection, image classification, text recognition, and multimodal vision-language models to generate semantic descriptions of interface elements, and it includes an intelligent strategy for matching and filtering elements. Experimental results show that, in complex scenarios, the method achieves significantly better extraction accuracy and robustness than existing multimodal extraction methods, and that it can effectively raise the level of intelligence of RPA workflow design.
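The abstract does not spell out how semantic descriptions are matched against a target, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes the detection, classification, text recognition, and vision-language stages have already run and produced one record per interface element. The names `ElementSemantics`, `score`, and `pick_element`, the Jaccard token overlap used as the similarity measure, and the `min_score` threshold are all hypothetical choices made for illustration.

```python
import re
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass
class ElementSemantics:
    """Hypothetical record holding the per-element outputs of the upstream stages."""
    box: Tuple[int, int, int, int]  # bounding box from object detection (x1, y1, x2, y2)
    kind: str                       # label from image classification, e.g. "button"
    text: str                       # text from text recognition, may be empty
    caption: str                    # description from a vision-language model

    def description(self) -> str:
        # Merge the non-empty fields into one semantic description string.
        return " ".join(part for part in (self.kind, self.text, self.caption) if part)


def tokenize(s: str) -> set:
    # Lowercase and keep alphanumeric word tokens only.
    return set(re.findall(r"[a-z0-9]+", s.lower()))


def score(query: str, element: ElementSemantics) -> float:
    # Jaccard overlap between query tokens and description tokens
    # (an assumed stand-in for whatever matching the paper actually uses).
    q, d = tokenize(query), tokenize(element.description())
    if not q or not d:
        return 0.0
    return len(q & d) / len(q | d)


def pick_element(query: str,
                 elements: Sequence[ElementSemantics],
                 min_score: float = 0.2) -> Optional[ElementSemantics]:
    # Return the best-scoring element, or None if nothing clears the threshold.
    best = max(elements, key=lambda e: score(query, e), default=None)
    if best is None or score(query, best) < min_score:
        return None
    return best
```

In this sketch, a query such as "submit button" would be compared against each element's merged description, and the filtering step simply rejects candidates whose overlap falls below the threshold. A real system would likely use a learned text or multimodal similarity in place of token overlap.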

Li Wensheng (李文生), Pan Hai (潘海)

Automation Technology; Automation Equipment

Artificial Intelligence; Process Automation; Element Extraction; Multimodal Models

Li Wensheng, Pan Hai. An Automated Element Extraction Method Based on Element Semantics [EB/OL]. (2025-02-24) [2025-08-23]. http://www.paper.edu.cn/releasepaper/content/202502-81.
