
Building Entity Association Mining Framework for Knowledge Discovery


Source: arXiv
Abstract

Extracting useful signals or patterns from unstructured text to support important business decisions, for example analyzing investment product traction, discovering customer preferences, or monitoring risk, is a challenging task. Capturing the interactions of entities or concepts and mining their associations is a crucial component of text mining, enabling information extraction, reasoning over text, and knowledge discovery from it. Furthermore, it can be used to enrich or filter knowledge graphs to guide exploration processes, support descriptive analytics, and uncover hidden stories in the text. In this paper, we introduce a domain-independent pipeline, i.e., a generalized framework, that enables document filtering, entity extraction using various sources (or techniques) as plug-ins, and association mining to build any text mining business use case, and we quantitatively define a scoring metric for ranking purposes. The proposed framework has three major components: a) Document filtering: filtering documents or text of interest from a massive amount of text; b) Configurable entity extraction pipeline: includes entity extraction techniques, i.e., i) DBpedia Spotlight, ii) spaCy NER, iii) Custom Entity Matcher, iv) phrase-extraction (or dictionary) based; c) Association relationship mining: generates a co-occurrence graph to analyse potential relationships among entities and concepts. Further, frequency statistics based on co-occurrence counts provide a holistic window to observe association trends or buzz rate in a specific business context. The paper demonstrates the usage of the framework as a fundamental building block in two financial use cases, namely brand product discovery and vendor risk monitoring. We hope that such a framework will remove duplicated effort, minimize development effort, and encourage reusability and rapid prototyping in association mining business applications for institutions.
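The three components described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the entity dictionary, and the sample documents are all hypothetical, a plain dictionary matcher stands in for the pluggable extractors (DBpedia Spotlight, spaCy NER, and so on), and raw document-level co-occurrence counts stand in for the paper's scoring metric, which the abstract does not specify.

```python
from collections import Counter
from itertools import combinations

# Hypothetical entity dictionary: a stand-in for the phrase/dictionary-based
# extractor. A real pipeline could plug in other extractors here.
ENTITY_DICTIONARY = {"acme bank", "globex", "credit card", "data breach"}


def filter_documents(documents, keywords):
    """Keep documents that mention at least one keyword (case-insensitive)."""
    lowered = [k.lower() for k in keywords]
    return [d for d in documents if any(k in d.lower() for k in lowered)]


def extract_entities(text, dictionary=ENTITY_DICTIONARY):
    """Return the set of dictionary entries found in the text as substrings."""
    lowered = text.lower()
    return {e for e in dictionary if e in lowered}


def cooccurrence_counts(documents):
    """Count, per unordered entity pair, how many documents contain both."""
    counts = Counter()
    for doc in documents:
        # Sorting gives each pair a canonical (alphabetical) order.
        entities = sorted(extract_entities(doc))
        counts.update(combinations(entities, 2))
    return counts


# Made-up sample documents, for illustration only.
docs = [
    "Acme Bank launched a new credit card.",
    "Globex reported a data breach affecting Acme Bank.",
    "Acme Bank credit card users were hit by a data breach.",
    "Weather was sunny today.",
]

relevant = filter_documents(docs, ["Acme Bank"])
counts = cooccurrence_counts(relevant)
ranked = counts.most_common()  # entity pairs ordered by co-occurrence count
```

Each key in `counts` is an edge of the co-occurrence graph and its value is the edge weight, so `ranked` gives a simple frequency-based ordering of candidate associations. Counting at the document level is one design choice among several; sentence-level or sliding-window counting would be equally plausible under the abstract's description.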

Anshika Rawal, Abhijeet Kumar, Mridul Mishra

Finance and banking; computing technology; computer technology

Anshika Rawal, Abhijeet Kumar, Mridul Mishra. Building Entity Association Mining Framework for Knowledge Discovery [EB/OL]. (2025-06-02) [2025-06-17]. https://arxiv.org/abs/2506.01451.
