Research on Identification, Archiving and Preservation of Enterprise Structured Business Data Based on Master Data Management
Enterprise digital transformation requires managing data across its whole life cycle. Data appraisal and archiving is an important means of addressing the difficulty that big data technologies have in directly processing traditional unstructured documents. Building on enterprise data governance, this paper introduces master data management into data archiving and divides enterprise data into three categories: master data, transaction data, and analytical data. The macro-appraisal method is applied to these three categories to determine the scope of data archiving, and ER diagrams, data dictionaries, and data lineage graphs are included in the archiving scope as metadata. Integrating an archived-data sub-lake into the construction of the enterprise data lake is proposed as the best path for preserving archived data. Archives departments can accelerate their integration into the national big data strategy by implementing a "dual system" for archiving both electronic records and data, piloting data archiving first in large state-owned enterprises, and improving the data literacy of archives staff so that they actively participate in data governance.
王茜、魏楠、李泽锋、马雯
Computing technology, computer technology; Automation technology; Economics; Economic planning, economic management
Master data; Data archiving; Data appraisal; Data lake
王茜,魏楠,李泽锋,马雯.基于主数据的企业结构化数据鉴定、归档与保存研究[EB/OL].(2023-08-14)[2025-08-02].https://chinaxiv.org/abs/202308.00151.