National Preprint Platform

Patent Technology Topic Mining and Trend Analysis from the Perspective of the Industrial Chain, Combining K-means and LDA: Taking Virtual Reality Technology as an Example

Abstract

[Purpose/significance] From an industrial chain perspective and taking virtual reality (VR) technology as an example, this paper constructs a VR patent industrial chain corpus and mines the technology topics, research and development hotspots, and future development trends of Chinese VR patents. [Method/process] First, Python is used to crawl patent texts in the VR field, and an effective corpus is obtained through data cleaning. Second, IPC classification codes are combined with the K-means clustering algorithm to construct and validate the VR patent industrial chain. Finally, based on the TF-IDF algorithm and the LDA topic model, the core technology topics of Chinese VR patents and their comprehensive strength, the technology research and development hotspots, and the future trends are identified from the industrial chain perspective. [Result/conclusion] At present, patents are unevenly distributed across the links of China's VR industrial chain: upstream research and development is the most active, followed by downstream applications, and midstream production is the weakest. In terms of topic mining, the upstream hotspot is software development, the midstream hotspot is film and television production, and the downstream hotspots are medical, educational, and entertainment applications. In terms of future trends, the upstream of the industrial chain will be dominated by technologies such as electric digital data processing, optical elements, and image communication; the midstream by technologies such as vehicle components, power units, and vibration-damping devices; and the downstream by technologies such as indoor games, medical diagnosis, and identification.
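The clustering step named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy tokenised "patent abstracts", the function names `tfidf` and `kmeans`, the smoothed IDF variant, and the fixed initial centroids are all assumptions made for illustration, and the IPC-based validation and the LDA topic-modelling step are not reproduced here.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return (vocab, vectors): L2-normalised TF-IDF rows for tokenised docs."""
    vocab = sorted({tok for doc in docs for tok in doc})
    n = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(tok for doc in docs for tok in set(doc))
    # Smoothed IDF (one common variant; assumed, not taken from the paper).
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1.0 for t in vocab}
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        row = [tf[t] * idf[t] for t in vocab]
        norm = math.sqrt(sum(x * x for x in row)) or 1.0
        vectors.append([x / norm for x in row])
    return vocab, vectors


def kmeans(vectors, init_indices, iters=20):
    """Plain Lloyd's K-means; initial centroids are the rows at init_indices."""
    centroids = [list(vectors[i]) for i in init_indices]
    labels = [0] * len(vectors)
    for _ in range(iters):
        # Assignment step: each vector goes to its nearest centroid.
        labels = [
            min(
                range(len(centroids)),
                key=lambda k: sum((a - b) ** 2 for a, b in zip(v, centroids[k])),
            )
            for v in vectors
        ]
        # Update step: each centroid becomes the mean of its members.
        for k in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == k]
            if members:
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical, already-tokenised toy documents (not data from the paper).
docs = [
    ["optical", "lens", "display", "headset"],
    ["optical", "lens", "headset", "tracking"],
    ["game", "indoor", "entertainment", "headset"],
    ["game", "entertainment", "training", "indoor"],
]
vocab, vectors = tfidf(docs)
labels = kmeans(vectors, init_indices=[0, 2])
print(labels)
```

On this toy input the two hardware-oriented documents fall into one cluster and the two application-oriented documents into the other. In a real pipeline the initial centroids would normally be chosen by a seeding strategy rather than fixed by hand.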

Chen Ling (陈玲), Lin Ping (林平), Duan Yaoqing (段尧清)

dx.doi.org/10.13266/j.issn.2095-5472.2020.013

Communications; Computing and Computer Technology; Applications of Electronic Technology

K-means clustering algorithm; LDA topic model; technology topic evolution; text mining; VR (virtual reality)

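Chen Ling, Lin Ping, Duan Yaoqing. Patent Technology Topic Mining and Trend Analysis from the Perspective of the Industrial Chain, Combining K-means and LDA: Taking Virtual Reality Technology as an Example [EB/OL]. (2023-10-08) [2025-08-18]. https://chinaxiv.org/abs/202310.03030.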
