On Semantic Information (论语义信息)
Abstract
As early as the middle of the last century, Claude Elwood Shannon proposed that information can be expressed at three levels: syntactic, semantic, and pragmatic. Because no adequate mathematical characterization of semantics was available at the time, information technology has continued to represent information at the syntactic level. It remains confined to sensing, transmitting, and processing signals, and lacks theoretical methods for directly acquiring, transmitting, and processing the content of signals. A semantic gap has therefore always separated a signal from its content. Signals can be represented by mathematical functions, which gave rise to a signal processing theory built on three cornerstones: Shannon's information theory, the Nyquist sampling theorem, and the Fourier transform. The semantics and content of signals, by contrast, still have no satisfactory mathematical representation, which makes the semantic gap difficult to bridge and puts semantic information processing further out of reach. How to bridge this gap has long been a shared research problem across computing, information, and intelligence. As society enters the era of intelligence, scenarios in which humans and machines coexist are arriving. With the recent rise of semantic communication concepts and techniques in particular, enabling intelligent machines to understand signal content has become a key issue in intelligent technology. Many scholars at engineering universities and research institutes, as well as developers at major enterprises, have shown strong interest in semantic communication. From a technical standpoint, however, the concept of semantic information remains very unclear: there is no unified, widely accepted definition or characterization of semantic information, some prevailing views are mistaken, and there is no mathematical characterization of signal content.
This paper discusses what information means and gives clear definitions and explicit calculation methods for the basic concepts of semantic information, the physical process by which semantic information is generated, the characterization and measurement of semantic information, semantics-based representation and compression of signal information, and the mathematical characterization of signal content. The aim is to establish a theory of semantic information processing that deepens and strengthens the theoretical foundations of intelligent communication and AI technology.
Keywords
Semantic information / Semantic communication / Semantic characterization / Semantic compression / Signal content
Cite this article
石光明,高大化.论语义信息[EB/OL].(2024-12-31)[2026-04-05].https://chinaxiv.org/abs/202501.00001.
Subject classification
Information science; information technology