Interpretable Oracle Bone Script Decipherment through Radical and Pictographic Analysis with LVLMs
As the oldest mature Chinese writing system, Oracle Bone Script (OBS) has long posed significant challenges for archaeological decipherment due to its rarity, abstractness, and pictographic diversity. Deep learning-based methods have made notable progress on the OBS decipherment task, but existing approaches often ignore the intricate connections between the glyphs and the semantics of OBS. This limits generalization and interpretability, especially in zero-shot settings and on undeciphered OBS. To address this, we propose an interpretable OBS decipherment method based on Large Vision-Language Models (LVLMs), which combines radical analysis with pictograph-semantic understanding to bridge the gap between the glyphs and the meanings of OBS. Specifically, we propose a progressive training strategy that guides the model from radical recognition and analysis to pictographic analysis and mutual analysis, thus enabling reasoning from glyph to meaning. We also design a Radical-Pictographic Dual Matching mechanism informed by the analysis results, which significantly enhances the model's zero-shot decipherment performance. To facilitate model training, we introduce the Pictographic Decipherment OBS Dataset, which comprises 47,157 Chinese characters annotated with OBS images and pictographic analysis texts. Experimental results on public benchmarks demonstrate that our approach achieves state-of-the-art Top-10 accuracy and superior zero-shot decipherment capabilities. More importantly, our model produces logical analysis processes that may provide archaeologically valuable reference results for undeciphered OBS, and thus has potential applications in digital humanities and historical research. The dataset and code will be released at https://github.com/PKXX1943/PD-OBS.
Kaixin Peng, Mengyang Zhao, Haiyang Yu, Teng Fu, Bin Li
Linguistics; Chinese Language
Kaixin Peng, Mengyang Zhao, Haiyang Yu, Teng Fu, Bin Li. Interpretable Oracle Bone Script Decipherment through Radical and Pictographic Analysis with LVLMs [EB/OL]. (2025-08-17) [2025-08-24]. https://arxiv.org/abs/2508.10113.