Clarifying orthography: Orthographic transparency as compressibility

Source: arXiv
Abstract

Orthographic transparency -- how directly spelling is related to sound -- lacks a unified, script-agnostic metric. Using ideas from algorithmic information theory, we quantify orthographic transparency in terms of the mutual compressibility between orthographic and phonological strings. Our measure provides a principled way to combine two factors that decrease orthographic transparency, capturing both irregular spellings and rule complexity in one quantity. We estimate our transparency measure using prequential code-lengths derived from neural sequence models. Evaluating 22 languages across a broad range of script types (alphabetic, abjad, abugida, syllabic, logographic) confirms common intuitions about relative transparency of scripts. Mutual compressibility offers a simple, principled, and general yardstick for orthographic transparency.
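To make the idea of a prequential code-length concrete, here is a minimal sketch. It is not the authors' estimator: the neural sequence models are swapped for an adaptive, Laplace-smoothed count model, and orthographic and phonological symbols are assumed to be aligned one-to-one, which real scripts generally are not. The function names `prequential_bits` and `bits_saved` are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter, defaultdict


def prequential_bits(symbols, alphabet_size, contexts=None):
    """Prequential code length, in bits, under an adaptive count model.

    Each symbol is coded with the probability the model assigns to it
    BEFORE seeing it; the counts are updated only afterwards.
    Assumption: a Laplace-smoothed count model stands in for a neural model.
    """
    counts = defaultdict(Counter)  # context -> counts of symbols seen so far
    totals = Counter()             # context -> number of symbols seen so far
    bits = 0.0
    for i, sym in enumerate(symbols):
        ctx = contexts[i] if contexts is not None else None
        p = (counts[ctx][sym] + 1) / (totals[ctx] + alphabet_size)
        bits += -math.log2(p)
        counts[ctx][sym] += 1
        totals[ctx] += 1
    return bits


def bits_saved(orth, phon):
    """Bits saved when coding `phon` with the aligned `orth` symbol as context.

    Assumption: `orth` and `phon` have equal length and are aligned
    position by position (a toy simplification).
    """
    k = len(set(phon))
    return prequential_bits(phon, k) - prequential_bits(phon, k, contexts=orth)


# Toy example: a perfectly regular mapping a -> x, b -> y.
print(bits_saved(list("abab"), list("xyxy")))
```

In this toy setting, a larger saving means that knowing the spelling makes the sounds cheaper to encode, which is the intuition behind treating transparency as compressibility. Because the model is charged for every symbol before it has learned the mapping, the code length reflects both the cost of learning the rules and the cost of any remaining irregularities.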

Charles J. Torres, Richard Futrell

Linguistics

Charles J. Torres, Richard Futrell. Clarifying orthography: Orthographic transparency as compressibility [EB/OL]. (2025-05-19) [2025-07-09]. https://arxiv.org/abs/2505.13657.
