Innamark: A Whitespace Replacement Information-Hiding Method
Large language models (LLMs) have gained significant popularity in recent years. Differentiating between a text written by a human and one generated by an LLM has become almost impossible. Information-hiding techniques such as digital watermarking or steganography can help by embedding information inside text in a form that is unlikely to be noticed. However, existing techniques, such as linguistic or format-based methods, either change the semantics or cannot be applied to pure, unformatted text. In this paper, we introduce a novel method for information hiding called Innamark, which can conceal any byte-encoded sequence within a sufficiently long cover text. The method is implemented as a multi-platform library in the Kotlin programming language and is accompanied by a command-line tool and a web interface. By substituting conventional whitespace characters with visually similar Unicode whitespace characters, our proposed scheme preserves the semantics of the cover text without changing the number of characters. Furthermore, we propose a specified structure for secret messages that enables configurable compression, encryption, hashing, and error correction. An experimental benchmark on a dataset of 1 000 000 Wikipedia articles compares ten algorithms. The results demonstrate the robustness of our proposed Innamark method in various applications and the imperceptibility of its watermarks to humans. We discuss the limits to the embedding capacity and robustness of the algorithm and how these could be addressed in future work.
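The whitespace-substitution idea described in the abstract can be illustrated with a minimal Python sketch. It is not the Innamark implementation: the four-character alphabet, the two-bits-per-space encoding, and the function names `embed` and `extract` are assumptions made for illustration, and the sketch omits the compression, encryption, hashing, and error-correction structure the paper proposes. It only shows how a byte sequence can be carried by one-for-one whitespace replacement, leaving the character count unchanged.

```python
# Hypothetical alphabet (an assumption, not the paper's): each whitespace
# character stands for one 2-bit symbol. Index 0 is the ordinary space.
ALPHABET = ["\u0020", "\u2004", "\u2005", "\u2009"]


def embed(cover: str, payload: bytes) -> str:
    """Replace ordinary spaces in `cover` so that they encode `payload`."""
    # Split every byte into four 2-bit symbols, most significant pair first.
    symbols = [(byte >> shift) & 0b11 for byte in payload for shift in (6, 4, 2, 0)]
    if len(symbols) > cover.count(" "):
        raise ValueError("cover text has too few spaces for this payload")
    remaining = iter(symbols)
    out = []
    for ch in cover:
        if ch == " ":
            sym = next(remaining, None)
            # Spaces beyond the payload are left as ordinary spaces.
            out.append(ALPHABET[sym] if sym is not None else ch)
        else:
            out.append(ch)
    return "".join(out)


def extract(text: str, n_bytes: int) -> bytes:
    """Read back `n_bytes` bytes from the whitespace characters of `text`."""
    index = {ch: i for i, ch in enumerate(ALPHABET)}
    symbols = [index[ch] for ch in text if ch in index][: n_bytes * 4]
    if len(symbols) < n_bytes * 4:
        raise ValueError("text does not contain enough whitespace symbols")
    out = bytearray()
    for i in range(0, len(symbols), 4):
        byte = 0
        for sym in symbols[i:i + 4]:
            byte = (byte << 2) | sym
        out.append(byte)
    return bytes(out)
```

In this sketch the extractor must be told the payload length, and a cover text that already contains one of the alphabet characters would be misread; a structured message format with a length field or end marker would be one way to avoid both limitations.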
Hendrik Norkowski, Ernst-Christoph Schrewe, Haydar Qarawlus, Falk Howar, Malte Hellmeier
Computing technology, computer technology
Hendrik Norkowski, Ernst-Christoph Schrewe, Haydar Qarawlus, Falk Howar, Malte Hellmeier. Innamark: A Whitespace Replacement Information-Hiding Method [EB/OL]. (2025-08-21) [2025-09-02]. https://arxiv.org/abs/2502.12710.