
Tell, Don't Show: Leveraging Language Models' Abstractive Retellings to Model Literary Themes


Source: arXiv
Abstract

Conventional bag-of-words approaches for topic modeling, like latent Dirichlet allocation (LDA), struggle with literary text. Literature challenges lexical methods because narrative language focuses on immersive sensory details instead of abstractive description or exposition: writers are advised to "show, don't tell." We propose Retell, a simple, accessible topic modeling approach for literature. Here, we prompt resource-efficient, generative language models (LMs) to tell what passages show, thereby translating narratives' surface forms into higher-level concepts and themes. By running LDA on LMs' retellings of passages, we can obtain more precise and informative topics than by running LDA alone or by directly asking LMs to list topics. To investigate the potential of our method for cultural analytics, we compare our method's outputs to expert-guided annotations in a case study on racial/cultural identity in high school English language arts books.
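The pipeline described in the abstract (retell each passage with an LM, then run LDA on the retellings) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the prompt wording, the `toy_lm` stand-in for a real generative model, the two toy passages, and the small collapsed Gibbs sampler used for LDA are all hypothetical choices made so the example runs with only the Python standard library.

```python
import random
from collections import Counter

# Hypothetical prompt wording; the paper's actual prompt is not given here.
PROMPT = "Describe in plain, abstract terms what this passage is about:\n\n{passage}"

def retell(passage, lm):
    """Return the LM's abstractive retelling; `lm` is any str -> str callable."""
    return lm(PROMPT.format(passage=passage))

def tokenize(text):
    """Lowercase, strip surrounding punctuation, drop short or non-alphabetic tokens."""
    words = (w.strip(".,;:!?\"'()") for w in text.lower().split())
    return [w for w in words if len(w) > 3 and w.isalpha()]

def lda_gibbs(docs, num_topics, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA over tokenized documents.

    Returns (top words per topic, per-document topic counts).
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [Counter() for _ in range(num_topics)]
    topic_total = [0] * num_topics
    assign = []
    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = [rng.randrange(num_topics) for _ in doc]
        for w, z in zip(doc, zs):
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assign.append(zs)
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                z = assign[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    topics = [
        [w for w, c in topic_word[k].most_common(5) if c > 0]
        for k in range(num_topics)
    ]
    return topics, doc_topic

def toy_lm(prompt):
    """Stand-in for a real generative LM (an assumption for this sketch)."""
    passage = prompt.split("\n\n", 1)[1]
    if "kitchen" in passage:
        return "A story about family and belonging."
    return "A story about war and loss."

# Toy passages invented for illustration only.
passages = [
    "Steam rose in the kitchen as her grandmother hummed.",
    "Smoke drifted over the empty trenches at dawn.",
]
retellings = [retell(p, toy_lm) for p in passages]
topics, doc_topic = lda_gibbs([tokenize(r) for r in retellings], num_topics=2)
```

The point of the design is that LDA sees words such as "family" or "loss" from the retellings rather than the sensory vocabulary of the original passages. In practice `toy_lm` would be replaced by a call to an actual model, and the sampler by an established LDA implementation.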

Li Lucy, Camilla Griffiths, Sarah Levine, Jennifer L. Eberhardt, Dorottya Demszky, David Bamman

Subjects: information dissemination and knowledge dissemination; science and scientific research; education

Li Lucy, Camilla Griffiths, Sarah Levine, Jennifer L. Eberhardt, Dorottya Demszky, David Bamman. Tell, Don't Show: Leveraging Language Models' Abstractive Retellings to Model Literary Themes [EB/OL]. (2025-05-29) [2025-06-22]. https://arxiv.org/abs/2505.23166.
