On the Performance of an Explainable Language Model on PubMedQA
Large language models (LLMs) have shown significant ability to retrieve medical knowledge, reason over it, and answer medical questions comparably to physicians. However, these models are not interpretable, hallucinate, are difficult to maintain, and require enormous compute resources for training and inference. In this paper, we report results on the PubMedQA dataset from Gyan, an explainable language model based on an alternative architecture. Gyan is a compositional language model in which the model is decoupled from knowledge. Gyan is trustworthy, transparent, does not hallucinate, and does not require significant training or compute resources; it is also easily transferable across domains. Gyan-4.3 achieves state-of-the-art (SOTA) results on PubMedQA with 87.1% accuracy, compared to 82% for MedPrompt (based on GPT-4) and 81.8% for Med-PaLM 2 (Google and DeepMind). We will report results for other medical datasets (MedQA, MedMCQA, MMLU-Medicine) in the future.
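To make the reported accuracy figures concrete, the sketch below shows one way such a score could be computed for PubMedQA-style questions, each of which carries a gold "yes"/"no"/"maybe" decision. The `predict_decision` stub and the in-memory sample records are hypothetical placeholders for illustration only; they are not Gyan's interface or the paper's evaluation pipeline.

```python
# Minimal sketch of PubMedQA-style accuracy scoring.
# Assumptions: a hypothetical predict_decision() stands in for any model,
# and the two sample records are illustrative, not drawn from the real dataset.

from typing import Callable, Dict, List

# Each labeled PubMedQA item pairs a question (plus abstract context)
# with a gold final decision: "yes", "no", or "maybe".
SAMPLE_ITEMS: List[Dict[str, str]] = [
    {
        "question": "Does treatment X reduce symptom Y?",          # hypothetical
        "context": "Abstract text describing a trial of X ...",
        "final_decision": "yes",
    },
    {
        "question": "Is biomarker Z associated with outcome W?",   # hypothetical
        "context": "Abstract text reporting mixed findings ...",
        "final_decision": "maybe",
    },
]


def predict_decision(question: str, context: str) -> str:
    """Placeholder model call; a real system would return 'yes', 'no', or 'maybe'."""
    return "yes"


def accuracy(items: List[Dict[str, str]],
             predict: Callable[[str, str], str]) -> float:
    """Fraction of items whose predicted decision matches the gold label."""
    correct = sum(
        predict(item["question"], item["context"]) == item["final_decision"]
        for item in items
    )
    return correct / len(items)


if __name__ == "__main__":
    print(f"accuracy = {accuracy(SAMPLE_ITEMS, predict_decision):.3f}")
```

The headline numbers quoted above amount to this same exact-match comparison over the three decision labels, applied to the labeled PubMedQA evaluation set rather than a toy sample.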
Venkat Srinivasan, Vishaal Jatav, Anushka Chandrababu, Geetika Sharma
Subject areas: current state of medicine, medical development, medical research methods
Venkat Srinivasan, Vishaal Jatav, Anushka Chandrababu, Geetika Sharma. On the Performance of an Explainable Language Model on PubMedQA [EB/OL]. (2025-04-07) [2025-05-07]. https://arxiv.org/abs/2504.05074