
SNVformer: An Attention-based Deep Neural Network for GWAS Data


Source: bioRxiv
Abstract

Despite being the widely used gold standard for linking common genetic variations to phenotypes and disease, genome-wide association studies (GWAS) suffer from major limitations, partially attributable to their reliance on simple, typically linear, models of genetic effects. More elaborate methods, such as epistasis-aware models, typically struggle with the scale of GWAS data. In this paper, we build on recent advances in neural networks employing Transformer-based architectures to enable such models at a large scale. As a first step towards replacing linear GWAS with a more expressive approximation, we demonstrate prediction of gout, a painful form of inflammatory arthritis arising when monosodium urate crystals form in the joints under high serum urate conditions, from Single Nucleotide Variants (SNVs) using a scalable (long input) variant of the Transformer architecture. Furthermore, we show that sparse SNVs can be efficiently used by these Transformer-based networks without expanding them to a full genome. By appropriately encoding SNVs, we are able to achieve competitive initial performance, with an AUROC of 83% when classifying a balanced test set using genotype and demographic information. Moreover, the confidence with which the network makes its prediction is a good indication of the prediction accuracy. Our results indicate a number of opportunities for extension, enabling full genome-scale data analysis using more complex and accurate genotype-phenotype association models.
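
The two evaluation ideas mentioned in the abstract, AUROC on a balanced test set and prediction confidence as an indicator of accuracy, can be illustrated with a minimal Python sketch. This is not the authors' evaluation code: the function names, the 0.8 confidence threshold, and the toy labels and scores are all hypothetical and chosen only for illustration.

```python
def auroc(labels, scores):
    """AUROC as the probability that a randomly chosen positive example
    receives a higher score than a randomly chosen negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy_by_confidence(labels, scores, threshold=0.8):
    """Accuracy over predictions whose confidence max(p, 1 - p) is at least
    `threshold`; returns None if no prediction is that confident."""
    kept = [(y, s) for y, s in zip(labels, scores) if max(s, 1.0 - s) >= threshold]
    if not kept:
        return None
    correct = sum(1 for y, s in kept if (s >= 0.5) == (y == 1))
    return correct / len(kept)


# Hypothetical toy data: a balanced set of 4 cases (1) and 4 controls (0),
# with made-up predicted probabilities of being a case.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.95, 0.85, 0.60, 0.40, 0.55, 0.30, 0.10, 0.05]

print(auroc(labels, scores))                        # ranking quality
print(accuracy_by_confidence(labels, scores, 0.5))  # accuracy over all predictions
print(accuracy_by_confidence(labels, scores, 0.8))  # accuracy over confident ones
```

On a balanced test set a random classifier has an expected AUROC of 0.5, so values are read relative to that baseline. If confidence tracks accuracy, as the abstract reports, accuracy on the high-confidence subset should exceed accuracy over all predictions.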

Tan Neşet Özkan; Witbrock Michael; Benavides-Prado Diana; Gavryushkin Alex; Sumpter Nicholas; Nguyen Trung Bao; Leask Megan; Elmes Kieran

Strong AI Lab, School of Computer Science, The University of Auckland; Strong AI Lab, School of Computer Science, The University of Auckland; Strong AI Lab, School of Computer Science, The University of Auckland; Biological Data Science Lab, School of Mathematics and Statistics, University of Canterbury; University of Alabama at Birmingham; Strong AI Lab, School of Computer Science, The University of Auckland; University of Alabama at Birmingham; Biological Data Science Lab, School of Mathematics and Statistics, University of Canterbury / Department of Computer Science, University of Otago

10.1101/2022.07.07.499217

Medical research methods; biological science research methods and techniques; computing and computer technology

Tan Neşet Özkan, Witbrock Michael, Benavides-Prado Diana, Gavryushkin Alex, Sumpter Nicholas, Nguyen Trung Bao, Leask Megan, Elmes Kieran. SNVformer: An Attention-based Deep Neural Network for GWAS Data [EB/OL]. (2025-03-28) [2025-06-23]. https://www.biorxiv.org/content/10.1101/2022.07.07.499217.
