
A versatile resource of 1500 diverse wild and cultivated soybean genomes for post-genomics research


Source: bioRxiv
Abstract (English)

Summary: With the advance of next-generation sequencing technologies, over 15 terabytes of raw soybean genome sequencing data have been generated and made publicly available. To develop a consolidated, diverse, and user-friendly genomic resource to facilitate post-genomic research, we sequenced 91 highly diverse wild soybean genomes representing the entire US collection of wild soybean accessions, increasing the genetic diversity of the sequenced genomes. Having integrated and analyzed our sequencing data together with the public data, we identified and annotated 32 million single nucleotide polymorphisms (32mSNPs) at a resolution of 30 SNPs/kb and 12 non-synonymous SNPs/gene in 1,556 accessions (1.5K). Population structure analysis showed that the 1.5K accessions represent the genetic diversity of the 20,087 (20K) soybean accessions in the US collection. Inclusion of wild soybean genomes significantly increased the genetic diversity and shortened the linkage disequilibrium distance in the panel of soybean accessions. We identified a collection of paired accessions sharing the highest genomic identity between the 1.5K and 20K accessions as genomically “equivalent” accessions to maximize the use of the genome sequences. We demonstrated that the 32mSNPs in the 1.5K accessions can be used effectively for in-silico genotyping, for discovering trait QTL and gene alleles/mutations, and for identifying germplasm containing beneficial alleles and domestication selection of trait alleles. We made the 32mSNPs and 1.5K accessions with detailed annotation available at SoyBase and Ag Data Commons. The dataset could serve as a versatile resource to unlock the potential of the huge amount of genome sequencing data for a variety of post-genomic research.

Jiang He, An Yong-qiang Charles, Zhang Hengyou, Hu Zhenbin, Song Qijian

Donald Danforth Plant Science Center; Donald Danforth Plant Science Center and US Department of Agriculture, Agricultural Research Service, Midwest Area, Plant Genetics Research Unit; Donald Danforth Plant Science Center; Donald Danforth Plant Science Center; US Department of Agriculture, Agricultural Research Service, Soybean Genomics and Improvement Laboratory

10.1101/2020.11.16.383950

Agricultural science and technology development; Genetics; Botany

soybean; single nucleotide polymorphism (SNP); genetic diversity; whole-genome resequencing; US Soybean Germplasm Collection; linkage disequilibrium (LD); genomic equivalence

Jiang He, An Yong-qiang Charles, Zhang Hengyou, Hu Zhenbin, Song Qijian. A versatile resource of 1500 diverse wild and cultivated soybean genomes for post-genomics research [EB/OL]. (2025-03-28) [2025-04-30]. https://www.biorxiv.org/content/10.1101/2020.11.16.383950.
