iSparse kmeans: a two-step clustering approach for big dynamic functional network connectivity data

Source: bioRxiv
Abstract

Background: Dynamic functional network connectivity (dFNC), estimated from resting-state functional magnetic resonance imaging (rs-fMRI), captures temporal variation in functional integration between brain networks. A typical dFNC pipeline includes a clustering stage that summarizes the connectivity patterns that are transiently but reliably realized over the course of a scanning session. However, identifying the right number of clusters with a conventional clustering criterion requires running the algorithm repeatedly over a large range of cluster numbers. This is time-consuming and demands substantial computational power even for typical dFNC datasets, and the computational demands become prohibitive as datasets become larger and scans longer. Here we developed a new dFNC pipeline, called iterative sparse kmeans or iSparse kmeans, to analyze large dFNC data without access to large computational resources.

Method: In iSparse kmeans, we implement two-step clustering. In the first step, we repeatedly draw a random sub-sample of the dFNC data and identify several sets of states at different model orders. In the second step, we aggregate the dFNC states estimated across all iterations of the first step and use them to identify the optimum number of clusters with the elbow criterion. We then estimate a final set of states by performing a second kmeans clustering on this reduced dataset, that is, on the aggregated dFNC states from the first kmeans clustering. To validate the reproducibility of iSparse kmeans, we analyzed four dFNC datasets from the Human Connectome Project (HCP).

Results: We found that conventional kmeans and iSparse kmeans generate similar brain dFNC states, while iSparse kmeans is 27 times faster than the traditional method at finding the optimum number of clusters. The results replicate across four different datasets from HCP.

Conclusion: We developed a new analytic pipeline that facilitates analysis of large dFNC datasets without access to large computational resources. We validated the reproducibility of the results across multiple datasets.
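
The two-step procedure described in the Method paragraph can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not state the distance metric, the sub-sample size, the number of iterations, or how the elbow is located, so squared Euclidean distance, the chord-distance elbow heuristic, and all function and parameter names (`kmeans`, `elbow`, `isparse_kmeans_sketch`, `n_subsamples`, `subsample_size`) are assumptions made for illustration.

```python
import numpy as np


def kmeans(X, k, rng, max_iter=100):
    """Plain Lloyd's k-means with squared Euclidean distance (an assumption).

    Returns the (k, n_features) centroids and the within-cluster sum of squares.
    """
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(max_iter):
        dist = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        # Recompute each centroid; keep the old one if its cluster is empty.
        updated = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(updated, centroids):
            break
        centroids = updated
    dist = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return centroids, dist.min(axis=1).sum()


def elbow(ks, wcss):
    """Hypothetical elbow rule: the k farthest from the chord joining the
    first and last points of the normalised (k, WCSS) curve."""
    ks = np.asarray(ks, dtype=float)
    wcss = np.asarray(wcss, dtype=float)
    x = (ks - ks[0]) / (ks[-1] - ks[0])
    y = (wcss - wcss.min()) / (wcss.max() - wcss.min() + 1e-12)
    slope = y[-1] - y[0]
    dist = np.abs(slope * x - y + y[0]) / np.hypot(slope, 1.0)
    return int(ks[dist.argmax()])


def isparse_kmeans_sketch(X, model_orders, n_subsamples, subsample_size, seed=0):
    """Two-step clustering sketch; X has one row per dFNC window."""
    rng = np.random.default_rng(seed)
    model_orders = list(model_orders)

    # Step 1: cluster random sub-samples at several model orders and pool
    # every estimated state (centroid) into one reduced dataset.
    pooled = []
    for _ in range(n_subsamples):
        rows = rng.choice(len(X), size=subsample_size, replace=False)
        for k in model_orders:
            pooled.append(kmeans(X[rows], k, rng)[0])
    pooled = np.vstack(pooled)

    # Step 2: run k-means on the pooled states only, pick k with the elbow
    # rule, and reuse that run's centroids as the final states.
    runs = [kmeans(pooled, k, rng) for k in model_orders]
    best_k = elbow(model_orders, [w for _, w in runs])
    final_states = runs[model_orders.index(best_k)][0]
    return best_k, final_states


if __name__ == "__main__":
    # Synthetic stand-in for dFNC windows: three well-separated groups.
    demo_rng = np.random.default_rng(0)
    X = np.vstack([demo_rng.normal(c, 0.3, size=(300, 10)) for c in (-3.0, 0.0, 3.0)])
    k, states = isparse_kmeans_sketch(X, range(2, 8), n_subsamples=10, subsample_size=150)
    print(k, states.shape)
```

The point of the design is that the search over cluster numbers in step 2 touches only the pooled centroids (here `n_subsamples * sum(model_orders)` rows) rather than every dFNC window, which is where a speed-up of the kind the abstract reports would come from.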

Sendi Mohammad S. E., Miller Robyn L, Salat David H, Calhoun Vince D

Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology and Emory University; Department of Electrical and Computer Engineering, Georgia Institute of Technology; Tri-institutional Center for Translational Research in Neuroimaging and Data Science: Georgia State University, Georgia Institute of Technology, Emory University
Tri-institutional Center for Translational Research in Neuroimaging and Data Science: Georgia State University, Georgia Institute of Technology, Emory University; Department of Computer Science, Georgia State University
Harvard Medical School; Massachusetts General Hospital
Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology and Emory University; Department of Electrical and Computer Engineering, Georgia Institute of Technology; Tri-institutional Center for Translational Research in Neuroimaging and Data Science: Georgia State University, Georgia Institute of Technology, Emory University; Department of Computer Science, Georgia State University

DOI: 10.1101/2022.03.13.484193

Subject categories: computing technology and computer technology; biological science research methods and techniques

Keywords: dynamic functional network connectivity; kmeans clustering; iterative sparse kmeans

Sendi Mohammad S. E., Miller Robyn L, Salat David H, Calhoun Vince D. iSparse kmeans: a two-step clustering approach for big dynamic functional network connectivity data [EB/OL]. (2025-03-28) [2025-04-26]. https://www.biorxiv.org/content/10.1101/2022.03.13.484193.
