Evaluating Integrative Strategies for Incorporating Phenotypic Features in Spatial Transcriptomics
Spatial transcriptomics (ST) technologies not only offer an unprecedented opportunity to interrogate intact biological samples in a spatially informed manner, but also set the stage for integration with other imaging-based modalities. However, how best to exploit spatial context and integrate ST with imaging-based modalities remains an open question. To address this, particularly under real-world experimental constraints such as limited dataset size, class imbalance, and bounding-box-based segmentation, we used a publicly available murine ileum MERFISH dataset to evaluate whether a minimally tuned variational autoencoder (VAE) could extract informative low-dimensional representations from cell crops of spot counts, nuclear stain, membrane stain, or a combination thereof. We assessed the resulting embeddings through PERMANOVA, cross-validated classification, and unsupervised Leiden clustering, and compared them to classical image-based feature vectors extracted via CellProfiler. While transcript counts (TC) generally outperformed other feature spaces, the VAE-derived latent spaces (LSs) captured meaningful biological variation and enabled improved label recovery for specific cell types. In particular, LS2, which was trained solely on morphological input, also showed moderate predictive power for a handful of genes in a ridge regression model. Notably, combining TC with LSs through multiplex clustering led to consistent gains in cluster homogeneity, a trend that also held when augmenting only subsets of TC with the stain-derived LS2. In contrast, CellProfiler-derived features underperformed relative to LSs, highlighting the advantage of learned representations over hand-crafted features. Collectively, these findings demonstrate that even under constrained conditions, VAEs can extract biologically meaningful signals from imaging data and constitute a promising strategy for multi-modal integration.
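The abstract reports gains in cluster homogeneity without stating how homogeneity is computed. As a minimal sketch, assuming the entropy-based definition (one minus the conditional entropy of the reference labels given the clusters, normalized by the label entropy), the score could be computed as below. The function names and the toy cell-type labels are hypothetical and do not come from the paper.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (natural log) of a sequence of labels."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def homogeneity(true_labels, cluster_ids):
    """Entropy-based homogeneity: 1 - H(labels | clusters) / H(labels).

    Assumption: this is one common definition of cluster homogeneity;
    the source does not specify which metric was used.
    """
    if len(true_labels) != len(cluster_ids):
        raise ValueError("label and cluster sequences must have equal length")
    n = len(true_labels)
    h_labels = entropy(true_labels)
    if h_labels == 0.0:
        # Only one reference label (or no data): trivially homogeneous.
        return 1.0
    cluster_sizes = Counter(cluster_ids)
    joint = Counter(zip(true_labels, cluster_ids))
    # Conditional entropy of the reference labels given the cluster assignment.
    h_cond = -sum(
        (n_ck / n) * math.log(n_ck / cluster_sizes[k])
        for (_, k), n_ck in joint.items()
    )
    return 1.0 - h_cond / h_labels


# Hypothetical toy example: four cells, two reference cell types.
cell_types = ["enterocyte", "enterocyte", "goblet", "goblet"]
pure_clusters = [0, 0, 1, 1]    # each cluster holds a single cell type
mixed_clusters = [0, 1, 0, 1]   # each cluster mixes both cell types
score_pure = homogeneity(cell_types, pure_clusters)
score_mixed = homogeneity(cell_types, mixed_clusters)
```

Under this definition, a clustering in which every cluster contains a single cell type scores 1, and one in which clusters carry no information about cell type scores 0.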
Levin M Moser, Ahmad Kamal Hamid, Esteban Miglietta, Nodar Gogoberidze, Beth A Cimini
Biological science research methods and techniques; cell biology; molecular biology
Levin M Moser, Ahmad Kamal Hamid, Esteban Miglietta, Nodar Gogoberidze, Beth A Cimini. Evaluating Integrative Strategies for Incorporating Phenotypic Features in Spatial Transcriptomics [EB/OL]. (2025-07-29) [2025-08-06]. https://arxiv.org/abs/2507.22212