National Preprint Platform

An Algorithm for Clustering Co-Regulated Genes Based on General Similarity


Clustering algorithms based on pattern/tendency similarity have attracted growing attention for discovering co-regulated genes. Most existing algorithms, however, cannot be applied directly to find negatively correlated co-regulated genes, and even for positively correlated ones they risk missing results of potentially high biological significance. To address these problems, we propose g-Cluster, a new clustering model based on general similarity that groups positively and negatively co-regulated genes together: two genes are placed in the same co-regulated gene cluster if and only if they have the same code. We further design FBTD, a tree-based clustering algorithm that mines all qualifying maximal g-Clusters in a "First Breadth-first and Then Depth-first" manner and applies several efficient pruning rules and optimization strategies. We run the algorithm on several real microarray datasets (the yeast cell-cycle dataset and the AML-ALL leukemia dataset) and on synthetic ones, and submit the results to Gene Ontology. Both theoretical analysis and experimental results indicate that the algorithm is effective and efficient, and that it outperforms existing approaches.
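The abstract does not define the code that g-Cluster assigns to a gene, so the following is only a minimal sketch of the general idea under a stated assumption: each gene is coded by the direction of change between consecutive conditions, and a profile and its mirror image are mapped to the same code so that positively and negatively co-regulated genes fall into one group. The function names `tendency_code` and `group_by_code` and the toy data are hypothetical, and the sketch does not attempt the subspace search, maximality check, or pruning performed by FBTD.

```python
from collections import defaultdict

def tendency_code(profile):
    """Hypothetical coding: sign of change between consecutive conditions,
    flipped if needed so the first non-zero sign is positive. A profile and
    its mirror image therefore receive the same code."""
    signs = [1 if b > a else -1 if b < a else 0
             for a, b in zip(profile, profile[1:])]
    for s in signs:
        if s != 0:
            if s < 0:
                signs = [-x for x in signs]
            break
    return tuple(signs)

def group_by_code(expression):
    """Group genes whose (assumed) codes are identical."""
    clusters = defaultdict(list)
    for gene, profile in expression.items():
        clusters[tendency_code(profile)].append(gene)
    return dict(clusters)

# Toy expression levels over four conditions (illustrative values only).
expression = {
    "g1": [1.0, 3.0, 2.0, 5.0],   # up, down, up
    "g2": [2.0, 6.0, 4.0, 9.0],   # same tendency as g1
    "g3": [9.0, 4.0, 6.0, 1.0],   # mirror image of g1 (negative co-regulation)
    "g4": [1.0, 2.0, 3.0, 4.0],   # steadily increasing
}
clusters = group_by_code(expression)
for code, genes in clusters.items():
    print(code, genes)
```

Here g1, g2, and g3 end up in one group even though g3 moves in the opposite direction, which is the behaviour the abstract attributes to a shared code; g4 forms its own group.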

Zhao Yuhai (赵宇海)

Biological science research methods and techniques; molecular biology

clustering; co-regulated genes; pattern similarity; Gene Ontology

Zhao Yuhai. An Algorithm for Clustering Co-Regulated Genes Based on General Similarity [EB/OL]. (2011-02-14) [2025-08-21]. http://www.paper.edu.cn/releasepaper/content/201102-102.
