Journal article, Neurocomputing, Year: 2022

A survey on data integration for multi-omics sample clustering

Marta Lovino, Vincenzo Randazzo, Gabriele Ciravegna, Pietro Barbiero, Elisa Ficarra, et al.


The wide availability of omics data has greatly expanded data-driven biology, and several papers have reviewed the state-of-the-art technologies. Two main types of investigation are currently performed on a multi-omics dataset: extracting relevant features for meaningful biological interpretation, and clustering the samples. For the latter, some existing reviews cover methods that are outdated or no longer available, whereas others do not describe the clustering metrics needed to compare the main approaches. This work provides a general overview of the major techniques in this area, divided into four groups: graph-based, dimensionality-reduction-based, statistical, and neural-based. In addition, eight tools have been tested on both a synthetic and a real biological dataset. An extensive performance comparison is provided using four clustering evaluation scores: Peak Signal-to-Noise Ratio (PSNR), the Davies-Bouldin (DB) index, the Silhouette value, and the harmonic mean of cluster purity and efficiency. The best results were obtained with dimensionality reduction, applied either explicitly or implicitly, as in the neural architecture. © 2021 The Authors. Published by Elsevier B.V.
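As an illustration of two of the evaluation scores named in the abstract, the following minimal sketch computes the Davies-Bouldin index and the mean Silhouette value in plain Python. It assumes the standard textbook definitions with Euclidean distance; the abstract does not give the formulas, so the survey's own implementation may differ. The function names and the toy data are hypothetical and chosen only for this example.

```python
from collections import defaultdict
from math import dist  # Euclidean distance, Python 3.8+


def group_by_label(points, labels):
    """Collect points into a dict keyed by cluster label."""
    clusters = defaultdict(list)
    for point, label in zip(points, labels):
        clusters[label].append(point)
    return clusters


def centroid(cluster):
    """Coordinate-wise mean of a list of points."""
    n = len(cluster)
    return tuple(sum(coords) / n for coords in zip(*cluster))


def davies_bouldin(points, labels):
    """Davies-Bouldin index (assumes at least two clusters); lower is better."""
    clusters = group_by_label(points, labels)
    cents = {k: centroid(v) for k, v in clusters.items()}
    # Mean distance of each cluster's points to its own centroid.
    scatter = {
        k: sum(dist(p, cents[k]) for p in v) / len(v)
        for k, v in clusters.items()
    }
    total = 0.0
    for i in clusters:
        # Worst-case similarity between cluster i and any other cluster.
        total += max(
            (scatter[i] + scatter[j]) / dist(cents[i], cents[j])
            for j in clusters
            if j != i
        )
    return total / len(clusters)


def silhouette(points, labels):
    """Mean Silhouette value (assumes at least two clusters); closer to 1 is better."""
    clusters = group_by_label(points, labels)
    scores = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster.
        a = sum(dist(point, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(dist(point, q) for q in other) / len(other)
            for k, other in clusters.items()
            if k != label
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Toy data (hypothetical): two well-separated groups of three 2-D samples.
samples = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
good_labels = [0, 0, 0, 1, 1, 1]  # matches the visible structure
bad_labels = [0, 1, 0, 1, 0, 1]   # mixes the two groups

print("DB index:", davies_bouldin(samples, good_labels), davies_bouldin(samples, bad_labels))
print("Silhouette:", silhouette(samples, good_labels), silhouette(samples, bad_labels))
```

On this toy data the labeling that matches the two groups yields a lower Davies-Bouldin index and a higher Silhouette value than the mixed labeling, which is the direction in which each score is read when comparing clustering tools.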

Dates and versions

hal-03749826, version 1 (11-08-2022)



Marta Lovino, Vincenzo Randazzo, Gabriele Ciravegna, Pietro Barbiero, Elisa Ficarra, et al. A survey on data integration for multi-omics sample clustering. Neurocomputing, 2022, 488, pp.494-508. ⟨10.1016/j.neucom.2021.11.094⟩. ⟨hal-03749826⟩