dc.creator | Gundogdu, Pelin | es |
dc.creator | Loucera, Carlos | es |
dc.creator | Alamo Álvarez, Inmaculada | es |
dc.creator | Dopazo, Joaquín | es |
dc.creator | Nepomuceno Chamorro, Isabel de los Ángeles | es |
dc.date.accessioned | 2022-06-30T09:23:45Z | |
dc.date.available | 2022-06-30T09:23:45Z | |
dc.date.issued | 2022 | |
dc.identifier.citation | Gundogdu, P., Loucera, C., Alamo Álvarez, I., Dopazo, J. and Nepomuceno Chamorro, I.d.l.Á. (2022). Integrating pathway knowledge with deep neural networks to reduce the dimensionality in single-cell RNA-seq data. BioData Mining, 15 (1 - art. nº1) | |
dc.identifier.issn | 1756-0381 | es |
dc.identifier.uri | https://hdl.handle.net/11441/134833 | |
dc.description.abstract | Background: Single-cell RNA sequencing (scRNA-seq) data provide valuable insights
into cellular heterogeneity which is significantly improving the current knowledge on
biology and human disease. One of the main applications of scRNA-seq data analysis
is the identification of new cell types and cell states. Deep neural networks (DNNs)
are among the best methods to address this problem. However, this performance
comes at the cost of reduced interpretability of the results. In this work, we
propose an intelligible pathway-driven neural network to correctly solve cell-type
related problems at single-cell resolution while providing a biologically meaningful
representation of the data.
Results: In this study, we explored deep neural networks constrained by several
types of prior biological information, e.g. signaling pathway information, as a way to
reduce the dimensionality of scRNA-seq data. We tested the proposed
biologically-based architectures on thousands of cells of human and mouse origin
across a collection of public datasets in order to check the performance of the
model. Specifically, we tested the architecture across different validation scenarios
that try to mimic how unknown cell types are clustered by the DNN and how it
correctly annotates cell types by querying a database in a retrieval problem.
Moreover, our approach proved to be comparable to other, less interpretable
DNN approaches constrained by protein-protein interaction or gene regulation
data. Finally, we show how the latent structure learned by the network can be
used to visualize and interpret the composition of human single-cell datasets.
Conclusions: Here we demonstrate how the integration of pathways, which convey
fundamental information on functional relationships between genes, with DNNs, which
provide an excellent classification framework, results in an excellent alternative for
learning a biologically meaningful representation of scRNA-seq data. In addition, the
introduction of prior biological knowledge in the DNN reduces the size of the
network architecture. Comparative results demonstrate a superior performance of this
approach with respect to other similar approaches. As an additional advantage, the
use of pathways within the DNN structure enables easy interpretability of the results
by connecting features to cell functionalities by means of the pathway nodes, as
demonstrated in an example with human melanoma tumor cells. | es |
dc.description.sponsorship | Ministerio de Ciencia e Innovación PID2020-117979RB-I00 | es |
dc.description.sponsorship | Instituto de Salud Carlos III IMP/0019 | es |
dc.description.sponsorship | European Union (UE). H2020 (MLFPM) GA 813533 | es |
dc.format | application/pdf | es |
dc.format.extent | 21 | es |
dc.language.iso | eng | es |
dc.publisher | BMC | es |
dc.relation.ispartof | BioData Mining, 15 (1 - art. nº1) | |
dc.rights | Attribution-NonCommercial-NoDerivatives 4.0 International | * |
dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/4.0/ | * |
dc.subject | Deep neural network | es |
dc.subject | Signaling pathway | es |
dc.subject | Single cell | es |
dc.subject | scRNA-seq | es |
dc.subject | Gene expression | es |
dc.subject | Transcriptomics | es |
dc.subject | Machine Learning | es |
dc.title | Integrating pathway knowledge with deep neural networks to reduce the dimensionality in single-cell RNA-seq data | es |
dc.type | info:eu-repo/semantics/article | es |
dcterms.identifier | https://ror.org/03yxnpp24 | |
dc.type.version | info:eu-repo/semantics/publishedVersion | es |
dc.rights.accessRights | info:eu-repo/semantics/openAccess | es |
dc.contributor.affiliation | Universidad de Sevilla. Departamento de Lenguajes y Sistemas Informáticos | es |
dc.relation.projectID | PID2020-117979RB-I00 | es |
dc.relation.projectID | IMP/0019 | es |
dc.relation.projectID | (MLFPM) GA 813533 | es |
dc.relation.publisherversion | https://biodatamining.biomedcentral.com/articles/10.1186/s13040-021-00285-4 | es |
dc.identifier.doi | 10.1186/s13040-021-00285-4 | es |
dc.contributor.group | Universidad de Sevilla. TIC134: Sistemas Informáticos | es |
dc.journaltitle | BioData Mining | es |
dc.publication.volumen | 15 | es |
dc.publication.issue | 1 - art. nº1 | es |
dc.contributor.funder | Ministerio de Ciencia e Innovación (MICIN). España | es |
dc.contributor.funder | Instituto de Salud Carlos III | es |
dc.contributor.funder | European Union (UE). H2020 | es |