Article
Machine learning techniques to discover genes with potential prognosis role in Alzheimer’s disease using different biological sources
Author(s) | Martínez Ballesteros, María del Mar; García Heredia, José Manuel; Nepomuceno Chamorro, Isabel de los Ángeles; Riquelme Santos, José Cristóbal |
Department | Universidad de Sevilla. Departamento de Lenguajes y Sistemas Informáticos |
Publication date | 2017 |
Deposit date | 2022-04-26 |
Published in | Information Fusion, 36 (July 2017), 114-129 |
Abstract | Alzheimer’s disease is a complex, progressive neurodegenerative brain disorder whose prevalence is expected to rise over the next decades. Unconventional strategies for elucidating its genetic mechanisms are necessary because of its polygenic nature. In this work, five input information sources are used: a public DNA microarray that measures expression levels of control and patient samples, repositories of known genes associated with Alzheimer’s disease, additional data, Gene Ontology and, finally, a literature review or expert knowledge to validate the results. As a methodology to identify genes highly related to this disease, we present the integration of three machine learning techniques: decision trees, quantitative association rules and hierarchical clustering. These are used to analyze Alzheimer’s disease gene expression profiles and identify genes highly linked to this neurodegenerative disease through changes in their expression levels between control and patient samples. We propose an ensemble of decision trees and quantitative association rules to find the most suitable configurations of the multi-objective evolutionary algorithm GarNet, in order to overcome the complex parametrization intrinsic to this type of algorithm. To fulfill this goal, GarNet has been executed using multiple configuration settings, and the well-known C4.5 algorithm has been used to find the minimum accuracy to be satisfied. Then, GarNet is rerun, using the configurations that exceed the minimum accuracy threshold defined by C4.5, to identify dependencies between genes and their expression levels, so that we are able to distinguish between healthy individuals and Alzheimer’s patients. Finally, a hierarchical cluster analysis has been used to validate the gene-Alzheimer’s disease associations provided by GarNet.
The results have shown that the obtained rules were able to successfully characterize the underlying information, grouping relevant genes for Alzheimer’s disease. The genes reported by our approach provided two well-defined groups that perfectly divided the samples between healthy individuals and Alzheimer’s disease patients. To prove the relevance of the obtained results, a statistical test and gene expression fold change were used. Furthermore, this relevance has been summarized in a volcano plot, showing two clearly separated and significant groups of genes that are up- or down-regulated in Alzheimer’s disease patients. A biological knowledge integration phase was performed based on the information fusion of a systematic literature review and enriched Gene Ontology terms for the described genes found in the hippocampus of patients. Finally, a validation phase with additional data and a permutation test was carried out, and the results are consistent with previous studies. |
Funding agencies | Ministerio de Ciencia y Tecnología (MCYT), España; Junta de Andalucía |
Project identifier | TIN2011-28956-C02-02; TIN2014-55894-C2-1-R; P11-TIC-7528 |
Citation | Martínez Ballesteros, M.d.M., García Heredia, J.M., Nepomuceno Chamorro, I.d.l.Á. and Riquelme Santos, J.C. (2017). Machine learning techniques to discover genes with potential prognosis role in Alzheimer’s disease using different biological sources. Information Fusion, 36 (July 2017), 114-129. |
Files | Size | Format | View | Description |
---|---|---|---|---|
Machine learning techniques to ... | 4.229Mb | [PDF] | View/ | |
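
The abstract mentions gene expression fold change and a permutation test as ways to assess the relevance of the reported genes. The following is a minimal sketch of those two ideas for a single gene, not the authors' implementation. The function names, the expression values and the number of permutations are all hypothetical, and the fold change assumes raw (not log-transformed) positive intensities.

```python
import math
import random
from statistics import mean


def log2_fold_change(control, patient):
    """log2 ratio of mean patient expression to mean control expression."""
    return math.log2(mean(patient) / mean(control))


def permutation_p_value(control, patient, n_permutations=2000, seed=0):
    """Two-sided permutation test on the absolute difference in group means."""
    rng = random.Random(seed)
    observed = abs(mean(patient) - mean(control))
    pooled = list(control) + list(patient)
    n_control = len(control)
    hits = 0
    for _ in range(n_permutations):
        # Randomly reassign the group labels by shuffling the pooled values.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[n_control:]) - mean(pooled[:n_control]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_permutations + 1)


# Hypothetical expression values for one gene (illustrative only, not from the paper).
control = [4.0, 4.2, 3.9, 4.1, 4.0]
patient = [8.1, 7.9, 8.3, 8.0, 7.7]

lfc = log2_fold_change(control, patient)
p = permutation_p_value(control, patient)
regulation = "up" if lfc > 0 else "down"
print(f"log2 fold change = {lfc:.2f}, permutation p = {p:.4f}, {regulation}-regulated")
```

In a volcano plot, each gene would contribute one point, with the log2 fold change on the horizontal axis and the negative log10 of its p-value on the vertical axis, so that strongly up- or down-regulated and significant genes appear in the upper corners.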