Automated Deployment of a Spark Cluster with Machine Learning Algorithm Integration

Fernández, A. M.; Gutiérrez Avilés, David; Troncoso Lora, Alicia; Martínez Álvarez, Francisco

doi:10.1016/j.bdr.2020.100135

Article

dc.creator	Fernández, A. M.	es
dc.creator	Gutiérrez Avilés, David	es
dc.creator	Troncoso Lora, Alicia	es
dc.creator	Martínez Álvarez, Francisco	es
dc.date.accessioned	2022-04-04T07:59:58Z
dc.date.available	2022-04-04T07:59:58Z
dc.date.issued	2020
dc.identifier.citation	Fernández, A.M., Gutiérrez Avilés, D., Troncoso, A. and Martínez Álvarez, F. (2020). Automated Deployment of a Spark Cluster with Machine Learning Algorithm Integration. Big Data Research, 19-20 (art. no. 100135)
dc.identifier.issn	2214-5796	es
dc.identifier.uri	https://hdl.handle.net/11441/131699
dc.description.abstract	The vast amount of data stored nowadays has turned big data analytics into a very active research field. The Spark distributed computing platform has emerged as a dominant and widely used paradigm for cluster deployment and big data analytics. However, getting a cluster started can still take considerable time when done manually, because of the requirements that all nodes must fulfill. This work introduces LadonSpark, an open-source and non-commercial solution to configure and deploy a Spark cluster automatically. It has been specially designed for easy and efficient management of a Spark cluster, with a friendly graphical user interface to automate the deployment of a cluster and to start up the Hadoop distributed file system quickly. Moreover, LadonSpark can integrate any algorithm into the system: the user only needs to provide the executable file and the number of required inputs for proper parametrization. Source code written in Scala, R, Python, or Java is supported by LadonSpark. In addition, clustering, regression, classification, and association rules algorithms are already integrated so that users can test its usability from the initial installation.	es
dc.description.sponsorship	Ministerio de Ciencia, Innovación y Universidades TIN2017-88209-C2-1-R	es
dc.format	application/pdf	es
dc.format.extent	9	es
dc.language.iso	eng	es
dc.publisher	Elsevier	es
dc.relation.ispartof	Big Data Research, 19-20 (art. nº100135)
dc.rights	Attribution-NonCommercial-NoDerivatives 4.0 International	*
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/4.0/	*
dc.subject	Big Data analytics	es
dc.subject	Apache Spark	es
dc.subject	Machine Learning	es
dc.subject	Cluster deployment	es
dc.title	Automated Deployment of a Spark Cluster with Machine Learning Algorithm Integration	es
dc.type	info:eu-repo/semantics/article	es
dc.type.version	info:eu-repo/semantics/publishedVersion	es
dc.rights.accessRights	info:eu-repo/semantics/openAccess	es
dc.contributor.affiliation	Universidad de Sevilla. Departamento de Lenguajes y Sistemas Informáticos	es
dc.relation.projectID	TIN2017-88209-C2-1-R	es
dc.relation.publisherversion	https://www.sciencedirect.com/science/article/pii/S2214579620300034?via%3Dihub	es
dc.identifier.doi	10.1016/j.bdr.2020.100135	es
dc.journaltitle	Big Data Research	es
dc.publication.volumen	19-20	es
dc.publication.issue	art. nº100135	es
dc.contributor.funder	Ministerio de Ciencia, Innovación y Universidades (MICINN). España	es

Files	Size	Format	View	Description
Automated Deployment of a Spark ...	611.9Kb	[PDF]	View/Open

This item appears in the following collections

Articles (Lenguajes y Sistemas Informáticos)

Except where otherwise noted, this item's license is described as: Attribution-NonCommercial-NoDerivatives 4.0 International