Clúster Hadoop sobre Rapberry Pi con datos de acelerómetro, giroscopio y sensor de temperatura

Gómez González, Hernán Adrián

Bachelor's Thesis (Trabajo Fin de Grado)

dc.contributor.advisor	Sierra Collado, Antonio Jesús	es
dc.creator	Gómez González, Hernán Adrián	es
dc.date.accessioned	2021-04-13T16:01:44Z
dc.date.available	2021-04-13T16:01:44Z
dc.date.issued	2020
dc.identifier.citation	Gómez González, H.A. (2020). Clúster Hadoop sobre Rapberry Pi con datos de acelerómetro, giroscopio y sensor de temperatura. (Trabajo Fin de Grado Inédito). Universidad de Sevilla, Sevilla.
dc.identifier.uri	https://hdl.handle.net/11441/107051
dc.description.abstract	Vivimos en un mundo conectado donde la cantidad de información que generamos es mayor día a día. Toda esta cantidad de datos contiene información muy valiosa que antes estaba oculta, pero ahora podemos encontrarla y usarla en el progreso de la humanidad. La herramienta que nos permite almacenar, procesar y analizar estos datos es Big Data. La relevancia de esta tecnología es cada vez mayor, y sus posibilidades son casi infinitas. Lo podemos encontrar en las redes sociales, internet, investigaciones científicas, empresas y otros cientos de lugares. Big Data está en todas partes, y sus ventajas son considerables. Una de las tecnologías más usadas en el mundo de Big Data es Apache Hadoop. Hadoop nos permitirá realizar las tareas de Big Data de forma muy cómoda, ofreciendo un amplio ecosistema. Por una parte, con Hadoop podremos almacenar los datos de una manera totalmente fiable y tolerante a fallos. Por otra parte, Hadoop nos permite procesar todos esos datos de una forma sencilla. Cuando hablamos de procesar datos, nos referimos a que nos permite procesar terabytes de información, por ello es una herramienta muy potente. Para permitir todo esto, Hadoop posee tres componentes principales que son HDFS, MapReduce y Yarn. Por tanto, además de explicar qué es Big Data y ver algunas de sus aplicaciones, en este trabajo nos centraremos en Hadoop. Veremos en detalle cada uno de sus componentes y cómo funcionan. Además de explicar cómo funciona teóricamente, también realizaremos una demostración práctica. Para ello instalaremos Hadoop en un clúster de Raspberry Pi. Este clúster estará compuesto por cuatro Raspberry Pi 3 model B+ que permitirán ejecutar Hadoop sin problema. Detallaremos los componentes necesarios, cuál ha sido el proceso de instalación, así como las herramientas que nos ofrece. Hadoop posee un amplio ecosistema en el que cada una de sus diversas aplicaciones ofrece una solución para los diferentes problemas de Big Data. Una de esas herramientas es Apache Pig. Pig nos permite realizar las tareas MapReduce de una forma sencilla y rápida. MapReduce es el framework usado por Hadoop para el procesamiento de datos, por lo que su importancia es vital. De esta forma, también instalaremos Pig en nuestro clúster Hadoop para realizar las tareas MapReduce. Veremos su funcionamiento y profundizaremos en el lenguaje que utiliza Pig para el procesamiento de los datos, el cual es Pig Latin. Estudiaremos todas las operaciones más importantes de Pig Latin y realizaremos un ejemplo con cada una de estas operaciones usando el clúster Hadoop. No podremos usar Hadoop ni Pig sin unos datos que poder almacenar y procesar. Para ello se usará un sensor MPU6050. Este sensor ofrece un acelerómetro, giroscopio y sensor de temperatura. Los datos capturados por este sensor serán almacenados en un fichero de texto el cual podremos almacenar en Hadoop y procesar con Pig. Por tanto, en este trabajo, demostraremos que es posible instalar Hadoop en uno de los ordenadores más baratos y sencillos que existen, como son las Raspberry Pi. Almacenaremos los datos obtenidos durante un período de tiempo por un sensor MPU6050 usando HDFS, el cual es el sistema de almacenamiento de Hadoop. Y, por último, procesaremos estos datos usando Pig, con lo que veremos las operaciones más importantes que posee y su funcionamiento.	es
dc.description.abstract	We live in a connected world in which the amount of information we generate grows day by day. This data contains very valuable information that was previously hidden but that we can now find and use for the progress of humanity. Big Data is the tool that allows us to store, process and analyze this data. The relevance of this technology is increasing, and its possibilities are almost endless. It can be found in social networks, the internet, scientific research, companies and hundreds of other places. Big Data is everywhere, and its advantages are considerable. One of the most widely used technologies in the Big Data world is Apache Hadoop. Hadoop allows us to carry out Big Data tasks conveniently and offers a wide ecosystem. On the one hand, Hadoop allows us to store data in a reliable and fault-tolerant way. On the other hand, Hadoop allows us to process all this data in a simple way. When we talk about processing data, we mean processing terabytes of information, which is why it is a very powerful tool. To make all of this possible, Hadoop has three main components: HDFS, MapReduce and YARN. Therefore, in addition to explaining what Big Data is and studying some of its applications, this work focuses on Hadoop. We will look in detail at each of its components and how they work. In addition to explaining how it works in theory, we will also give a practical demonstration. For this, we will install Hadoop on a Raspberry Pi cluster. This cluster consists of four Raspberry Pi 3 Model B+ boards, which can run Hadoop without any problem. We will detail the necessary components, the installation process, and the tools it offers. Hadoop has a wide ecosystem in which each application offers a solution to a different Big Data problem. One of those tools is Apache Pig. Pig allows us to perform MapReduce tasks simply and quickly. MapReduce is the framework used by Hadoop for data processing, so its importance is vital. We will therefore also install Pig on our Hadoop cluster to perform MapReduce tasks. We will see how it works and study Pig Latin, the language that Pig uses for data processing. We will study the most important Pig Latin operations and run an example of each on the Hadoop cluster. We cannot use Hadoop or Pig without data to store and process. An MPU6050 sensor will be used for this. This sensor provides an accelerometer, a gyroscope and a temperature sensor. The data captured by this sensor will be stored in a text file, which we can store in Hadoop and process with Pig. Therefore, in this work we will demonstrate that it is possible to install Hadoop on one of the cheapest and simplest computers that exist, the Raspberry Pi. We will store the data obtained over a period of time by the MPU6050 sensor using HDFS, the Hadoop storage system. Finally, we will process this data using Pig, studying its most important operations and how they work.	es
dc.format	application/pdf	es
dc.language.iso	spa	es
dc.rights	Attribution-NonCommercial-NoDerivatives 4.0 Internacional	*
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/4.0/	*
dc.title	Clúster Hadoop sobre Rapberry Pi con datos de acelerómetro, giroscopio y sensor de temperatura	es
dc.type	info:eu-repo/semantics/bachelorThesis	es
dc.type.version	info:eu-repo/semantics/publishedVersion	es
dc.rights.accessRights	info:eu-repo/semantics/openAccess	es
dc.contributor.affiliation	Universidad de Sevilla. Departamento de Ingeniería Telemática	es
dc.description.degree	Universidad de Sevilla. Grado en Ingeniería de las Tecnologías de Telecomunicación	es
dc.publication.endPage	105 p.	es

Files	Size	Format	View	Description
TFG-3336-GOMEZ GONZALEZ.pdf	3.323Mb	[PDF]	View/Open

This record appears in the following collections

Grado en Ingeniería de las Tecnologías de Telecomunicación

Except where otherwise noted, the item's license is described as: Attribution-NonCommercial-NoDerivatives 4.0 Internacional