Master's Final Project
Modelo PLS
Author/s | Espejo Alonso, Lucía |
Director | Pino Mejías, Rafael |
Department | Universidad de Sevilla. Departamento de Estadística e Investigación Operativa |
Publication Date | 2017-06 |
Deposit Date | 2017-07-26 |
Academic Title | Universidad de Sevilla. Máster Universitario en Matemáticas |
Abstract | The Partial Least Squares (PLS) approach is a multivariate technique originated around 1975 by Herman Wold for modelling complicated data sets in terms of chains of matrices (blocks), the so-called path model. It included a simple but efficient way to estimate the parameters of these models, called NIPALS (non-linear iterative partial least squares). Around 1980, the simplest PLS model, with two blocks (X, the set of predictor variables, and Y, the set of response variables), was slightly modified by his son Svante Wold and Harald Martens to better suit data from science and technology, and it proved useful for complicated data sets where ordinary regression was difficult or impossible to apply. Partial Least Squares Regression (PLSR) addresses the problem that arises when many predictor variables are strongly dependent on one another (the multicollinearity problem). To do so, PLSR finds a set of new variables, created as linear combinations of the original predictors while taking the response variables into account, so that the multicollinearity problem is eliminated. Principal Component Regression (PCR) also addresses multicollinearity but, unlike PLSR, uses only the predictor variables to create the new variables. Nowadays, PLSR is widely used to model problems in market research, economics, chemistry, biology, communication and medicine, among other fields. This work is structured as follows: Chapter 2 reviews principal component regression. Chapter 3 introduces partial least squares regression. Chapter 4 describes the NIPALS algorithm and the properties deduced from it; in addition, the interpretations are presented and the regression coefficients are derived. Chapter 5 details alternative algorithms. Finally, Chapter 6 describes the R package that implements both PCR and PLSR and presents several examples. |
Citation | Espejo Alonso, L. (2017). Modelo PLS. (Trabajo Fin de Máster Inédito). Universidad de Sevilla, Sevilla. |
Files | Size | Format | View | Description |
---|---|---|---|---|
Espejo Alonso Lucía TFG.pdf | 1.210Mb | [PDF] | View | |