17.7.2 Partial Least Squares
Partial Least Squares (PLS) combines features of principal components analysis and multiple regression. It first extracts a set of latent factors that explain as much of the covariance between the independent and dependent variables as possible. A regression step then predicts values of the dependent variables using the decomposition of the independent variables.
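The two-stage idea (extract covariance-maximizing latent factors, then regress on them) can be sketched in a few lines of NumPy. This is a minimal single-response NIPALS-style implementation for illustration only; the function and variable names are assumptions, not Origin's internals.

```python
import numpy as np

def pls1(X, y, n_factors):
    """Minimal PLS1 (single-response) fit via NIPALS-style deflation."""
    X = X - X.mean(axis=0)            # center predictors
    y = y - y.mean()                  # center response
    Xk = X.copy()
    W, P, q = [], [], []
    for _ in range(n_factors):
        w = Xk.T @ y                  # weight: direction of maximal covariance with y
        w /= np.linalg.norm(w)
        t = Xk @ w                    # score vector (the latent factor)
        p = Xk.T @ t / (t @ t)        # X loading
        q.append(y @ t / (t @ t))     # y loading: regression of y on the factor
        Xk = Xk - np.outer(t, p)      # deflate X before extracting the next factor
        W.append(w); P.append(p)
    W, P = np.array(W).T, np.array(P).T
    # fold the latent factors back into coefficients on the original predictors
    return W @ np.linalg.inv(P.T @ W) @ np.array(q)

# collinear toy predictors: the third column is nearly a copy of the first
rng = np.random.default_rng(0)
Z = rng.normal(size=(50, 2))
X = np.column_stack([Z[:, 0], Z[:, 1], Z[:, 0] + 1e-3 * rng.normal(size=50)])
y = X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.01, size=50)
B2 = pls1(X, y, 2)                    # two factors already capture the signal
```

Even though the predictors are nearly collinear, two latent factors suffice to predict `y` well, which is exactly the situation PLS is designed for.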
Goals
There are two primary reasons for using PLS:
- Prediction
- PLS is a popular method for constructing a predictive model when the factors are many and highly collinear.
- Data Reduction
- PLS is used to convert a set of highly correlated predictor variables into a smaller set of uncorrelated latent factors.
Processing Procedure
Preparing Analysis Data
PLS is appropriate for variables that are strongly correlated. Since PLS combines PCA and multiple regression, data suitable for PCA can also be analyzed with PLS.
Selecting Computation Methods
SVD or Wold's Iteration
These two methods yield the same result; the difference is that Wold's Iteration is usually slightly faster than SVD. SIMPLS, a method named in some papers, is essentially SVD or Wold's Iteration under another name.
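The equivalence of the two routes can be checked numerically: the first PLS weight vector is the leading left singular vector of the cross-covariance matrix X'Y, and Wold's iteration is a power iteration that converges to the same vector (up to sign). A small sketch, assuming NumPy; the data here are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(40, 5))
Y = rng.normal(size=(40, 2))
Xc, Yc = X - X.mean(axis=0), Y - Y.mean(axis=0)

# SVD route: first weight = leading left singular vector of the cross-covariance
S = Xc.T @ Yc
w_svd = np.linalg.svd(S)[0][:, 0]

# Wold's iteration: alternate between X- and Y-side projections until stable
u = Yc[:, 0]                      # start from the first response column
for _ in range(500):
    w = Xc.T @ u
    w /= np.linalg.norm(w)
    t = Xc @ w
    q = Yc.T @ t
    q /= np.linalg.norm(q)
    u = Yc @ q
```

After convergence, `w` and `w_svd` agree up to an arbitrary sign flip, which is why the two computation methods give the same factors.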
Cross Validation
Verifying the fitted model is an important step, and cross validation allows us to evaluate its predictive performance. Origin uses the leave-one-out method of cross validation: the predicted residual sum of squares (PRESS) and its root mean are used to find the optimal number of factors.
Handling Missing Values
If there are missing values in the independent/dependent variables, the whole case (entire row) will be excluded from the analysis.
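This listwise-deletion rule is easy to replicate when preparing data outside Origin. A minimal NumPy sketch, assuming missing values are encoded as `np.nan`:

```python
import numpy as np

# toy design matrix: two predictor columns and one response column
data = np.array([
    [1.0,    2.0, 0.5],
    [np.nan, 1.0, 0.7],     # missing predictor value -> whole row dropped
    [2.0,    3.0, np.nan],  # missing response value  -> whole row dropped
    [4.0,    1.5, 2.0],
])

# keep only complete cases (listwise deletion)
complete = data[~np.isnan(data).any(axis=1)]
```

Only the first and last rows survive; every row containing at least one missing value is excluded.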
Performing Partial Least Squares
- Select Statistics: Multivariate Analysis: Partial Least Squares from the Origin menu
- Or
- Type pls -d in the Script Window