17.7.2 Partial Least Squares
Partial Least Squares (PLS) combines features of principal components analysis and multiple regression. It first extracts a set of latent factors that explain as much of the covariance between the independent and dependent variables as possible. A regression step then predicts values of the dependent variables using the decomposition of the independent variables.
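The two-stage idea (extract covariance-maximizing latent factors, then regress on them) can be sketched in a few lines of NumPy. This is a minimal single-response NIPALS-style implementation for illustration only; the function and variable names are assumptions, not Origin's internals.

```python
import numpy as np

def pls1(X, y, n_factors):
    """Minimal PLS1 (single-response) fit via NIPALS-style deflation."""
    X = X - X.mean(axis=0)            # center predictors
    y = y - y.mean()                  # center response
    Xk = X.copy()
    W, P, q = [], [], []
    for _ in range(n_factors):
        w = Xk.T @ y                  # weight: direction of maximal covariance with y
        w /= np.linalg.norm(w)
        t = Xk @ w                    # score vector (the latent factor)
        p = Xk.T @ t / (t @ t)        # X loading
        q.append(y @ t / (t @ t))     # y loading: regression of y on the factor
        Xk = Xk - np.outer(t, p)      # deflate X before extracting the next factor
        W.append(w); P.append(p)
    W, P = np.array(W).T, np.array(P).T
    # fold the latent factors back into coefficients on the original predictors
    return W @ np.linalg.inv(P.T @ W) @ np.array(q)

# collinear toy predictors: the third column is nearly a copy of the first
rng = np.random.default_rng(0)
Z = rng.normal(size=(50, 2))
X = np.column_stack([Z[:, 0], Z[:, 1], Z[:, 0] + 1e-3 * rng.normal(size=50)])
y = X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.01, size=50)
B2 = pls1(X, y, 2)                    # two factors already capture the signal
```

Even though the predictors are nearly collinear, two latent factors suffice to predict `y` well, which is exactly the situation PLS is designed for.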
Goals
There are two primary reasons for using PLS:
- Prediction
- PLS is a popular method for constructing a predictive model when the factors are many and highly collinear.
- Data Reduction
- PLS is used to convert a set of highly correlated predictor variables into a smaller set of uncorrelated latent factors.
Processing Procedure
Preparing Analysis Data
PLS is appropriate for variables that are strongly correlated. Since PLS combines PCA and multiple regression, data suitable for PCA can also be analyzed with PLS.
Selecting Computation Methods
SVD or Wold's Iteration
These two methods yield the same result; the difference is that Wold's Iteration is usually slightly faster than SVD. SIMPLS, a method named in some papers, is essentially SVD or Wold's Iteration under another name.
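The equivalence of the two routes can be checked numerically: the first PLS weight vector is the leading left singular vector of the cross-covariance matrix X'Y, and Wold's iteration is a power iteration that converges to the same vector (up to sign). A small sketch, assuming NumPy; the data here are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(40, 5))
Y = rng.normal(size=(40, 2))
Xc, Yc = X - X.mean(axis=0), Y - Y.mean(axis=0)

# SVD route: first weight = leading left singular vector of the cross-covariance
S = Xc.T @ Yc
w_svd = np.linalg.svd(S)[0][:, 0]

# Wold's iteration: alternate between X- and Y-side projections until stable
u = Yc[:, 0]                      # start from the first response column
for _ in range(500):
    w = Xc.T @ u
    w /= np.linalg.norm(w)
    t = Xc @ w
    q = Yc.T @ t
    q /= np.linalg.norm(q)
    u = Yc @ q
```

After convergence, `w` and `w_svd` agree up to an arbitrary sign flip, which is why the two computation methods give the same factors.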
Cross Validation
Verifying the fitted model is an important step, and cross validation allows us to evaluate its predictive performance. Origin uses the leave-one-out method of cross validation: the predicted residual sum of squares (PRESS) and its root mean are used to find the optimal number of factors.
Handling Missing Values
If there are missing values in the independent/dependent variables, the whole case (entire row) will be excluded from the analysis.
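This listwise-deletion rule is easy to replicate when preparing data outside Origin. A minimal NumPy sketch, assuming missing values are encoded as `np.nan`:

```python
import numpy as np

# toy design matrix: two predictor columns and one response column
data = np.array([
    [1.0,    2.0, 0.5],
    [np.nan, 1.0, 0.7],     # missing predictor value -> whole row dropped
    [2.0,    3.0, np.nan],  # missing response value  -> whole row dropped
    [4.0,    1.5, 2.0],
])

# keep only complete cases (listwise deletion)
complete = data[~np.isnan(data).any(axis=1)]
```

Only the first and last rows survive; every row containing at least one missing value is excluded.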
Performing Partial Least Squares
- Select Statistics: Multivariate Analysis: Partial Least Squares from the Origin menu
- Or
- Type pls -d in the Script Window