2.77 Sparse Principal Components Analysis (Pro)

Summary

The Sparse Principal Components Analysis app performs sparse PCA on multivariate data sets.

Sparse principal component analysis (Sparse PCA) is a variant of PCA. While PCA finds principal components that are linear combinations of all input variables, Sparse PCA finds principal components whose linear combinations contain only a few input variables. Components built from fewer variables are easier to interpret, so the tool is useful for exploring structure and patterns in data.
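
The difference between dense and sparse loadings can be illustrated with a minimal sketch. It is not the algorithm used by the app: it uses a soft-thresholded power iteration, a simple heuristic, on a made-up covariance matrix. The names `leading_loading` and `lam` (the sparsity penalty) and the matrix values are assumptions introduced only for this illustration.

```python
import math

def normalize(v):
    """Scale v to unit Euclidean length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)

def soft_threshold(v, lam):
    """Shrink every entry toward zero by lam; entries smaller than lam become 0."""
    return [math.copysign(max(abs(x) - lam, 0.0), x) for x in v]

def leading_loading(cov, lam=0.0, iters=200):
    """First loading vector of cov by power iteration.

    lam = 0 gives the ordinary PCA loading; lam > 0 zeroes out small
    loadings at every step, which yields a sparse loading vector.
    """
    p = len(cov)
    v = normalize([1.0] * p)
    for _ in range(iters):
        w = normalize([sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)])
        v = normalize(soft_threshold(w, lam))
    return v

# Hypothetical covariance matrix: variables 1 and 2 vary together strongly,
# variables 3 and 4 are only weakly related to them.
cov = [
    [4.0, 3.0, 0.2, 0.1],
    [3.0, 4.0, 0.1, 0.2],
    [0.2, 0.1, 1.0, 0.0],
    [0.1, 0.2, 0.0, 1.0],
]

dense = leading_loading(cov, lam=0.0)   # every variable gets a nonzero loading
sparse = leading_loading(cov, lam=0.1)  # weak variables are dropped entirely

print("PCA loading:       ", [round(x, 4) for x in dense])
print("Sparse PCA loading:", [round(x, 4) for x in sparse])
```

In this toy case the ordinary loading assigns small but nonzero weights to the third and fourth variables, while the thresholded version sets them exactly to zero, so the component is described by the first two variables only.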

Tutorial

  1. With a worksheet window active, select Data: Connect to File: Text/CSV from the menu to import the sample file <Origin program folder>\Samples\Statistics\Protein Consumption in Europe.dat.

  2. Click the Sparse Principal Components Analysis icon in the Apps Gallery to open the dialog.
  3. In the Input tab, select columns B~J as Input Data and column A as Observations.
    SPCA Input.png
  4. In the Settings tab, clear the Mini Batch check box so that standard Sparse PCA is performed on the data, and set Number of Components to Extract to 4.
    • Mini-batch sparse PCA is a variant of Sparse PCA that is faster but less accurate. Because this dataset is small, Sparse PCA is a suitable choice over Mini-batch sparse PCA.
    • There are two ways to determine a proper number of components to extract:
      • Use the number chosen with the standard PCA tool; refer to its tutorial for how to do this.
      • Observe the Cumulative(%) value in the Adjusted Variance table produced by this tool. Change the number of extracted components and rerun until the Cumulative(%) value is largest.
    Spca settings.png
  5. In the Plots tab, set Component Plot Type to 3D and set the 1st, 2nd, and 3rd components to "1", "2", and "3". Click the OK button to apply the settings and close the dialog.
    Spca plots.png
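
The Cumulative(%) check mentioned in the Settings step can be sketched as follows. How the app computes its Adjusted Variance table is not stated here; this sketch assumes the common adjusted-variance idea for sparse PCA, in which each component is credited only with the variance left after removing what earlier components already explain, since sparse components are generally correlated. The function names, the score columns, and the total variance are hypothetical values chosen for illustration.

```python
def adjusted_variances(scores):
    """Variance each component adds beyond the components before it.

    scores: list of score columns, one per component, each a list of n
    centered values. Earlier components are projected out of each column
    (Gram-Schmidt) before its variance is measured.
    """
    n = len(scores[0])
    basis = []   # orthonormal directions spanned by earlier components
    result = []
    for col in scores:
        resid = list(col)
        for q in basis:
            coef = sum(r * b for r, b in zip(resid, q))
            resid = [r - coef * b for r, b in zip(resid, q)]
        ss = sum(r * r for r in resid)
        result.append(ss / (n - 1))
        if ss > 0:
            norm = ss ** 0.5
            basis.append([r / norm for r in resid])
    return result

def cumulative_percent(variances, total_variance):
    """Running share of total_variance explained, in percent."""
    running, result = 0.0, []
    for v in variances:
        running += v
        result.append(100.0 * running / total_variance)
    return result

# Hypothetical scores for two correlated components over four observations.
scores = [
    [2.0, 0.0, -2.0, 0.0],
    [1.0, 1.0, -1.0, -1.0],
]
total_variance = 4.0  # hypothetical total variance of the input data

adj = adjusted_variances(scores)
cum = cumulative_percent(adj, total_variance)
for k, (a, c) in enumerate(zip(adj, cum), start=1):
    print(f"Component {k}: adjusted variance {a:.3f}, cumulative {c:.2f}%")
```

In this toy case the second component has a raw variance of 4/3 but an adjusted variance of only 2/3, because half of its variation is already captured by the first component. Comparing the last cumulative value across runs with different component counts mirrors the procedure described in the step above.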

Interpreting The Results