Partial least squares regression

Because both the X and Y data are projected to new spaces, the PLS family of methods is known as bilinear factor models.

Partial least squares discriminant analysis (PLS-DA) is a variant used when the response Y is categorical.
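As a hedged illustration (not taken from the source), PLS-DA is commonly carried out by encoding the class labels as indicator (one-hot) columns of Y, fitting an ordinary PLS regression, and assigning each sample to the class whose column receives the largest fitted value. The sketch below uses scikit-learn's PLSRegression on invented synthetic data.

import numpy as np
from sklearn.cross_decomposition import PLSRegression

# Synthetic example: 30 samples, 10 predictors, 3 classes (assumed data).
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 10))
labels = rng.integers(0, 3, size=30)

# One-hot encode the categorical response so it can be used in a regression.
Y = np.eye(3)[labels]

# Fit a 2-component PLS model on the indicator matrix (PLS-DA).
pls_da = PLSRegression(n_components=2).fit(X, Y)

# Classify by taking the class whose indicator column gets the largest prediction.
predicted = pls_da.predict(X).argmax(axis=1)
print("training accuracy:", (predicted == labels).mean())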

PLS is used to find the fundamental relations between two matrices (X and Y); it is a latent variable approach to modeling the covariance structures in these two spaces.

PLS regression is particularly suited when the matrix of predictors has more variables than observations, and when there is multicollinearity among X values.
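To make this concrete, the following sketch (an illustration, not from the source) fits a PLS model on synthetic data with more predictors than observations and strongly correlated columns, a setting where ordinary least squares is not directly usable; scikit-learn's PLSRegression is assumed, and the data are invented.

import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.metrics import r2_score

rng = np.random.default_rng(1)

# 20 observations, 100 predictors: more variables than samples (p > n).
n, p = 20, 100
latent = rng.normal(size=(n, 2))                                       # two hidden factors
X = latent @ rng.normal(size=(2, p)) + 0.1 * rng.normal(size=(n, p))   # highly collinear columns
y = latent @ np.array([1.5, -2.0]) + 0.1 * rng.normal(size=n)

# A low-dimensional PLS model remains well defined despite p > n and multicollinearity.
pls = PLSRegression(n_components=2).fit(X, y)
print("R^2 on the training data:", r2_score(y, pls.predict(X).ravel()))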

Although the original applications were in the social sciences, PLS regression is today most widely used in chemometrics and related areas.

The general underlying model of PLS is

X = T P^T + E
Y = U Q^T + F

where X is an n x m matrix of predictors, Y is an n x p matrix of responses, T and U are n x l score matrices (the projections of X and of Y), P and Q are, respectively, m x l and p x l loading matrices, and E and F are error terms. The decompositions of X and Y are made so as to maximize the covariance between T and U.[4] Below, the algorithm is given in matrix notation.
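As a concrete illustration of these matrices (a sketch only: it assumes scikit-learn's PLSRegression, where the training scores T are obtained with transform(X) and the loadings P as x_loadings_, and the synthetic data are invented), the product T P^T approximates the centered and scaled X, the remainder playing the role of E.

import numpy as np
from sklearn.cross_decomposition import PLSRegression

rng = np.random.default_rng(2)
X = rng.normal(size=(50, 6))
Y = X[:, :2] @ rng.normal(size=(2, 3)) + 0.1 * rng.normal(size=(50, 3))

# Fit a 2-component model; scikit-learn centers and scales X and Y internally.
pls = PLSRegression(n_components=2).fit(X, Y)

T = pls.transform(X)        # score matrix T (projections of X)
P = pls.x_loadings_         # loading matrix P

# T P^T approximates the centered/scaled X; the leftover part plays the role of E.
Xs = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
E = Xs - T @ P.T
print("relative norm of E:", np.linalg.norm(E) / np.linalg.norm(Xs))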

A number of variants of PLS exist for estimating the factor and loading matrices T, U, P and Q.

PLS consists of iteratively repeating a small set of steps k times (once for each of the k components). PLS1 is a widely used variant of the algorithm, appropriate for the case where Y is a single vector; an illustrative sketch is given below.
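The following is a minimal NumPy sketch of a PLS1-style iteration, assuming centered data, a single response vector y, and deflation of X only; it is an illustration rather than a reference implementation, and the synthetic usage data are invented.

import numpy as np

def pls1(X, y, k):
    """Illustrative PLS1: returns weights, scores, loadings, and coefficients for k components."""
    X = X - X.mean(axis=0)          # work with centered data
    y = y - y.mean()
    n, m = X.shape
    W = np.zeros((m, k))            # X weight vectors
    T = np.zeros((n, k))            # X scores
    P = np.zeros((m, k))            # X loadings
    q = np.zeros(k)                 # y loadings
    Xk = X.copy()
    for a in range(k):
        w = Xk.T @ y                # direction of maximal covariance with y
        w /= np.linalg.norm(w)
        t = Xk @ w                  # scores for this component
        tt = t @ t
        p = Xk.T @ t / tt           # X loading
        q[a] = y @ t / tt           # y loading
        Xk = Xk - np.outer(t, p)    # deflate X before the next component
        W[:, a], T[:, a], P[:, a] = w, t, p
    # regression coefficients in the original (centered) X space
    B = W @ np.linalg.inv(P.T @ W) @ q
    return W, T, P, q, B

# Example usage on invented data: predictions are y_mean + (X - X_mean) B.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 8))
y = X[:, 0] - 2 * X[:, 3] + 0.1 * rng.normal(size=40)
W, T, P, q, B = pls1(X, y, k=2)
y_hat = y.mean() + (X - X.mean(axis=0)) @ B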

In 2002, a new method called orthogonal projections to latent structures (OPLS) was published.

In OPLS, continuous variable data is separated into predictive and uncorrelated (orthogonal) information.[12] Similarly, OPLS-DA (discriminant analysis) may be applied when working with discrete variables, as in classification and biomarker studies.

The general underlying model of OPLS is

X = T P^T + T_orth P_orth^T + E
Y = U Q^T + F

or, in O2-PLS,[13]

X = T P^T + T_orth P_orth^T + E
Y = U Q^T + U_orth Q_orth^T + F

where the additional score and loading matrices capture the variation in each block that is orthogonal to the other block. Another extension of PLS regression, named L-PLS for its L-shaped matrices, connects 3 related data blocks to improve predictability.
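The orthogonal-filtering step at the heart of OPLS can be sketched as follows: a simplified, single-response NumPy illustration of the idea of removing y-orthogonal variation from X before fitting PLS. The function name, loop structure, and synthetic inputs are assumptions, not the published implementation.

import numpy as np

def opls_filter(X, y, n_orth=1):
    """Illustrative single-y O-PLS filter: strips y-orthogonal variation from centered X."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    T_orth, P_orth = [], []
    for _ in range(n_orth):
        w = Xc.T @ yc / (yc @ yc)              # predictive weight vector
        w /= np.linalg.norm(w)
        t = Xc @ w                             # predictive scores
        p = Xc.T @ t / (t @ t)                 # loadings
        w_orth = p - (w @ p) * w               # part of p orthogonal to the predictive direction
        w_orth /= np.linalg.norm(w_orth)
        t_orth = Xc @ w_orth                   # y-orthogonal scores
        p_orth = Xc.T @ t_orth / (t_orth @ t_orth)
        Xc = Xc - np.outer(t_orth, p_orth)     # remove the orthogonal component
        T_orth.append(t_orth)
        P_orth.append(p_orth)
    return Xc, np.column_stack(T_orth), np.column_stack(P_orth)

An ordinary PLS model (or PLS-DA, for classification) is then fitted to the filtered matrix returned by this step.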

In 2015, partial least squares was related to a procedure called the three-pass regression filter (3PRF).[15] When the number of observations and variables is large, the 3PRF (and hence PLS) is asymptotically normal for the "best" forecast implied by a linear latent factor model.
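A compact sketch of the three passes is given below, assuming ordinary least squares in each pass and a matrix Z of proxy variables; this is an illustrative NumPy version, not the cited paper's code, and the names are invented. Using the target itself as the sole proxy is the sense in which the procedure is related to PLS forecasts.

import numpy as np

def three_pass_regression_filter(X, y, Z):
    """Illustrative 3PRF: X is T x N predictors, y is a length-T target, Z is T x L proxies."""
    T, N = X.shape
    ones_T = np.ones((T, 1))
    ones_N = np.ones((N, 1))

    # Pass 1: time-series regression of each predictor on the proxies -> loadings phi (N x L).
    A = np.hstack([ones_T, Z])
    phi = np.linalg.lstsq(A, X, rcond=None)[0][1:].T

    # Pass 2: cross-section regression of X_t on the loadings, period by period -> factors F (T x L).
    B = np.hstack([ones_N, phi])
    F = np.linalg.lstsq(B, X.T, rcond=None)[0][1:].T

    # Pass 3: predictive regression of y on the extracted factors.
    C = np.hstack([ones_T, F])
    beta = np.linalg.lstsq(C, y, rcond=None)[0]
    return C @ beta, beta, F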

In stock market data, PLS has been shown to provide accurate out-of-sample forecasts of returns and cash-flow growth.[16]

A PLS version based on singular value decomposition (SVD) provides a memory-efficient implementation that can be used to address high-dimensional problems, such as relating millions of genetic markers to thousands of imaging features in imaging genetics, on consumer-grade hardware.
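As a rough illustration of the SVD-based flavour of PLS (scikit-learn's PLSSVD, which takes an SVD of the centered cross-covariance X^T Y), the toy sizes below are assumptions and far smaller than the imaging-genetics setting described above; memory-efficient implementations would additionally compute the cross-product in blocks.

import numpy as np
from sklearn.cross_decomposition import PLSSVD

rng = np.random.default_rng(3)
n, m, p = 100, 500, 50                       # samples, "markers", "imaging features" (toy sizes)
shared = rng.normal(size=(n, 3))             # a shared low-rank signal between the two blocks
X = shared @ rng.normal(size=(3, m)) + rng.normal(size=(n, m))
Y = shared @ rng.normal(size=(3, p)) + rng.normal(size=(n, p))

# PLSSVD extracts paired score vectors from the SVD of the cross-covariance of X and Y.
pls_svd = PLSSVD(n_components=3).fit(X, Y)
x_scores, y_scores = pls_svd.transform(X, Y)
print(x_scores.shape, y_scores.shape)        # (100, 3) (100, 3)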

Partial least squares correlation (PLSC) typically divides the data into two blocks (sub-groups), each containing one or more variables, and then uses singular value decomposition (SVD) to establish the strength of any relationship (i.e. the amount of shared information) that might exist between the two component sub-groups.[22] It does this by using SVD to determine the inertia (i.e. the sum of the singular values) of the covariance matrix of the sub-groups under consideration.
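In outline, this amounts to decomposing the cross-block matrix by SVD and reading the inertia off the singular values. The NumPy sketch below is illustrative; z-scoring each block is an assumption about the preprocessing, and the data are invented.

import numpy as np

rng = np.random.default_rng(4)
n = 60
X = rng.normal(size=(n, 8))                                              # first block of variables
Y = 0.5 * X[:, :3] @ rng.normal(size=(3, 5)) + rng.normal(size=(n, 5))   # second block

# z-score each block so the cross-product behaves like a correlation matrix
Xz = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
Yz = (Y - Y.mean(axis=0)) / Y.std(axis=0, ddof=1)

# SVD of the cross-block matrix: the singular vectors give the paired saliences,
# and the sum of the singular values is the inertia (amount of shared information).
R = Yz.T @ Xz / (n - 1)
U, s, Vt = np.linalg.svd(R, full_matrices=False)
print("inertia (sum of singular values):", s.sum())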

[Figure: core idea of PLS; the loading vectors in the input and output space are drawn in red (not normalized, for better visibility).]
[Figure: geometric interpretation of the deflation step in the input space.]