Modified Locally Linear Embedding

Main Concept¶

Locally Linear Embedding (LLE) is based on a fundamental principle: while high-dimensional data may lie on a nonlinear manifold, each small neighborhood can be approximated as linear. The method preserves the geometric properties of these local neighborhoods while reducing dimensionality.

The core idea relies on two key assumptions:

Each data point can be reconstructed from its neighbors via linear combinations
These same reconstruction relationships should be preserved in the lower-dimensional space

To understand this transformation, consider a manifold $\mathcal{M}$ embedded in $\mathbb{R}^D$ . LLE maps points $x_i \in \mathbb{R}^D$ to $y_i \in \mathbb{R}^d$ (where $d < D$ ) while preserving the local geometric relationships between neighboring points.

Theoretical Aspect¶

The method involves two sequential optimization problems:

First Optimization: Finding Reconstruction Weights
$\min_{W} \sum_{i=1}^N \left\|x_i - \sum_{j=1}^N W_{ij}x_j\right\|^2$
(1)

Subject to constraints:

$W_{ij} = 0$ if $x_j$ is not a k-nearest neighbor of $x_i$
$\sum_{j=1}^N W_{ij} = 1$ for each $i$ (affine reconstruction)

Second Optimization: Computing Lower-dimensional Embeddings
$\min_{Y} \sum_{i=1}^N \left\|y_i - \sum_{j=1}^N W_{ij}y_j\right\|^2$
(2)

Subject to constraints:

$\frac{1}{N}\sum_{i=1}^N y_i = 0$ (centered at origin)
$\frac{1}{N}\sum_{i=1}^N y_iy_i^T = I$ (unit covariance)

The key variables being optimized are:

Weight matrix $W \in \mathbb{R}^{N \times N}$
Embedded coordinates $Y \in \mathbb{R}^{N \times d}$

Solution Methodology¶

The solution process involves three main steps:

Neighborhood Identification:

For each point $x_i$ , identify its k nearest neighbors using Euclidean distance
Define neighborhood matrix $\eta_{ij}$ where $\eta_{ij} = 1$ if $x_j$ is a neighbor of $x_i$

Weight Computation:

For each point $x_i$ , solve the local optimization:
$\min_{w_i} \left\|x_i - \sum_{j \in \mathcal{N}_i} w_{ij}x_j\right\|^2$
(3)
This reduces to solving the linear system:
$G_{jk}w_i = 1$
(4)
where $G_{jk} = (x_i - x_j)^T(x_i - x_k)$ is the local Gram matrix

Embedding Computation:

Define matrix $M = (I-W)^T(I-W)$
The optimal embedding coordinates are given by the eigenvectors of $M$ corresponding to its smallest nonzero eigenvalues
If $\lambda_0 \leq \lambda_1 \leq ... \leq \lambda_N$ are eigenvalues of $M$ , then: $Y = [\nu_1, \nu_2, ..., \nu_d]$ where $\nu_i$ is the eigenvector corresponding to $\lambda_i$

Global Optimality¶

The optimization landscape has several important properties:

Weight Computation:

Globally optimal solution exists for each local neighborhood
Solution uniqueness requires: $\text{rank}(G) = k-1$ where $k$ is the number of neighbors
Condition number $\kappa(G)$ affects numerical stability

Embedding Computation:

Global solution obtained via eigendecomposition
Uniqueness up to rotations and reflections
Solution quality depends on spectral gap $\lambda_{d+1} - \lambda_d$

Limitations include:

Topological Constraints:

Method assumes manifold is locally linear
Fails when: $\frac{\partial^2 f}{\partial x_i \partial x_j} \gg 0$ (high curvature) where $f$ describes the manifold

Scale Dependencies:

Results sensitive to neighborhood size $k$
Optimal $k$ relates to local curvature: $k \sim \mathcal{O}(\log N)$ where $N$ is sample size

Statistical Limitations:

Requires sufficient sampling density: $N > \mathcal{O}(d \log d)$ where $d$ is intrinsic dimension
Performance degrades with noise: $\sigma_{noise} \ll \sigma_{data}$ required for stability

Conclusion¶

The method provides global optimality for each step individually, but the combined solution depends critically on hyperparameter choices and data properties.

Manifold Learning

Locally Linear Embedding

Manifold Learning

Hessian Eigenmapping