Global vs. Local
Global methods try to preserve the overall structure of the data, while local methods focus on preserving relationships within local neighborhoods.
| Algorithm | Approach | Goal | Strengths | Weaknesses | Computational Cost | Global/Local |
| --- | --- | --- | --- | --- | --- | --- |
| MDS (Multi-dimensional Scaling) | Preserving pairwise distances | Embed data points in a lower dimension while preserving distances | Works well when pairwise distances are meaningful, classic method | Computationally expensive for large datasets, sensitive to noise in distance measurements, global optimization can be hard | High | Global |
| IsoMap | Geodesic distance on neighborhood graph | Preserve global geodesic distances | Robust to outliers, captures global structure well | Sensitive to “shortcuts” in graph construction, computationally expensive for large datasets | High | Global |
| Locally Linear Embedding | Local linear reconstruction of points | Preserve local linear relationships | Computationally efficient, good for locally smooth manifolds | Sensitive to noise, can have issues with non-convex manifolds, requires careful neighborhood selection | Medium | Local |
| MLLE (Modified LLE) | Multiple local linear reconstructions | Improve robustness of LLE to noise and sampling density variations | More robust to noise and sampling variations than LLE | More computationally expensive than LLE, still sensitive to neighborhood selection | Medium-High | Local |
| HLLE (Hessian Eigenmapping) | Hessian of the manifold | Capture local curvature information | Less sensitive to parameter tuning than LLE variants, can handle some non-convexities | Computationally expensive, can be sensitive to noise | High | Local |
| LTSA (Local Tangent Space Alignment) | Aligning local tangent spaces | Preserve local geometry by aligning tangent spaces | Robust to noise, can handle some non-convexities | Computationally more expensive than LLE, requires careful parameter tuning | Medium-High | Local |
| Spectral Embedding (Laplacian Eigenmaps) | Graph Laplacian Eigenvectors | Preserve local neighborhood relationships (similar to LLE in spirit) | Computationally efficient, widely applicable | Sensitive to noise, can have issues with disconnected graphs | Medium | Local |
| t-SNE (t-distributed Stochastic Neighbor Embedding) | Probabilistic similarity based on t-distribution | Visualize high-dimensional data in 2D or 3D | Excellent for visualization, reveals clusters well | Computationally intensive, sensitive to parameter tuning (perplexity), global structure is not well preserved, can create misleading “clusters” | High | Primarily Local, with some global tendencies |
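
To make the global/local distinction concrete, the following is a minimal sketch that runs each algorithm from the comparison above using scikit-learn's `sklearn.manifold` module and reports two rough scores per embedding. Several things here are assumptions made for illustration and do not come from the comparison itself: the S-curve toy dataset, the neighborhood size of 12, and the helper `pairwise_distance_agreement`, which is a hypothetical and fairly crude proxy for global structure (it correlates Euclidean distances, so it favors MDS by construction). The local score uses scikit-learn's `trustworthiness`.

```python
import numpy as np
from sklearn.datasets import make_s_curve
from sklearn.manifold import (
    MDS,
    TSNE,
    Isomap,
    LocallyLinearEmbedding,
    SpectralEmbedding,
    trustworthiness,
)
from sklearn.metrics import pairwise_distances

# Assumed toy data: a 2D sheet bent into an S shape in 3D.
X, _ = make_s_curve(n_samples=300, random_state=0)

N_NEIGHBORS = 12  # illustrative choice, not a tuned value


def lle(method):
    """Build an LLE variant; the dense solver is slower but stable on small data."""
    return LocallyLinearEmbedding(
        n_neighbors=N_NEIGHBORS,
        n_components=2,
        method=method,
        eigen_solver="dense",
    )


estimators = {
    "MDS": MDS(n_components=2, random_state=0),
    "Isomap": Isomap(n_neighbors=N_NEIGHBORS, n_components=2),
    "LLE": lle("standard"),
    "MLLE": lle("modified"),
    "HLLE": lle("hessian"),
    "LTSA": lle("ltsa"),
    "Spectral Embedding": SpectralEmbedding(
        n_components=2, n_neighbors=N_NEIGHBORS, random_state=0
    ),
    "t-SNE": TSNE(n_components=2, perplexity=30, random_state=0),
}


def pairwise_distance_agreement(X_high, X_low):
    """Hypothetical global proxy: Pearson correlation between all pairwise
    Euclidean distances before and after the embedding."""
    upper = np.triu_indices(len(X_high), k=1)
    d_high = pairwise_distances(X_high)[upper]
    d_low = pairwise_distances(X_low)[upper]
    return float(np.corrcoef(d_high, d_low)[0, 1])


results = {}
for name, estimator in estimators.items():
    Y = estimator.fit_transform(X)
    results[name] = {
        "shape": Y.shape,
        # Local score: are embedded neighbors also neighbors in the input?
        "local": float(trustworthiness(X, Y, n_neighbors=N_NEIGHBORS)),
        # Global score: do large input distances stay large in the embedding?
        "global": pairwise_distance_agreement(X, Y),
    }

for name, scores in results.items():
    print(f"{name:20s} local={scores['local']:.3f}  global={scores['global']:.3f}")
```

The scores depend on the dataset, the neighborhood size, and the random seed, so they are best read as a quick sanity check rather than a ranking. Because Isomap targets geodesic rather than Euclidean distances, a low value of the global proxy on a strongly curved manifold does not by itself mean Isomap failed.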