Isomap
Summary
- Isomap builds a neighborhood graph, estimates geodesic distances, and embeds points to preserve those distances.
- It is effective when data lies on curved manifolds such as Swiss-roll-like structures.
- n_neighbors is critical because it controls graph connectivity and embedding quality.
- Prerequisite: Principal Component Analysis (PCA). Understanding it first will make Isomap easier to learn.
Intuition
Isomap replaces straight-line distance with distance measured along the data manifold. By preserving manifold distances, it can unroll curved geometry into a meaningful low-dimensional map.
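To make the difference between the two notions of distance concrete, here is a small sketch on a toy manifold. The half-circle of radius 1 and the variable names are illustrative assumptions, not part of Isomap itself.

```python
import math

# Two points at opposite ends of a half-circle of radius 1
# (a toy one-dimensional manifold embedded in two dimensions).
theta_a, theta_b = 0.0, math.pi
a = (math.cos(theta_a), math.sin(theta_a))
b = (math.cos(theta_b), math.sin(theta_b))

# Straight-line distance cuts through the ambient space (the chord).
straight_line = math.dist(a, b)

# Manifold distance follows the curve (arc length = radius * angle, radius = 1).
along_manifold = abs(theta_b - theta_a)

print(f"straight-line: {straight_line:.3f}, along the curve: {along_manifold:.3f}")
```

The straight-line distance understates how far apart the two points are when you can only travel along the curve, and that along-the-curve distance is the quantity Isomap tries to preserve.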
Detailed Explanation
1. Idea
- Build a neighbour graph with either a fixed number of neighbours k or a radius ε.
- Compute the shortest-path (geodesic) distances along that graph for every pair of samples.
- Feed the resulting distance matrix to classical MDS to obtain the low-dimensional embedding.
This workflow keeps far-apart regions separated while unrolling the curved surface.
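The three steps above can be sketched from scratch as follows. This is a minimal illustration rather than a reference implementation: the Swiss-roll dataset, the sample size, and n_neighbors=10 are assumptions chosen for the example.

```python
import numpy as np
from scipy.sparse.csgraph import shortest_path
from sklearn.datasets import make_swiss_roll
from sklearn.neighbors import kneighbors_graph

X, _ = make_swiss_roll(n_samples=500, random_state=0)

# Step 1: k-nearest-neighbour graph whose edge weights are Euclidean distances.
knn = kneighbors_graph(X, n_neighbors=10, mode="distance")

# Step 2: shortest-path (geodesic) distances between every pair of samples.
# directed=False treats each neighbour edge as usable in both directions.
D = shortest_path(knn, method="D", directed=False)
if not np.isfinite(D).all():
    raise ValueError("neighbour graph is disconnected; try a larger n_neighbors")

# Step 3: classical MDS. Double-centre the squared distances, then keep the
# eigenvectors belonging to the largest eigenvalues.
n = D.shape[0]
J = np.eye(n) - np.ones((n, n)) / n
B = -0.5 * J @ (D ** 2) @ J
eigvals, eigvecs = np.linalg.eigh(B)
top = np.argsort(eigvals)[::-1][:2]
embedding = eigvecs[:, top] * np.sqrt(eigvals[top])

print(embedding.shape)
```

Dense all-pairs distances need memory quadratic in the number of samples, so a sketch like this is only practical for small datasets.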
2. Python example
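A minimal sketch of how this might look with scikit-learn's Isomap estimator. The synthetic Swiss roll, the noise level, and the parameter values are illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import make_swiss_roll
from sklearn.manifold import Isomap

# X holds 3-D points on a rolled-up sheet; t is the position along the roll.
X, t = make_swiss_roll(n_samples=1000, noise=0.05, random_state=0)

# Embed into two dimensions using 10 neighbours per point.
iso = Isomap(n_neighbors=10, n_components=2)
Z = iso.fit_transform(X)

print("input shape:    ", X.shape)
print("embedding shape:", Z.shape)
```

Plotting Z[:, 0] against Z[:, 1] and colouring the points by t is a quick way to judge whether the roll has been flattened.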
3. Hyperparameters
- n_neighbors: governs the local neighbourhood; too small breaks the graph, too large washes out the manifold structure.
- n_components: usually 2 or 3 for visualisation, but higher values are possible.
- Handle duplicates/noisy samples by scaling features or adding slight noise before building the graph.
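One way to explore n_neighbors is to refit with a few candidate values and compare a summary statistic. The sketch below uses scikit-learn's Isomap.reconstruction_error(); the candidate values and the dataset are assumptions, and the error is only a rough guide that should be read alongside a plot of the embedding.

```python
import math

from sklearn.datasets import make_swiss_roll
from sklearn.manifold import Isomap

X, _ = make_swiss_roll(n_samples=600, random_state=0)

errors = {}
for k in (8, 15, 40):  # hypothetical candidate values
    iso = Isomap(n_neighbors=k, n_components=2).fit(X)
    errors[k] = iso.reconstruction_error()
    print(f"n_neighbors={k:>2}  reconstruction error={errors[k]:.3f}")
```

The values are not directly comparable in a strict sense, because each n_neighbors defines a different geodesic distance matrix, so treat them as a sanity check rather than a selection criterion.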
4. Pros and cons
| Pros | Cons |
|---|---|
| Preserves manifold structure and distances | Sensitive to noisy neighbour graphs |
| Produces intuitive visualisations | Requires computing all-pairs shortest paths |
5. Notes
- Isomap = neighbourhood graph + shortest-path (geodesic) distances + classical MDS; pick neighbours carefully to reflect the true topology.
- Inspect the connected components of the graph: isolated points will distort the embedding.
- Consider UMAP or t-SNE if you need faster embeddings or better preservation of local neighbourhood structure.
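The connectivity check mentioned above can be sketched with SciPy's connected_components. The two-cluster toy data and the two n_neighbors values are assumptions chosen to show a disconnected and a connected graph.

```python
import numpy as np
from scipy.sparse.csgraph import connected_components
from sklearn.neighbors import kneighbors_graph

rng = np.random.default_rng(0)
# Two tight, well-separated clusters of 50 points each:
# a small neighbourhood cannot bridge the gap between them.
X = np.vstack([
    rng.normal(0.0, 0.1, size=(50, 2)),
    rng.normal(10.0, 0.1, size=(50, 2)),
])

components = {}
for k in (5, 60):
    graph = kneighbors_graph(X, n_neighbors=k, mode="connectivity")
    # directed=False treats each neighbour edge as usable in both directions.
    n_comp, labels = connected_components(graph, directed=False)
    components[k] = n_comp
    print(f"n_neighbors={k}: {n_comp} connected component(s)")
```

If more than one component is reported, geodesic distances between components are undefined, so consider raising n_neighbors or embedding each component separately.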