4.3.12
Top-k Accuracy
Summary
- Understand the fundamentals of this metric, what it evaluates, and how to interpret the results.
- Compute and visualise the metric with Python 3.13 code examples, covering key steps and practical checkpoints.
- Combine charts and complementary metrics for effective model comparison and threshold tuning.
- Prerequisite: Accuracy. Understanding this concept first will make learning smoother.
1. Definition
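If the model produces a score for each class and \(S_k(x)\) denotes the set of the k highest-scoring classes for input \(x\), then
$$ \mathrm{Top\text{-}k\ Accuracy} = \frac{1}{n} \sum_{i=1}^n \mathbf{1}\{ y_i \in S_k(x_i) \} $$
where \(n\) is the number of samples and \(y_i\) is the ground-truth label of \(x_i\). A hit is recorded when the ground-truth label is among the top k predictions. With a single ground-truth label per sample, this is equivalent to Recall@k.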
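The definition can be sketched directly in NumPy. This is a minimal illustration, not a reference implementation: the helper name `top_k_accuracy` and the toy scores are invented here, and it assumes column j of the score matrix corresponds to class label j.

```python
import numpy as np

def top_k_accuracy(y_true, scores, k):
    """Fraction of samples whose true label is among the k highest-scoring classes.

    Assumes column j of `scores` corresponds to class label j.
    """
    y_true = np.asarray(y_true)
    scores = np.asarray(scores)
    # Indices of the k largest scores per row: this plays the role of S_k(x_i).
    top_k = np.argsort(-scores, axis=1, kind="stable")[:, :k]
    # Indicator 1{y_i in S_k(x_i)} for each sample.
    hits = (top_k == y_true[:, None]).any(axis=1)
    return hits.mean()

# Toy data (made up for illustration): 4 samples, 3 classes.
scores = np.array([
    [0.6, 0.3, 0.1],  # true class 0 is ranked 1st
    [0.2, 0.5, 0.3],  # true class 2 is ranked 2nd
    [0.7, 0.2, 0.1],  # true class 2 is ranked 3rd
    [0.1, 0.3, 0.6],  # true class 1 is ranked 2nd
])
y_true = np.array([0, 2, 2, 1])

print(top_k_accuracy(y_true, scores, k=1))  # 0.25
print(top_k_accuracy(y_true, scores, k=2))  # 0.75
print(top_k_accuracy(y_true, scores, k=3))  # 1.0
```

The metric can only stay the same or increase as k grows, reaching 1.0 once k equals the number of classes.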
2. Computing in Python 3.13
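`top_k_accuracy_score` takes the true labels and a matrix of per-class scores (for example, class probabilities) and returns the metric for any value of k you choose. Passing the `labels` array tells it which class each score column corresponds to, which makes the computation robust to class-order differences.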
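A minimal sketch of such a call is shown here. The toy labels and scores are invented for illustration and are not the page's original example. `k=3` is left out because scikit-learn warns that k greater than or equal to the number of classes yields a trivially perfect score.

```python
import numpy as np
from sklearn.metrics import top_k_accuracy_score

# Toy data (made up for illustration): 4 samples, 3 classes.
classes = np.array([0, 1, 2])  # column order of y_score
y_true = np.array([0, 2, 2, 1])
y_score = np.array([
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.7, 0.2, 0.1],
    [0.1, 0.3, 0.6],
])

# Evaluate several k values; `labels` pins each score column to a class.
results = {}
for k in (1, 2):
    results[k] = top_k_accuracy_score(y_true, y_score, k=k, labels=classes)
    print(f"top-{k} accuracy: {results[k]:.2f}")
# top-1 accuracy: 0.25
# top-2 accuracy: 0.75
```

With a fitted classifier, `y_score` would typically come from `predict_proba` and `labels` from the estimator's `classes_` attribute.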
3. Design considerations
- Choosing k: Match k to the length of the shortlist your UI or product can display.
- Ties: Equal scores can make rankings unstable across runs; use a stable sort or add a small deterministic tie-breaker to the scores when needed.
- Multiple ground truths: For genuine multi-label tasks, complement Top-k Accuracy with Precision@k and Recall@k.
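The tie-handling and multi-label points above can be sketched together. The helper `precision_recall_at_k` and the toy matrices are hypothetical, written for this illustration. The sketch assumes a binary relevance matrix in which every row has at least one relevant item.

```python
import numpy as np

def precision_recall_at_k(relevant, scores, k):
    """Mean Precision@k and Recall@k over samples.

    relevant: binary matrix (n_samples, n_items); 1 marks a ground-truth item.
    scores:   matrix of model scores with the same shape.
    Assumes every row of `relevant` contains at least one 1.
    """
    relevant = np.asarray(relevant)
    scores = np.asarray(scores)
    # Stable sort on negated scores: tied items keep their column order,
    # so the ranking is the same on every run.
    top_k = np.argsort(-scores, axis=1, kind="stable")[:, :k]
    # Number of relevant items that made it into each row's top k.
    hits = np.take_along_axis(relevant, top_k, axis=1).sum(axis=1)
    precision = hits / k
    recall = hits / relevant.sum(axis=1)
    return precision.mean(), recall.mean()

# Toy data (made up for illustration): 2 samples, 4 candidate items.
relevant = np.array([
    [1, 0, 1, 0],  # two ground-truth items
    [0, 1, 0, 0],  # one ground-truth item
])
scores = np.array([
    [0.9, 0.5, 0.5, 0.1],  # items 1 and 2 are tied
    [0.2, 0.3, 0.4, 0.1],
])

p_at_2, r_at_2 = precision_recall_at_k(relevant, scores, k=2)
print(p_at_2, r_at_2)  # 0.5 0.75
```

In the first row the tie is resolved in favour of the lower column index, so item 1 (not relevant) takes the second slot instead of item 2 (relevant). A different tie-breaking rule would change both numbers, which is why the rule should be fixed and documented.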
4. Practical applications
- Recommendation systems: Check whether the items users eventually choose appear in the suggested slate.
- Image classification: Datasets with many classes (e.g., ImageNet) traditionally report Top-5 Accuracy.
- Search & retrieval: Evaluate if the relevant document shows up within the top 10 results.
5. Comparison with other ranking metrics
| Metric | What it captures | Typical use case |
|---|---|---|
| Top-k Accuracy | Whether the correct label is within the top k | Shortlists where any hit counts |
| NDCG | Rank-weighted relevance | When placement near the top carries more value |
| MAP | Precision averaged over the ranks of relevant items, then over queries | Rankings with multiple relevant items |
| Hit Rate | Synonymous with Top-k Accuracy | Common in recommender-system literature |
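To see why a rank-weighted metric can separate models that Top-k Accuracy treats as equal, consider this small sketch. It covers only the special case of one relevant label per sample with gain 1, where the ideal DCG is 1 and NDCG reduces to 1 / log2(rank + 1). The helper names and the two toy "models" are invented for illustration.

```python
import numpy as np

def rank_of_true_label(y_true, scores):
    """1-based rank of the true label in each row's descending score order."""
    order = np.argsort(-scores, axis=1, kind="stable")
    return (order == y_true[:, None]).argmax(axis=1) + 1

def top_k_and_ndcg(y_true, scores, k):
    """Top-k accuracy and mean NDCG@k for single-label, binary-relevance data."""
    ranks = rank_of_true_label(y_true, scores)
    hit = ranks <= k
    # One relevant item with gain 1: the ideal DCG is 1, so NDCG@k is
    # 1 / log2(rank + 1) on a hit and 0 on a miss.
    ndcg = np.where(hit, 1.0 / np.log2(ranks + 1), 0.0)
    return hit.mean(), ndcg.mean()

y_true = np.array([0, 0])
# Model A ranks the true label first in both samples.
model_a = np.array([[0.70, 0.20, 0.10, 0.00],
                    [0.60, 0.30, 0.10, 0.00]])
# Model B ranks the true label third in both samples.
model_b = np.array([[0.20, 0.40, 0.30, 0.10],
                    [0.15, 0.50, 0.30, 0.05]])

acc_a, ndcg_a = top_k_and_ndcg(y_true, model_a, k=3)
acc_b, ndcg_b = top_k_and_ndcg(y_true, model_b, k=3)
print(acc_a, ndcg_a)  # 1.0 1.0
print(acc_b, ndcg_b)  # 1.0 0.5
```

Both models reach a Top-3 Accuracy of 1.0, but NDCG@3 drops to 0.5 for the model that places the correct label third.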
Takeaways
- Top-k Accuracy checks whether the correct answer appears among the top k candidates, a simple yet powerful signal.
- scikit-learn’s `top_k_accuracy_score` lets you evaluate multiple k values with minimal effort.
- Combine it with NDCG or MAP to account for rank quality and cases with several relevant options.