Key points
- Hit Rate measures the proportion of users (or queries) whose top-k list contains at least one relevant item.
- Hit Rate@k is computed from each user’s top-k candidate list and then averaged, so it shows how many users are covered by at least one relevant item.
- We review its relation to Recall and Precision and when to use it.
1. Definition
For a query \(q\), let \(G_q\) be the set of relevant items and \(S_{q,k}\) the top-k retrieved items. With \(Q\) queries in total: $$ \mathrm{HR@k} = \frac{1}{Q} \sum_{q=1}^{Q} \mathbf{1}\{ G_q \cap S_{q,k} \ne \emptyset \} $$ The indicator \(\mathbf{1}\{\cdot\}\) is 1 if at least one relevant item appears in the top-k and 0 otherwise. The overall Hit Rate is the mean of this indicator across queries.
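As a minimal sketch of this definition, the snippet below evaluates the indicator for each query with plain Python sets and averages the results. The function name `hit_rate_from_sets` and the toy item IDs are illustrative assumptions, not part of any library.

```python
from typing import Hashable, Sequence, Set


def hit_rate_from_sets(
    relevant: Sequence[Set[Hashable]],
    ranked: Sequence[Sequence[Hashable]],
    k: int,
) -> float:
    """Mean over queries of 1{G_q ∩ S_{q,k} ≠ ∅} (hypothetical helper)."""
    if len(relevant) != len(ranked):
        raise ValueError("relevant and ranked must have one entry per query")
    if not relevant:
        raise ValueError("at least one query is required")
    hits = [
        1.0 if set(items[:k]) & g else 0.0  # non-empty intersection is a hit
        for g, items in zip(relevant, ranked)
    ]
    return sum(hits) / len(hits)


# Toy data (made up for illustration): three queries, ranked lists best first.
relevant = [{"a"}, {"x", "y"}, {"m"}]
ranked = [["b", "a", "c"], ["p", "q", "r"], ["m", "n", "o"]]

hr_at_1 = hit_rate_from_sets(relevant, ranked, k=1)  # only the third query hits
hr_at_2 = hit_rate_from_sets(relevant, ranked, k=2)  # first and third queries hit
```

With this toy data, increasing k from 1 to 2 turns the first query into a hit, while the second query never hits because none of its relevant items were retrieved.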
2. Computing in Python
import numpy as np


def hit_rate_at_k(y_true: np.ndarray, y_score: np.ndarray, k: int) -> float:
    """Return 1.0 if the top-k predictions for one query contain a relevant item.

    Args:
        y_true: Binary label array (1 for relevant).
        y_score: Model output scores, aligned with y_true.
        k: Number of top-ranked items to consider.

    Returns:
        1.0 if a hit exists, otherwise 0.0.
    """
    # Indices of the k highest-scoring items, best first.
    idx = np.argsort(-y_score)[:k]
    return 1.0 if y_true[idx].sum() > 0 else 0.0


# y_true and y_score are assumed to hold one array per query (or user).
hits = [hit_rate_at_k(y_true[q], y_score[q], k=10) for q in range(len(y_true))]
hr_at_10 = np.mean(hits)
print("Hit Rate@10:", round(hr_at_10, 3))
When there are multiple queries (or users), average the hit values.
Hit Rate@k resembles Recall@k, but it only checks whether any correct item appears in the top-k, not what fraction of the relevant items was retrieved.
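For a quick end-to-end check, the following sketch applies the same idea to a small made-up batch in which every query has the same number of candidate items. The 2-D array layout and the name `batch_hit_rate_at_k` are assumptions for illustration.

```python
import numpy as np


def batch_hit_rate_at_k(y_true: np.ndarray, y_score: np.ndarray, k: int) -> float:
    """Hit Rate@k for arrays of shape (n_queries, n_items) (hypothetical helper)."""
    # Column indices of the k highest scores in each row, best first.
    top_k = np.argsort(-y_score, axis=1)[:, :k]
    # Look up the labels of those top-k items row by row.
    top_k_labels = np.take_along_axis(y_true, top_k, axis=1)
    # A query is a hit if any of its top-k labels is positive.
    return float((top_k_labels.sum(axis=1) > 0).mean())


# Toy data (made up): 3 queries, 4 candidate items each.
y_true = np.array([
    [0, 1, 0, 0],
    [0, 0, 0, 1],
    [1, 0, 0, 0],
])
y_score = np.array([
    [0.9, 0.8, 0.1, 0.2],
    [0.7, 0.6, 0.5, 0.1],
    [0.2, 0.9, 0.8, 0.7],
])

hr_at_2 = batch_hit_rate_at_k(y_true, y_score, k=2)  # only the first query hits
hr_at_4 = batch_hit_rate_at_k(y_true, y_score, k=4)  # every query hits
```

When k equals the number of candidates, every query with at least one relevant item is a hit, so the metric is only informative for k well below the list length.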
3. Advantages
- Intuitive: Directly expresses the percentage of users who received at least one correct result.
- More lenient than Recall@k: Counts success if any correct item appears, even if others are missed.
- Useful for early-stage evaluation: Provides a rough but informative indicator for initial model comparison.
4. Practical Applications
- Recommendation systems: Measure the percentage of users whose clicked or purchased item was in the candidate list.
- E-commerce: Check whether personalized or popular products are included in recommendations.
- A/B testing: Before running online experiments, filter models by improvements in HR@k offline.
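One common offline setup, assumed here rather than prescribed by the list above, holds out one clicked or purchased item per user and checks whether it appears in that user's top-k candidates. The dictionaries, user IDs, and function name below are hypothetical.

```python
from typing import Dict, List


def holdout_hit_rate(
    held_out: Dict[str, str],
    candidates: Dict[str, List[str]],
    k: int,
) -> float:
    """Share of users whose held-out item is in their top-k candidates."""
    if not held_out:
        raise ValueError("held_out must contain at least one user")
    hits = 0
    for user, item in held_out.items():
        # Users with no candidate list count as misses.
        top_k = candidates.get(user, [])[:k]
        hits += item in top_k
    return hits / len(held_out)


# Made-up logs: one held-out purchase per user, candidates ranked best first.
held_out = {"u1": "shoes", "u2": "lamp", "u3": "book"}
candidates = {
    "u1": ["hat", "shoes", "belt"],
    "u2": ["desk", "chair", "lamp"],
    # "u3" received no recommendations in this toy example.
}

hr_at_2 = holdout_hit_rate(held_out, candidates, k=2)
hr_at_3 = holdout_hit_rate(held_out, candidates, k=3)
```

Counting users without any candidates as misses is a design choice in this sketch. Excluding them instead would raise the score, so it is worth stating which convention a reported HR@k uses.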
5. Relation to Other Metrics
| Metric | Description | Why Combine |
|---|---|---|
| Hit Rate | Binary success per user | Simple, but doesn’t show completeness |
| Recall@k | Correct items / total relevant items | Captures completeness when multiple relevant items exist |
| Precision@k | Correct items / retrieved items | Detects noise even if Hit Rate is high |
| NDCG | Gain discounted by rank position | Shows whether the hit appears near the top |
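To make the comparison concrete, here is a hedged sketch that computes all four metrics for a single made-up query. The helper `metrics_at_k` is hypothetical, and the NDCG shown is one common binary-relevance formulation with a \(1/\log_2(\text{rank}+1)\) discount; other variants exist.

```python
import numpy as np


def metrics_at_k(y_true: np.ndarray, y_score: np.ndarray, k: int) -> dict:
    """Hit, Recall@k, Precision@k and binary NDCG@k for a single query."""
    order = np.argsort(-y_score)[:k]       # top-k item indices, best first
    rel = y_true[order].astype(float)      # relevance of the retrieved items
    n_relevant = int(y_true.sum())
    # Discount 1 / log2(rank + 1) for ranks 1..len(rel).
    discounts = 1.0 / np.log2(np.arange(2, rel.size + 2))
    dcg = float((rel * discounts).sum())
    # Ideal ranking places all relevant items (up to k) at the top.
    ideal_hits = min(n_relevant, rel.size)
    idcg = float(discounts[:ideal_hits].sum())
    return {
        "hit": float(rel.sum() > 0),
        "recall": float(rel.sum() / n_relevant) if n_relevant else 0.0,
        "precision": float(rel.sum() / k),
        "ndcg": dcg / idcg if idcg > 0 else 0.0,
    }


# Made-up query: 2 relevant items out of 5; only one is retrieved, at rank 3.
y_true = np.array([0, 1, 0, 1, 0])
y_score = np.array([0.9, 0.1, 0.8, 0.7, 0.2])
m = metrics_at_k(y_true, y_score, k=3)
```

In this example the hit is 1 even though only half of the relevant items were retrieved and the single hit sits at the bottom of the top-3, which is the gap that Recall@k, Precision@k, and NDCG are meant to expose.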
Summary
- Hit Rate is a simple and interpretable metric indicating whether users received at least one correct item.
- Combining it with Recall@k or NDCG provides deeper insight into both coverage and ranking quality.
- It’s a practical KPI for early evaluation and business-level explanations.