Home
David Paluy edited this page May 4, 2026 · 5 revisions
Ruby port of DeepEval.
- Understanding A/B Comparison: why paired t-tests with p-values are the right tool for comparing two models, and how to read the significance markers in `RubricLLM.compare` output.
- Why Retrieval Metrics Are Pure Math: what `precision_at_k`, `recall_at_k`, `mrr`, `ndcg`, and `hit_rate` actually compute, and why they're free of judge bias.
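
As a rough illustration of why the retrieval metrics need no LLM judge, the sketch below computes all five from nothing but a ranked list of document IDs and a list of relevant IDs. The `RetrievalSketch` module, its method signatures, and the binary-relevance assumption are hypothetical and chosen for this example; they are not taken from RubricLLM's source, whose actual API may differ.

```ruby
# Hypothetical helper module for illustration only; not RubricLLM's implementation.
# Assumes binary relevance: a document is either relevant or it is not.
module RetrievalSketch
  module_function

  # Fraction of the top-k retrieved documents that are relevant.
  # Assumption: the denominator is k, even if fewer than k documents were retrieved.
  def precision_at_k(retrieved, relevant, k)
    return 0.0 if k <= 0
    retrieved.first(k).count { |doc| relevant.include?(doc) } / k.to_f
  end

  # Fraction of all relevant documents that appear in the top k.
  def recall_at_k(retrieved, relevant, k)
    return 0.0 if relevant.empty? || k <= 0
    retrieved.first(k).uniq.count { |doc| relevant.include?(doc) } / relevant.size.to_f
  end

  # Reciprocal rank of the first relevant document (0.0 if none is retrieved).
  def mrr(retrieved, relevant)
    index = retrieved.index { |doc| relevant.include?(doc) }
    index ? 1.0 / (index + 1) : 0.0
  end

  # 1.0 if at least one relevant document is in the top k, else 0.0.
  def hit_rate(retrieved, relevant, k)
    return 0.0 if k <= 0
    retrieved.first(k).any? { |doc| relevant.include?(doc) } ? 1.0 : 0.0
  end

  # Discounted cumulative gain of the top k, normalized by the best possible ordering.
  def ndcg(retrieved, relevant, k)
    return 0.0 if k <= 0
    gains = retrieved.first(k).map { |doc| relevant.include?(doc) ? 1.0 : 0.0 }
    dcg = gains.each_with_index.sum { |gain, i| gain / Math.log2(i + 2) }
    ideal_hits = [relevant.size, k].min
    idcg = (0...ideal_hits).sum { |i| 1.0 / Math.log2(i + 2) }
    idcg.zero? ? 0.0 : dcg / idcg
  end
end

retrieved = %w[d3 d1 d7 d2] # ranked results, best first
relevant  = %w[d1 d2]       # ground-truth relevant IDs

puts RetrievalSketch.precision_at_k(retrieved, relevant, 2) # 0.5
puts RetrievalSketch.recall_at_k(retrieved, relevant, 2)    # 0.5
puts RetrievalSketch.mrr(retrieved, relevant)               # 0.5
puts RetrievalSketch.hit_rate(retrieved, relevant, 1)       # 0.0
puts RetrievalSketch.ndcg(retrieved, relevant, 4)           # roughly 0.65
```

Every value here is a deterministic function of the two input lists, so repeated runs give identical scores and no judge model is involved.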