Editors Pick

Why MAP and MRR Fail for Search Ranking (and What to Use Instead)

often use Mean Reciprocal Rank (MRR) and Mean Average Precision (MAP) to evaluate the quality of their rankings. In this post, we discuss why MAP and MRR are poorly aligned with modern user behavior in...

Keeping Probabilities Honest: The Jacobian Adjustment

Introduction customer annoyance from wait times. Calls arrive randomly, so wait time X follows an Exponential distribution: most waits are short, but a few are painfully long. Now I’d argue that annoyance isn’t linear: a 10-minute...

Bonferroni vs. Benjamini-Hochberg: Selecting Your P-Value Correction

be a sensitive topic, perhaps best avoided on first encounter with a statistician. The disposition toward the subject has led to a tacit agreement that α = 0.05 is the gold standard; in fact,...

Understanding Vibe Proving

“What I cannot create, I do not understand” (attributed to R. Feynman). After Vibe Coding, we appear to have entered the (very niche, but much cooler) era of Vibe Proving: DeepMind wins gold...

The Best Way to Do Evals on a Bloated RAG Pipeline

to Constructing an Overengineered Retrieval System. That one was about constructing the whole system. This one is about doing the evals for it. In the previous article, I went through different parts of a RAG...

Understanding the Generative AI User

in some interesting conversations recently about designing LLM-based tools for end users, and one of the important product design questions this brings up is “what do people know about...

Lessons Learned After 8 Years of Machine Learning

a decade old now. Back then, OpenAI felt like one (well-baked) startup amongst others. DeepMind was already around, but not yet fully integrated into Google. And, back then, the “triad of deep learning” —...

How I Optimized My Leaf Raking Strategy Using Linear Programming

, and it’s officially leaf-raking season. As I worked on this tedious task, I realized it is essentially one big optimization problem. When raking my leaves, I made 4 piles: one on either side of...
