Statistics

Causal ML for the Aspiring Data Scientist

Limitations of Machine Learning. As a data scientist in today’s digital age, you need to be equipped to answer a wide range of questions that go far beyond simple pattern recognition. Typical machine learning...

From Transactions to Trends: Predict When a Customer Is About to Stop Buying

...how math can solve so many problems in the real world. When I was in grade school, I definitely didn't see it that way. I never hated math, by the way,...

A Case for the T-statistic

Introduction: ... I began thinking about the parallels between point-anomaly detection and trend detection. When it comes to points, it’s generally intuitive, and the z-score solves most problems. What took me some time to figure out was applying...

Google Trends is Misleading You: How to Do Machine Learning with Google Trends Data

...What a gift to society that is. If not for Google Trends, how would we have ever known that more Disney movies released in the 2000s led to fewer divorces in the UK? Or that drinking...

The Machine Learning “Advent Calendar” Bonus 1: AUC in Excel

..., we'll implement AUC in Excel. AUC is commonly used as a performance metric for classification tasks. But we start with a confusion matrix, because that's where everyone begins in practice. Then we'll see why a...

Keeping Probabilities Honest: The Jacobian Adjustment

Introduction: ... customer annoyance from wait times. Calls arrive randomly, so wait time X follows an Exponential distribution; most waits are short, and just a few are painfully long. Now I’d argue that annoyance isn’t linear: a 10-minute...

The Machine Learning “Advent Calendar” Day 20: Gradient Boosted Linear Regression in Excel

..., we covered ensemble learning with voting, bagging, and Random Forest. Voting itself is simply an aggregation mechanism. It doesn't create diversity, but combines predictions from models that already differ. Bagging, however, explicitly creates diversity by training...

The Machine Learning “Advent Calendar” Day 19: Bagging in Excel

For 18 days, we have explored many of the core machine learning models, organized into three major families: distance- and density-based models, tree- or rule-based models, and weight-based models. Up to now, each article focused...
