Statistics

Modeling Urban Walking Risk Using Spatial-Temporal Machine Learning

After dinner in downtown San Francisco, I said goodbye to friends and pulled out my phone to figure out how to get home. It was close to 11:30 pm, and Uber estimates were unusually...

Causal ML for the Aspiring Data Scientist

Limitations of Machine Learning: As a data scientist in today’s digital age, you need to be equipped to answer a wide variety of questions that go far beyond simple pattern recognition. Typical machine learning...

From Transactions to Trends: Predict When a Customer Is About to Stop Buying

...how math can solve so many problems in the real world. When I was in grade school, I definitely didn't see it that way. I never hated math, by the way,...

A Case for the T-statistic

Introduction: I began thinking about the parallels between point-anomaly detection and trend detection. When it comes to points, it’s generally intuitive, and the z-score solves most problems. What took me some time to figure out was applying...

Google Trends is Misleading You: How to Do Machine Learning with Google Trends Data

...What a gift to society that is. If not for Google Trends, how would we have ever known that more Disney movies released in the 2000s led to fewer divorces in the UK? Or that drinking...

The Machine Learning “Advent Calendar” Bonus 1: AUC in Excel

..., we'll implement AUC in Excel. AUC is commonly used as a performance metric for classification tasks. But we start with a confusion matrix, because that's where everyone begins in practice. Then we'll see why a...

Keeping Probabilities Honest: The Jacobian Adjustment

Introduction: ...customer annoyance from wait times. Calls arrive randomly, so wait time X follows an Exponential distribution; most waits are short, but a few are painfully long. Now I’d argue that annoyance isn’t linear: a 10-minute...

The Machine Learning “Advent Calendar” Day 20: Gradient Boosted Linear Regression in Excel

..., we looked at ensemble learning with voting, bagging, and Random Forest. Voting itself is simply an aggregation mechanism. It doesn't create diversity; it only combines predictions from models that already differ. Bagging, however, explicitly creates diversity by training...
