deep learning

Your Next ‘Large’ Language Model Might Not Be Large After All

Since the conception of AI, researchers have always held faith in scale: that general intelligence is an emergent property born of size. If we just keep adding...

How Relevance Models Foreshadowed Transformers for NLP

— that he saw further only by standing on the shoulders of giants — captures a timeless truth about science. Every breakthrough rests on countless layers of prior progress, until someday … all...

I Measured Neural Network Training Every 5 Steps for 10,000 Iterations

how neural networks learned. Train them, watch the loss go down, save checkpoints every epoch. Standard workflow. Then I measured training dynamics at 5-step intervals instead of at epoch level, and all the...
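
The workflow change described here, logging metrics every few optimizer steps rather than once per epoch, can be sketched in a few lines. Below is a minimal illustration assuming a PyTorch-style training loop; the toy model, synthetic data, and the log_interval name are placeholders, not the author's actual setup.

```python
import torch
import torch.nn as nn

# Toy model and optimizer, purely for illustration.
model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 1))
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)
loss_fn = nn.MSELoss()

log_interval = 5   # record metrics every 5 steps instead of every epoch
history = []       # (step, loss, gradient norm) tuples

for step in range(10_000):
    x, y = torch.randn(128, 32), torch.randn(128, 1)  # synthetic batch
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    # clip_grad_norm_ with an infinite threshold just reports the total
    # gradient norm without actually clipping anything.
    grad_norm = nn.utils.clip_grad_norm_(model.parameters(), float("inf"))
    optimizer.step()

    if step % log_interval == 0:
        history.append((step, loss.item(), grad_norm.item()))
```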

AI Papers to Read in 2025

with my series of AI paper recommendations. My long-term followers might recall the four previous editions. I’ve been away from writing for quite a while, and I couldn’t...

We Didn’t Invent Attention — We Just Rediscovered It

, someone claims they’ve invented a revolutionary AI architecture. But if you see the same mathematical pattern (selective amplification + normalization) emerge independently from gradient descent, evolution, and chemical reactions, you realize...

MobileNetV3 Paper Walkthrough: The Tiny Giant Getting Even Smarter

Welcome back to the Tiny Giant series, a series where I share what I’ve learned about MobileNet architectures. In the past two articles I covered MobileNetV1 and MobileNetV2. Check out the references ...

Deep Reinforcement Learning: 0 to 100

how you’d teach a robot to land a drone without programming each move? That’s exactly what I set out to explore. I spent weeks building a game where a virtual drone has...
