— that he saw further only by standing on the shoulders of giants — captures a timeless truth about science. Every breakthrough rests on countless layers of prior progress, until someday … all...
how neural networks learned. Train them, watch the loss go down, save checkpoints every epoch. Standard workflow. Then I measured training dynamics at 5-step intervals instead of at the epoch level, and all the...
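To make the contrast concrete, here is a minimal sketch (mine, not the article's code; the model, data, and interval are stand-ins) of recording the loss every 5 optimizer steps rather than once per epoch:

```python
import torch
from torch import nn, optim

# Hypothetical stand-ins for the author's model, optimizer, and data.
model = nn.Linear(10, 1)
optimizer = optim.SGD(model.parameters(), lr=1e-2)
loss_fn = nn.MSELoss()

LOG_EVERY = 5  # measure dynamics every 5 steps, not once per epoch
step = 0
loss_history = []

for epoch in range(3):
    for x, y in [(torch.randn(32, 10), torch.randn(32, 1)) for _ in range(20)]:
        optimizer.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        optimizer.step()
        step += 1
        if step % LOG_EVERY == 0:
            # Fine-grained trace of the loss curve between epoch boundaries.
            loss_history.append((step, loss.item()))
```

Epoch-level logging collapses hundreds of updates into one point; step-level logging keeps the structure in between, which is presumably what the author's measurements revealed.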
with my series of AI paper recommendations. My long-term followers might recall the four previous editions. I’ve been away from writing for quite a while, and I couldn’t...
, someone claims they’ve invented a revolutionary AI architecture. But if you see the same mathematical pattern — selective amplification + normalization — emerge independently from gradient descent, evolution, and chemical reactions, you realize...
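The excerpt names the pattern but not a formula. One familiar instance (my illustration, not necessarily the author's example) is softmax: exponentiation selectively amplifies the larger inputs, and dividing by the sum normalizes the result.

```python
import numpy as np

def amplify_and_normalize(scores):
    # Selective amplification: exponentiation boosts larger scores
    # disproportionately (max subtracted for numerical stability).
    amplified = np.exp(scores - scores.max())
    # Normalization: rescale so the outputs sum to 1.
    return amplified / amplified.sum()

print(amplify_and_normalize(np.array([1.0, 2.0, 3.0])))  # ~[0.09, 0.24, 0.67]
```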
Welcome back to the Tiny Giant series — a series where I share what I’ve learned about MobileNet architectures. In the previous two articles I covered MobileNetV1 and MobileNetV2. Take a look at references ...
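As a quick refresher (a minimal sketch, not the series' own code), the building block MobileNetV1 is known for is the depthwise-separable convolution: a per-channel spatial filter followed by a 1x1 pointwise mix.

```python
import torch.nn as nn

def depthwise_separable(in_ch, out_ch, stride=1):
    """Depthwise-separable convolution block in the MobileNetV1 style."""
    return nn.Sequential(
        # Depthwise: groups=in_ch applies one 3x3 filter per input channel.
        nn.Conv2d(in_ch, in_ch, kernel_size=3, stride=stride,
                  padding=1, groups=in_ch, bias=False),
        nn.BatchNorm2d(in_ch),
        nn.ReLU(inplace=True),
        # Pointwise: 1x1 convolution mixes information across channels.
        nn.Conv2d(in_ch, out_ch, kernel_size=1, bias=False),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )

block = depthwise_separable(32, 64)  # e.g. 32 -> 64 channels
```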
the way you’d teach a robot to land a drone without programming each move? That’s exactly what I set out to explore. I spent weeks building a game where a virtual drone has...
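The usual way to "teach without programming each move" is to score outcomes and let a learning algorithm find the moves. The article's actual environment isn't shown, so this reward-shaping sketch is purely hypothetical:

```python
def landing_reward(altitude, vertical_speed, landed):
    """Illustrative reward for a drone-landing game (not the author's)."""
    # Encourage getting low and descending slowly at every step.
    reward = -0.1 * altitude - 0.5 * abs(vertical_speed)
    if landed:
        # A soft touchdown earns a large bonus; a hard impact is penalized.
        reward += 100.0 if abs(vertical_speed) < 1.0 else -100.0
    return reward

print(landing_reward(altitude=0.0, vertical_speed=0.4, landed=True))  # soft landing
```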
While working on my knowledge-distillation project for intent classification, I faced a puzzling roadblock. My setup involved a teacher model, RoBERTa-large (fine-tuned on my intent-classification task), and a student model, which...
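For context, the standard distillation objective (Hinton et al.) blends a temperature-softened KL term against the teacher's logits with ordinary cross-entropy on the hard labels. A minimal sketch, with illustrative defaults for the temperature and mixing weight (the author's values aren't shown):

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Soft-target KL at temperature T blended with hard-label cross-entropy."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)  # rescale so gradient magnitudes stay comparable across T
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard
```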