Omkaar Kamath

How to Train an LLM: Part 3
How to Train an LLM: Part 2
How to Train an LLM: Part 1
Cut Cross Entropy > Torch's version? No.
Cut Cross Entropy from first principles
Decisive guide on Speculative Decoding
The Illustrated FastVLM: Apple
So, why is attention quadratic?
The Illustrated LFM-2: Liquid AI
Floating point maths made simple
Why >> How
Why I made this website