Omkaar's Recipes

How to Train an LLM: Part 4

How to Train an LLM: Part 3

How to Train an LLM: Part 2

How to Train an LLM: Part 1

Cut Cross Entropy > Torch's version? No.

Cut Cross Entropy from first principles

Decisive guide on Speculative Decoding

The Illustrated FastVLM: Apple

So, why is attention quadratic?

The Illustrated LFM-2: Liquid AI

Floating point maths made simple

Why I made this website