Show HN: I built a tiny LLM to demystify how language models work
I built a ~9M-parameter LLM from scratch to understand how these models actually work.
Vanilla transformer, 60K synthetic conversations, ~130 lines of PyTorch.
Trains in 5 min on a free Colab T4.
The model plays a fish, and the fish thinks the meaning of life is food.
Fork it and swap the personality for your own character.
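
For a rough sense of where a figure like ~9M parameters can come from, here is a back-of-the-envelope count for a vanilla decoder-only transformer. The post does not state the actual hyperparameters, so the vocabulary size, width, depth, and context length below are hypothetical values picked only to land in the same ballpark. The function name `count_params` is also made up for this sketch.

```python
def count_params(vocab_size, d_model, n_layers, context_len,
                 ffn_mult=4, tied_head=True):
    """Count learnable parameters in a plain decoder-only transformer.

    Assumes learned positional embeddings, biases on every linear layer,
    two LayerNorms per block, and one final LayerNorm. These are common
    choices, not necessarily the ones used in the project.
    """
    token_emb = vocab_size * d_model
    pos_emb = context_len * d_model

    # Attention: Q, K, V, and output projections, each d_model x d_model + bias.
    attn = 4 * (d_model * d_model + d_model)

    # Feed-forward: expand to d_ff, then project back, both with biases.
    d_ff = ffn_mult * d_model
    mlp = (d_model * d_ff + d_ff) + (d_ff * d_model + d_model)

    # Two LayerNorms per block, each with a scale and a shift vector.
    norms = 2 * 2 * d_model

    per_layer = attn + mlp + norms
    final_norm = 2 * d_model

    # With weight tying, the output head reuses the token embedding matrix.
    head = 0 if tied_head else vocab_size * d_model

    return token_emb + pos_emb + n_layers * per_layer + final_norm + head


# Hypothetical configuration, NOT taken from the post.
total = count_params(vocab_size=4096, d_model=384, n_layers=4, context_len=128)
print(f"{total:,} parameters (~{total / 1e6:.1f}M)")
```

With these assumed values the count comes out a little under 9M, and most of it sits in the transformer blocks rather than the embeddings. Changing `d_model` moves the total fastest, since the per-block cost grows with its square.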
Comments URL: https://news.ycombinator.com/item?id=47655408 | Points: 6 | Comments: 0
