Hacker News
- Modded-NanoGPT: NanoGPT (124M) quality in 3.25B tokens https://github.com/KellerJordan/modded-nanogpt 11 comments
Linking pages
- GitHub - KellerJordan/Muon: Muon optimizer for neural networks https://github.com/KellerJordan/Muon 0 comments
- [AINews] LMSys killed Model Versioning (gpt 4o 1120, gemini exp 1121) • Buttondown https://buttondown.com/ainews/archive/ainews-lmsys-killed-model-versioning-gpt-4o-1120/ 0 comments
- Muon: An optimizer for hidden layers in neural networks | Keller Jordan blog https://kellerjordan.github.io/posts/muon/ 0 comments
Linked pages
- GitHub - karpathy/llm.c: LLM training in simple, raw C/CUDA https://github.com/karpathy/llm.c 169 comments
- Reproducing GPT-2 (124M) in llm.c in 90 minutes for $20 · karpathy/llm.c · Discussion #481 · GitHub https://github.com/karpathy/llm.c/discussions/481 117 comments
- GitHub - KellerJordan/cifar10-airbench: 94% on CIFAR-10 in 3.29 seconds 💨 https://github.com/KellerJordan/cifar10-airbench 1 comment