Hacker News
- [D] LLaMA training vs. GPU time: smaller models seem better for a given budget https://espadrine.github.io/blog/posts/chinchilla-s-death.html#Can_Chinchillas_picture_a_Llama_s_sights_ 8 comments machinelearning
Linked pages
- Ultraviolet catastrophe - Wikipedia https://en.wikipedia.org/wiki/Ultraviolet_catastrophe 17 comments
- [2203.15556] Training Compute-Optimal Large Language Models https://arxiv.org/abs/2203.15556 0 comments
- [2001.08361] Scaling Laws for Neural Language Models https://arxiv.org/abs/2001.08361 0 comments