Linking pages
- It Looks Like You’re Trying To Take Over The World · Gwern.net https://www.gwern.net/fiction/Clippy 33 comments
- How AI Training Scales https://blog.openai.com/science-of-ai/ 15 comments
- Just Ask for Generalization | Eric Jang https://evjang.com/2021/10/23/generalization.html 6 comments
- AI and Efficiency https://openai.com/blog/ai-and-efficiency/ 1 comment
- Optimizing Elastic Deep Learning in GPU Clusters | CASL Project | PyTorch https://medium.com/pytorch/optimizing-elastic-deep-learning-in-gpu-clusters-with-adaptdl-for-pytorch-1d979b246d5d 1 comment
- Speeding Up Transformer Training and Inference By Increasing Model Size – The Berkeley Artificial Intelligence Research Blog https://bair.berkeley.edu/blog/2020/03/05/compress/ 0 comments
- AutoML | AutoRL: AutoML for RL https://www.automl.org/blog-autorl/ 0 comments
- What is Intelligence?. Exploring artificial and biological… | by Egor Dezhic | Towards Data Science https://towardsdatascience.com/what-is-intelligence-a69cbd8bb1b4 0 comments
- How to get 4x speedup and better generalization using the right batch size | by Daniel Huynh | Towards Data Science https://medium.com/@danielhuynh_48554/implementing-a-batch-size-finder-in-fastai-how-to-get-a-4x-speedup-with-better-generalization-813d686f6bdf 0 comments
- rlpyt: A Research Code Base for Deep Reinforcement Learning in PyTorch – The Berkeley Artificial Intelligence Research Blog https://bair.berkeley.edu/blog/2019/09/24/rlpyt/ 0 comments
- GitHub - astooke/rlpyt: Reinforcement Learning in PyTorch https://github.com/astooke/rlpyt 0 comments
- GitHub - crowsonkb/k-diffusion: Karras et al. (2022) diffusion models for PyTorch https://github.com/crowsonkb/k-diffusion 0 comments
- Five years of progress in GPTs - by Finbarr Timbers https://finbarrtimbers.substack.com/p/five-years-of-progress-in-gpts 0 comments
- The Practitioner's Guide to the Maximal Update Parameterization | EleutherAI Blog https://blog.eleuther.ai/mutransfer/ 0 comments
Related searches:
Search title: [1812.06162] An Empirical Model of Large-Batch Training