Linking pages
- Transformers from scratch | peterbloem.nl http://peterbloem.nl/blog/transformers 40 comments
- GitHub - amitness/learning: A log of things I'm learning https://github.com/amitness/learning 17 comments
- How to scale training on multiple GPUs | by Giuliano Giacaglia | Towards Data Science https://towardsdatascience.com/how-to-scale-training-on-multiple-gpus-dae1041f49d2 0 comments
- Efficient PyTorch — Eliminating Bottlenecks | by Eugene Khvedchenya | Towards Data Science https://medium.com/@eugenekhvedchenya/efficient-pytorch-part-1-fe40ed5db76c 0 comments
Linked pages
- GNU Parallel - GNU Project - Free Software Foundation https://www.gnu.org/software/parallel/ 175 comments
- [1810.04805] BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding https://arxiv.org/abs/1810.04805 25 comments
- Improving language understanding with unsupervised learning https://blog.openai.com/language-unsupervised/ 18 comments
- GitHub - cybertronai/gradient-checkpointing: Make huge neural nets fit in memory https://github.com/openai/gradient-checkpointing 3 comments