Linked pages
- pre-commit https://pre-commit.com 96 comments
- [2212.14034] Cramming: Training a Language Model on a Single GPU in One Day https://arxiv.org/abs/2212.14034 25 comments
- GitHub - HazyResearch/flash-attention: Fast and memory-efficient exact attention https://github.com/HazyResearch/flash-attention 3 comments
Related searches:
Search title: GitHub - JonasGeiping/cramming: Cramming the training of a (BERT-type) language model into limited compute.