[2112.05682] Self-attention Does Not Need $O(n^2)$ Memory - discu.eu

Reddit

[Discussion] Question on the paper named, SELF-ATTENTION DOES NOT NEED O(n^2) MEMORY from Google. https://arxiv.org/abs/2112.05682 19 comments 17/9/2023 machinelearning

[R] Self-attention Does Not Need $O(n^2)$ Memory https://arxiv.org/abs/2112.05682 18 comments 13/12/2021 machinelearning

Linking pages

GitHub - lucidrains/x-transformers: A simple but complete full-attention transformer with a set of promising experimental features from various papers https://github.com/lucidrains/x-transformers 40 comments
GitHub - lm-sys/FastChat: An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena. https://github.com/lm-sys/FastChat 4 comments
Ring Attention Explained | Coconut Mode https://coconut-mode.com/posts/ring-attention/ 2 comments
GitHub - aqlaboratory/openfold: Trainable, memory-efficient, and GPU-friendly PyTorch reproduction of AlphaFold 2 https://github.com/aqlaboratory/openfold 1 comment
GitHub - HazyResearch/aisys-building-blocks: Building blocks for foundation models. https://github.com/HazyResearch/aisys-building-blocks 1 comment
Google Proposes a ‘Simple Trick’ for Dramatically Reducing Transformers’ (Self-)Attention Memory Requirements | Synced https://syncedreview.com/2021/12/14/deepmind-podracer-tpu-based-rl-frameworks-deliver-exceptional-performance-at-low-cost-165/ 0 comments
Transformer Taxonomy (the last lit review) | kipply's blog https://kipp.ly/blog/transformer-taxonomy/ 0 comments
Five years of progress in GPTs - by Finbarr Timbers https://finbarrtimbers.substack.com/p/five-years-of-progress-in-gpts 0 comments
