Linking pages
- Mamba Explained | Kola Ayonrinde https://www.kolaayonrinde.com/blog/2024/02/11/mamba.html 93 comments
- 🦅 Eagle 7B : Soaring past Transformers with 1 Trillion Tokens Across 100+ Languages https://blog.rwkv.com/p/eagle-7b-soaring-past-transformers 81 comments
- RWKV: Reinventing RNNs for the Transformer Era — with Eugene Cheah of UIlicious https://www.latent.space/p/rwkv#%C2%A7the-eleuther-mafia 66 comments
- Mamba Explained https://thegradient.pub/mamba-explained/ 44 comments
- 🦅 EagleX 1.7T : Soaring past LLaMA 7B 2T in both English and Multi-lang evals (RWKV-v5) https://substack.recursal.ai/p/eaglex-17t-soaring-past-llama-7b 9 comments
- State-space LLMs: Do we need Attention? https://www.interconnects.ai/p/llms-beyond-attention 1 comment
- FlashAttention 2: making Transformers 800% faster w/o approximation - with Tri Dao of Together AI https://www.latent.space/p/flashattention 0 comments
- 🦅 EagleX v2 : Soaring past LLaMA2 7B in both English and Multi-lang evals (RWKV-v5) https://blog.rwkv.com/p/eaglex-v2-soaring-past-llama2-7b 0 comments
- Incremental Gambits and Premature Endgames | Matthew Lewis https://matthewlewis.xyz/blog/2024/08/29/incremental-gambits-and-premature-endgames.html 0 comments