Linked pages
- GitHub - BlinkDL/RWKV-LM: RWKV is an RNN with transformer-level LLM performance. It can be trained directly like a GPT (parallelizable), so it combines the best of RNNs and transformers: great performance, fast inference, low VRAM use, fast training, "infinite" ctx_len, and free sentence embedding (see the sketch after this list). https://github.com/BlinkDL/RWKV-LM 179 comments
- [2305.13048] RWKV: Reinventing RNNs for the Transformer Era https://arxiv.org/abs/2305.13048 171 comments
- PyTorch http://pytorch.org/ 100 comments
- [2307.08621] Retentive Network: A Successor to Transformer for Large Language Models https://arxiv.org/abs/2307.08621 36 comments
- [2311.01927] GateLoop: Fully Data-Controlled Linear Recurrence for Sequence Modeling https://arxiv.org/abs/2311.01927 23 comments
- [2307.14995] Scaling TransNormer to 175 Billion Parameters https://arxiv.org/abs/2307.14995 22 comments
- GitHub - openai/triton: Development repository for the Triton language and compiler https://github.com/openai/triton 5 comments
- [2102.11174] Linear Transformers Are Secretly Fast Weight Programmers https://arxiv.org/abs/2102.11174 2 comments
- [2310.01655] PolySketchFormer: Fast Transformers via Sketches for Polynomial Kernels https://arxiv.org/abs/2310.01655 1 comment
- Zoology (Blogpost 2): Simple, Input-Dependent, and Sub-Quadratic Sequence Mixers · Hazy Research https://hazyresearch.stanford.edu/blog/2023-12-11-zoology2-based 1 comment
- [2006.16236] Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention https://arxiv.org/abs/2006.16236 0 comments
- GitHub - EleutherAI/lm-evaluation-harness: A framework for few-shot evaluation of autoregressive language models. https://github.com/EleutherAI/lm-evaluation-harness 0 comments
- [2312.06635] Gated Linear Attention Transformers with Hardware-Efficient Training https://arxiv.org/abs/2312.06635 0 comments
- [2404.05892] Eagle and Finch: RWKV with Matrix-Valued States and Dynamic Recurrence https://arxiv.org/abs/2404.05892 0 comments
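A common thread in the pages above (RWKV, RetNet, GLA, linear attention) is that an un-normalized linear-attention layer can be evaluated two equivalent ways: a parallel "transformer-style" form for training and a recurrent "RNN-style" form for O(1)-per-token inference. The sketch below is a minimal illustration of that equivalence only; it is not code from any linked repository, the function names are invented for this example, and the decay/gating terms that RWKV, RetNet, and GLA add on top are deliberately omitted.

```python
# Minimal sketch: parallel vs. recurrent evaluation of un-normalized,
# causal linear attention (no decay or gating, unlike the linked papers).
import torch

def linear_attention_parallel(q, k, v):
    # q, k: (T, d); v: (T, d_v). Causal mask keeps position t from seeing the future.
    T = q.shape[0]
    scores = q @ k.T                                        # (T, T) un-normalized scores
    mask = torch.tril(torch.ones(T, T, dtype=torch.bool))
    scores = scores.masked_fill(~mask, 0.0)
    return scores @ v                                       # (T, d_v)

def linear_attention_recurrent(q, k, v):
    # Same computation as an RNN over a (d, d_v) state matrix:
    # S_t = S_{t-1} + k_t v_t^T,   o_t = q_t S_t
    d, d_v = q.shape[1], v.shape[1]
    state = torch.zeros(d, d_v)
    outputs = []
    for t in range(q.shape[0]):
        state = state + torch.outer(k[t], v[t])             # accumulate k_t v_t^T
        outputs.append(q[t] @ state)                        # o_t = q_t S_t
    return torch.stack(outputs)

if __name__ == "__main__":
    torch.manual_seed(0)
    q, k, v = torch.randn(6, 4), torch.randn(6, 4), torch.randn(6, 8)
    assert torch.allclose(linear_attention_parallel(q, k, v),
                          linear_attention_recurrent(q, k, v), atol=1e-5)
    print("parallel and recurrent forms agree")
```

Both functions compute o_t = sum over s <= t of (q_t . k_s) v_s; the parallel form is what makes GPT-style training feasible, while the recurrent form is what gives the constant-memory, RNN-style inference claimed in the RWKV description above.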