Hacker News
- Retentive Network: A Successor to Transformer for Large Language Models https://arxiv.org/abs/2307.08621 19 comments
- Retentive Network: A Successor to Transformer for Large Language Models https://arxiv.org/abs/2307.08621 3 comments
- [Project] Unofficial implementation of Retentive Network (GitHub repo) https://arxiv.org/abs/2307.08621 13 comments machinelearning
Linking pages
- RWKV: Reinventing RNNs for the Transformer Era — with Eugene Cheah of UIlicious https://www.latent.space/p/rwkv#%C2%A7the-eleuther-mafia 66 comments
- Vineeth's Blog https://vineeth.io/posts/2023/new-age-of-magic/ 46 comments
- Ahead of AI #11: New Foundation Models https://magazine.sebastianraschka.com/p/ahead-of-ai-11-new-foundation-models 34 comments
- GitHub - Jamie-Stirling/RetNet: An implementation of "Retentive Network: A Successor to Transformer for Large Language Models" https://github.com/Jamie-Stirling/RetNet 3 comments
- Absolute Unit NNs: Regression-Based MLPs for Everything · Gwern.net https://gwern.net/aunn 3 comments
- [AI] Retnet Model | Research https://latte4me.com/retnet-model/ 2 comments
- GitHub - HazyResearch/aisys-building-blocks: Building blocks for foundation models. https://github.com/HazyResearch/aisys-building-blocks 1 comment
- Mamba: Linear-Time Sequence Modeling with Selective State Spaces https://gonzoml.substack.com/p/mamba-linear-time-sequence-modeling 0 comments
- Optimizing Distributed Training on Frontier for Large Language Models https://gonzoml.substack.com/p/optimizing-distributed-training-on 0 comments