Retentive Network: A Successor to Transformer for Large Language Models - discu.eu

Hacker News

Retentive Network: A Successor to Transformer for Large Language Models https://arxiv.org/abs/2307.08621 19 comments 23/7/2023

Retentive Network: A Successor to Transformer for Large Language Models https://arxiv.org/abs/2307.08621 3 comments 18/7/2023

Reddit

[Project] Unofficial implementation of Retentive Network (GitHub repo) https://arxiv.org/abs/2307.08621 13 comments 19/7/2023 machinelearning

Linking pages

RWKV: Reinventing RNNs for the Transformer Era — with Eugene Cheah of UIlicious https://www.latent.space/p/rwkv#%C2%A7the-eleuther-mafia 66 comments
Vineeth's Blog https://vineeth.io/posts/2023/new-age-of-magic/ 46 comments
Ahead of AI #11: New Foundation Models https://magazine.sebastianraschka.com/p/ahead-of-ai-11-new-foundation-models 34 comments
GitHub - Jamie-Stirling/RetNet: An implementation of "Retentive Network: A Successor to Transformer for Large Language Models" https://github.com/Jamie-Stirling/RetNet 3 comments
Absolute Unit NNs: Regression-Based MLPs for Everything · Gwern.net https://gwern.net/aunn 3 comments
[AI] Retnet Model | Research https://latte4me.com/retnet-model/ 2 comments
GitHub - HazyResearch/aisys-building-blocks: Building blocks for foundation models. https://github.com/HazyResearch/aisys-building-blocks 1 comment
Mamba: Linear-Time Sequence Modeling with Selective State Spaces https://gonzoml.substack.com/p/mamba-linear-time-sequence-modeling 0 comments
Optimizing Distributed Training on Frontier for Large Language Models https://gonzoml.substack.com/p/optimizing-distributed-training-on 0 comments
