Hacker News
- How to scale LLMs better with an alternative to transformers https://hazyresearch.stanford.edu/blog/2023-07-25-m2-bert 31 comments
Linking pages
- RWKV: Reinventing RNNs for the Transformer Era — with Eugene Cheah of UIlicious https://www.latent.space/p/rwkv#%C2%A7the-eleuther-mafia 66 comments
- State-space LLMs: Do we need Attention? https://www.interconnects.ai/p/llms-beyond-attention 1 comment
- GitHub - HazyResearch/aisys-building-blocks: Building blocks for foundation models. https://github.com/HazyResearch/aisys-building-blocks 1 comment