[2109.08668] Primer: Searching for Efficient Transformers for Language Modeling - discu.eu

Reddit

[R] Primer: Searching for Efficient Transformers for Language Modeling. “We use evolution to design a new Transformer variant, called Primer. Primer has a better scaling law, and is 3X to 4X faster for training than Transformer for language modeling.” https://arxiv.org/abs/2109.08668 18 comments 21/9/2021 machinelearning

Linking pages

Google Research: Themes from 2021 and Beyond – Google AI Blog https://ai.googleblog.com/2022/01/google-research-themes-from-2021-and.html 52 comments
GitHub - lucidrains/x-transformers: A simple but complete full-attention transformer with a set of promising experimental features from various papers https://github.com/lucidrains/x-transformers 40 comments
Good News About the Carbon Footprint of Machine Learning Training – Google AI Blog https://ai.googleblog.com/2022/02/good-news-about-carbon-footprint-of.html 0 comments

