[2203.00555] DeepNet: Scaling Transformers to 1,000 Layers - discu.eu

Hacker News

DeepNet: Scaling Transformers to 1k Layers https://arxiv.org/abs/2203.00555 38 comments 2/3/2022

Reddit

[R] DeepNet: Scaling Transformers to 1,000 Layers https://arxiv.org/abs/2203.00555 9 comments 3/3/2022 machinelearning

Linking pages

GitHub - microsoft/unilm: Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities https://github.com/microsoft/unilm 104 comments
GitHub - borisdayma/dalle-mini: DALL·E Mini - Generate images from a text prompt https://github.com/borisdayma/dalle-mini 11 comments
Microsoft Asia Researchers Scale Transformers to 1,000 Layers; BYD EV Armed with Baidu Self-Driving Tech; Future NLP Challenges | by Recode China AI | Medium https://recodechinaai.medium.com/microsoft-asia-researchers-scale-transformers-to-1-000-layers-byd-ev-armed-with-baidu-self-driving-e39a1ef6e5df 1 comment
borisdayma/dalle-mini – Run with an API on Replicate https://replicate.com/borisdayma/dalle-mini 0 comments
Transformer Taxonomy (the last lit review) | kipply's blog https://kipp.ly/blog/transformer-taxonomy/ 0 comments
GitHub - RUCAIBox/LLMSurvey: The official GitHub page for the survey paper "A Survey of Large Language Models". https://github.com/RUCAIBox/LLMSurvey 0 comments
GitHub - AIoT-MLSys-Lab/Efficient-LLMs-Survey: Efficient Large Language Models: A Survey https://github.com/AIoT-MLSys-Lab/Efficient-LLMs-Survey 0 comments
