Hacker News
- [R] DeepNet: Scaling Transformers to 1,000 Layers https://arxiv.org/abs/2203.00555 9 comments machinelearning
Linking pages
- GitHub - microsoft/unilm: Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities https://github.com/microsoft/unilm 104 comments
- GitHub - borisdayma/dalle-mini: DALL·E Mini - Generate images from a text prompt https://github.com/borisdayma/dalle-mini 11 comments
- Microsoft Asia Researchers Scale Transformers to 1,000 Layers; BYD EV Armed with Baidu Self-Driving Tech; Future NLP Challenges | by Recode China AI | Medium https://recodechinaai.medium.com/microsoft-asia-researchers-scale-transformers-to-1-000-layers-byd-ev-armed-with-baidu-self-driving-e39a1ef6e5df 1 comment
- borisdayma/dalle-mini – Run with an API on Replicate https://replicate.com/borisdayma/dalle-mini 0 comments
- Transformer Taxonomy (the last lit review) | kipply's blog https://kipp.ly/blog/transformer-taxonomy/ 0 comments
- GitHub - RUCAIBox/LLMSurvey: The official GitHub page for the survey paper "A Survey of Large Language Models". https://github.com/RUCAIBox/LLMSurvey 0 comments
- GitHub - AIoT-MLSys-Lab/Efficient-LLMs-Survey: Efficient Large Language Models: A Survey https://github.com/AIoT-MLSys-Lab/Efficient-LLMs-Survey 0 comments