Linking pages
- Mistral 7B | Mistral AI | Open source models https://mistral.ai/news/announcing-mistral-7b/ 618 comments
- Jukebox https://openai.com/blog/jukebox/ 130 comments
- How GPT3 Works - Visualizations and Animations – Jay Alammar – Visualizing machine learning one concept at a time. https://jalammar.github.io/how-gpt3-works-visualizations-animations/ 109 comments
- The Transformer Family Version 2.0 | Lil'Log https://lilianweng.github.io/posts/2023-01-27-the-transformer-family-v2/ 46 comments
- Generating music in the waveform domain – Sander Dieleman https://sander.ai/2020/03/24/audio-generation.html 41 comments
- 10 Noteworthy AI Research Papers of 2023 https://magazine.sebastianraschka.com/p/10-ai-research-papers-2023 24 comments
- GitHub - JUSTSUJAY/ML-Research-Papers https://github.com/JUSTSUJAY/ML-Research-Papers 10 comments
- Generative Modeling with Sparse Transformers https://openai.com/blog/sparse-transformer/ 9 comments
- Aman's AI Journal • Primers • Overview of Large Language Models https://aman.ai/primers/ai/LLM/ 1 comment
- GitHub - amrzv/awesome-colab-notebooks: Collection of google colaboratory notebooks for fast and easy experiments https://github.com/amrzv/awesome-colab-notebooks 0 comments
- NLP Newsletter #10 [EN]: Improving Reproducibility in ML, Privacy and Security in NLP, XTREME, Longformer, VilBERT, exBERT,… – DAIR.AI https://dair.ai/NLP_Newsletter_10_en/ 0 comments
- Generating music in the waveform domain – Sander Dieleman https://benanne.github.io/2020/03/24/audio-generation.html 0 comments
- A Survey of Long-Term Context in Transformers https://www.pragmatic.ml/a-survey-of-methods-for-incorporating-long-term-context/ 0 comments
- OpenAI and the road to text-guided image generation: DALL·E, CLIP, GLIDE, DALL·E 2 (unCLIP) | by Grigory Sapunov | Intento https://blog.inten.to/openai-and-the-road-to-text-guided-image-generation-dall-e-clip-glide-dall-e-2-unclip-c6e28f7194ea?gi=53c11ab07fab 0 comments
- GPT-3: Language Models are Few-Shot Learners | by Grigory Sapunov | Intento https://blog.inten.to/gpt-3-language-models-are-few-shot-learners-a13d1ae8b1f9 0 comments
- Speeding up BERT. How to make BERT models faster | by Grigory Sapunov | Intento https://blog.inten.to/speeding-up-bert-5528e18bb4ea 0 comments
- GitHub - tomohideshibata/BERT-related-papers: BERT-related papers https://github.com/tomohideshibata/BERT-related-papers 0 comments
- OpenAI Sparse Transformer Improves Predictable Sequence Length by 30x | by Synced | SyncedReview | Medium https://medium.com/syncedreview/openai-sparse-transformer-improves-predictable-sequence-length-by-30x-5a65ef2592b9 0 comments
- Transformer Taxonomy (the last lit review) | kipply's blog https://kipp.ly/blog/transformer-taxonomy/ 0 comments
- How does GPT-3 spend its 175B parameters? - by Robert Huben https://aizi.substack.com/p/how-does-gpt-3-spend-its-175b-parameters 0 comments
Search title: [1904.10509] Generating Long Sequences with Sparse Transformers