- [Research] Awesome Paper List of Vision Transformer & Attention https://github.com/cmhungsteve/Awesome-Transformer-Attention 7 comments deeplearning
- [R] Awesome Paper List of Vision Transformer & Attention https://github.com/cmhungsteve/Awesome-Transformer-Attention 6 comments machinelearning
Linking pages
- GitHub - zhimin-z/awesome-awesome-artificial-intelligence: A curated list of awesome curated lists of many topics closely related to artificial intelligence. https://github.com/zhimin-z/awesome-awesome-artificial-intelligence 15 comments
- GitHub - zhimin-z/awesome-awesome-machine-learning: A curated list of awesome curated lists of many topics closely related to machine learning. https://github.com/zhimin-z/awesome-awesome-machine-learning 0 comments
Linked pages
- Imagen: Text-to-Image Diffusion Models https://gweb-research-imagen.appspot.com/ 683 comments
- DALL·E 2 https://openai.com/dall-e-2/ 649 comments
- Imagen Video https://imagen.research.google/video/ 525 comments
- GitHub - microsoft/unilm: Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities https://github.com/microsoft/unilm 104 comments
- GitHub - clovaai/donut: Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022 https://github.com/clovaai/donut 90 comments
- GitHub - sindresorhus/awesome: 😎 Awesome lists about all kinds of interesting topics https://github.com/sindresorhus/awesome 69 comments
- [2204.13807] A very preliminary analysis of DALL-E 2 https://arxiv.org/abs/2204.13807 65 comments
- [2105.01601] MLP-Mixer: An all-MLP Architecture for Vision https://arxiv.org/abs/2105.01601 59 comments
- [2108.08810] Do Vision Transformers See Like Convolutional Neural Networks? https://arxiv.org/abs/2108.08810 43 comments
- Parti: Pathways Autoregressive Text-to-Image Model https://parti.research.google/ 41 comments
- GitHub - THUDM/CogVideo: Text-to-video generation. The repo for the ICLR 2023 paper "CogVideo: Large-scale Pretraining for Text-to-Video Generation via Transformers" https://github.com/THUDM/CogVideo 33 comments
- [2112.10741] GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models https://arxiv.org/abs/2112.10741 29 comments
- [2106.01548] When Vision Transformers Outperform ResNets without Pre-training or Strong Data Augmentations https://arxiv.org/abs/2106.01548 27 comments
- [2105.08050] Pay Attention to MLPs https://arxiv.org/abs/2105.08050 25 comments
- [2105.02723] Do You Even Need Attention? A Stack of Feed-Forward Layers Does Surprisingly Well on ImageNet https://arxiv.org/abs/2105.02723 25 comments
- GitHub - PaddlePaddle/PaddleSeg: Easy-to-use image segmentation library with an awesome pre-trained model zoo, supporting a wide range of practical tasks in Semantic Segmentation, Interactive Segmentation, Panoptic Segmentation, Image Matting, 3D Segmentation, etc. https://github.com/PaddlePaddle/PaddleSeg 23 comments
- [2103.14030] Swin Transformer: Hierarchical Vision Transformer using Shifted Windows https://arxiv.org/abs/2103.14030 20 comments
- Phenaki https://phenaki.video/ 19 comments
- [2205.01917] CoCa: Contrastive Captioners are Image-Text Foundation Models https://arxiv.org/abs/2205.01917 14 comments
- [2103.03206] Perceiver: General Perception with Iterative Attention https://arxiv.org/abs/2103.03206 14 comments