Vision-Language models: towards multi-modal deep learning

Linking pages

How diffusion models work: the math from scratch | AI Summer https://theaisummer.com/diffusion-models/ 10 comments
GitHub - vlgiitr/DL_Topics: List of DL topics and resources essential for cracking interviews https://github.com/vlgiitr/DL_Topics 1 comment

Linked pages

DALL·E: Creating Images from Text https://openai.com/blog/dall-e/ 461 comments
[2112.10741] GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models https://arxiv.org/abs/2112.10741 29 comments
[1810.04805] BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding https://arxiv.org/abs/1810.04805 25 comments
[2103.14030] Swin Transformer: Hierarchical Vision Transformer using Shifted Windows https://arxiv.org/abs/2103.14030 20 comments
What are Diffusion Models? | Lil'Log https://lilianweng.github.io/posts/2021-07-11-diffusion-models/ 18 comments
CLIP: Connecting Text and Images https://openai.com/blog/clip/ 15 comments
[2108.07258] On the Opportunities and Risks of Foundation Models https://arxiv.org/abs/2108.07258 11 comments
Diffusion Models https://lilianweng.github.io/lil-log/2021/07/11/diffusion-models.html 10 comments
The theory behind Latent Variable Models: formulating a Variational Autoencoder | AI Summer https://theaisummer.com/latent-variable-models/#variational-autoencoders 4 comments
[2111.11432] Florence: A New Foundation Model for Computer Vision https://arxiv.org/abs/2111.11432 2 comments
[1609.02200] Discrete Variational Autoencoders http://arxiv.org/abs/1609.02200 1 comment
[2101.00529] VinVL: Revisiting Visual Representations in Vision-Language Models https://arxiv.org/abs/2101.00529 0 comments
Faster R-CNN Explained for Object Detection Tasks | Paperspace Blog https://blog.paperspace.com/faster-r-cnn-explained-object-detection/ 0 comments
How Attention works in Deep Learning: understanding the attention mechanism in sequence models | AI Summer https://theaisummer.com/attention/ 0 comments
[2102.12092] Zero-Shot Text-to-Image Generation https://arxiv.org/abs/2102.12092 0 comments
How Transformers work in deep learning and NLP: an intuitive introduction | AI Summer https://theaisummer.com/transformer/ 0 comments