CNN vs. Vision Transformer: A Practitioner’s Guide to Selecting the Right Model | Tobias’ blog - discu.eu

Reddit

CNN vs. Vision Transformer: A Practitioner's Guide to Selecting the Right Model https://tobiasvanderwerff.github.io/2024/05/15/cnn-vs-vit.html 9 comments 17/5/2024 computervision

Linked pages

[1706.03762] Attention Is All You Need https://arxiv.org/abs/1706.03762 145 comments
[2103.14030] Swin Transformer: Hierarchical Vision Transformer using Shifted Windows https://arxiv.org/abs/2103.14030 20 comments
Large Transformer Model Inference Optimization | Lil'Log https://lilianweng.github.io/posts/2023-01-10-inference-optimization/ 20 comments
[2111.05464] Are Transformers More Robust Than CNNs? https://arxiv.org/abs/2111.05464 9 comments
[2105.10497] Intriguing Properties of Vision Transformers https://arxiv.org/abs/2105.10497 6 comments
[2111.06377] Masked Autoencoders Are Scalable Vision Learners https://arxiv.org/abs/2111.06377 5 comments
[2201.03545] A ConvNet for the 2020s https://arxiv.org/abs/2201.03545 5 comments
[1902.07208] Transfusion: Understanding Transfer Learning for Medical Imaging https://arxiv.org/abs/1902.07208 0 comments
[2106.04560] Scaling Vision Transformers https://arxiv.org/abs/2106.04560 0 comments
[1912.11370] Big Transfer (BiT): General Visual Representation Learning https://arxiv.org/abs/1912.11370 0 comments
[1811.08883] Rethinking ImageNet Pre-training https://arxiv.org/abs/1811.08883 0 comments
[2105.07581] Vision Transformers are Robust Learners https://arxiv.org/abs/2105.07581 0 comments
[2010.11929] An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale https://arxiv.org/abs/2010.11929 0 comments
