Hacker News
- A stack of feed-forward layers does surprisingly well on ImageNet https://arxiv.org/abs/2105.02723 25 comments
Linking pages
- GitHub - cmhungsteve/Awesome-Transformer-Attention: An ultimately comprehensive paper list of Vision Transformer/Attention, including papers, codes, and related websites https://github.com/cmhungsteve/Awesome-Transformer-Attention 13 comments
- Gradient Update #1: FBI Usage of Facial Recognition and Rotary Embeddings For Large LM's https://thegradientpub.substack.com/p/update-1-fbi-usage-of-facial-recognition 0 comments
Related searches:
Search title: [2105.02723] Do You Even Need Attention? A Stack of Feed-Forward Layers Does Surprisingly Well on ImageNet