Hacker News
- TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters https://arxiv.org/abs/2410.23168 (32 comments)
- [R] TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters https://arxiv.org/abs/2410.23168 (5 comments, machinelearning)