Linked pages
- [2311.03099] Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch https://arxiv.org/abs/2311.03099 70 comments
- [2305.11206] LIMA: Less Is More for Alignment https://arxiv.org/abs/2305.11206 44 comments
- [1803.03635] The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks https://arxiv.org/abs/1803.03635 32 comments
- Oracle machine - Wikipedia http://en.wikipedia.org/wiki/Oracle_machine 18 comments
- [2203.05482] Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time https://arxiv.org/abs/2203.05482 14 comments
- [1905.11946] EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks https://arxiv.org/abs/1905.11946 10 comments
- [1810.04882] Towards Understanding Linear Word Analogies https://arxiv.org/abs/1810.04882 9 comments
- [1512.03385] Deep Residual Learning for Image Recognition http://arxiv.org/abs/1512.03385 6 comments
- Interpolation - Wikipedia http://en.wikipedia.org/wiki/Interpolation 4 comments
- ImageNet Benchmark (Image Classification) | Papers With Code https://paperswithcode.com/sota/image-classification-on-imagenet 3 comments
- [2111.10050] Combined Scaling for Open-Vocabulary Image Classification https://arxiv.org/abs/2111.10050 2 comments
- upstage/SOLAR-10.7B-v1.0 · Hugging Face https://huggingface.co/upstage/SOLAR-10.7B-v1.0 2 comments
- [1512.00567] Rethinking the Inception Architecture for Computer Vision http://arxiv.org/abs/1512.00567 1 comment
- [2106.04803] CoAtNet: Marrying Convolution and Attention for All Data Sizes https://arxiv.org/abs/2106.04803 1 comment
- [2305.14201] Goat: Fine-tuned LLaMA Outperforms GPT-4 on Arithmetic Tasks https://arxiv.org/abs/2305.14201 1 comment
- [1712.09913] Visualizing the Loss Landscape of Neural Nets https://arxiv.org/abs/1712.09913 0 comments
- [2106.04560] Scaling Vision Transformers https://arxiv.org/abs/2106.04560 0 comments
- [1301.3781] Efficient Estimation of Word Representations in Vector Space https://arxiv.org/abs/1301.3781 0 comments
- [1412.6980] Adam: A Method for Stochastic Optimization http://arxiv.org/abs/1412.6980 0 comments
- Slerp - Wikipedia https://en.wikipedia.org/wiki/Slerp 0 comments
Related searches:
Search whole site: site:cameronrwolfe.substack.com
Search title: Model Merging: A Survey - by Cameron R. Wolfe, Ph.D.