Linking pages
- Emulating Humans with NSFW Chatbots - with Jesse Silver https://www.latent.space/p/nsfw-chatbots 1 comment
- Research Papers in January 2024 - by Sebastian Raschka, PhD https://magazine.sebastianraschka.com/p/research-papers-in-january-2024 0 comments
- Decoder-Only Transformers: The Workhorse of Generative LLMs https://cameronrwolfe.substack.com/p/decoder-only-transformers-the-workhorse 0 comments
Linked pages
- character.ai https://beta.character.ai 179 comments
- [2401.04088] Mixtral of Experts https://arxiv.org/abs/2401.04088 151 comments
- https://twitter.com/lmsysorg/status/1750921228012122526 91 comments
- 🦅 Eagle 7B : Soaring past Transformers with 1 Trillion Tokens Across 100+ Languages https://blog.rwkv.com/p/eagle-7b-soaring-past-transformers 81 comments
- New embedding models and API updates https://openai.com/blog/new-embedding-models-and-api-updates 79 comments
- [2311.03099] Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch https://arxiv.org/abs/2311.03099 70 comments
- [2203.05482] Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time https://arxiv.org/abs/2203.05482 14 comments
- chargoddard/llama2-22b · Hugging Face https://huggingface.co/chargoddard/llama2-22b 1 comment
- waifu-research-department (The Waifu Research Department) https://huggingface.co/waifu-research-department 1 comment
- PyTorch 1.6 now includes Stochastic Weight Averaging | PyTorch https://pytorch.org/blog/pytorch-1.6-now-includes-stochastic-weight-averaging/ 0 comments
- Loss Landscape | A.I deep learning explorations of morphology & dynamics https://losslandscape.com/ 0 comments
- Andrew Gordon Wilson https://cims.nyu.edu/~andrewgw/ 0 comments
- Techno-Optimist vs AI Doomer: Social Fabric And Media https://www.semianalysis.com/p/ai-doomer-vs-techno-optimist-social 0 comments
- https://arxiv.org/abs/2310.11564 0 comments
- Perfecting Merge-kit MoE's - Google Docs https://docs.google.com/document/d/1_vOftBnrk9NRk5h10UqrfJ5CDih9KBKL61yvrZtVWPE/edit 0 comments
- GitHub - cg123/mergekit: Tools for merging pretrained large language models. https://github.com/cg123/mergekit 0 comments
- GitHub - Mihaiii/llm_steer: Steer LLM outputs towards a certain topic/subject and enhance response capabilities using activation engineering by adding steering vectors https://github.com/Mihaiii/llm_steer 0 comments
- Google’s Lumiere brings AI video closer to real than unreal - The Verge https://www.theverge.com/2024/1/27/24052140/google-lumiere-ai-video-generation-runway-pika 0 comments
- Google Colab https://colab.research.google.com/drive/1obulZ1ROXHjYLn6PPZJwRR6GzgQogxxb?usp=sharing 0 comments
Related searches:
Search whole site: site:interconnects.ai
Search title: Model merging lessons in The Waifu Research Department