Hacker News
- Ten Noteworthy AI Research Papers of 2023 https://magazine.sebastianraschka.com/p/10-ai-research-papers-2023 19 comments
- [P] Ten Noteworthy AI Research Papers of 2023 https://magazine.sebastianraschka.com/p/10-ai-research-papers-2023 5 comments machinelearning
Linking pages
- 🔥 Top AI Newsletters of 2024 on Substack https://aisupremacy.substack.com/p/top-ai-newsletters-of-2024-on-substack 0 comments
- The Four Wars of the AI Stack (Dec 2023 Recap) https://www.latent.space/p/dec-2023 0 comments
- The Four Wars of the AI Stack (Dec 2023 Recap) https://www.latent.space/i/140396949/mixtral-sparks-a-gpuinference-war 0 comments
Linked pages
- Mixtral of experts | Mistral AI | Open source models https://mistral.ai/news/mixtral-of-experts/ 300 comments
- [2305.14314] QLoRA: Efficient Finetuning of Quantized LLMs https://arxiv.org/abs/2305.14314 129 comments
- [1701.06538] Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer https://arxiv.org/abs/1701.06538 125 comments
- [2310.06825] Mistral 7B https://arxiv.org/abs/2310.06825 124 comments
- Phi-2: The surprising power of small language models - Microsoft Research https://www.microsoft.com/en-us/research/blog/phi-2-the-surprising-power-of-small-language-models/ 121 comments
- [2305.15717] The False Promise of Imitating Proprietary LLMs https://arxiv.org/abs/2305.15717 119 comments
- [2306.11644] Textbooks Are All You Need https://arxiv.org/abs/2306.11644 106 comments
- [2311.11045] Orca 2: Teaching Small Language Models How to Reason https://arxiv.org/abs/2311.11045 81 comments
- AI and Open Source in 2023 - by Sebastian Raschka, PhD https://magazine.sebastianraschka.com/p/ai-and-open-source-in-2023 67 comments
- [2309.05463] Textbooks Are All You Need II: phi-1.5 technical report https://arxiv.org/abs/2309.05463 65 comments
- GitHub - QwenLM/Qwen: The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud. https://github.com/QwenLM/Qwen 51 comments
- [2303.17564] BloombergGPT: A Large Language Model for Finance https://arxiv.org/abs/2303.17564 47 comments
- [2305.11206] LIMA: Less Is More for Alignment https://arxiv.org/abs/2305.11206 44 comments
- Practical Tips for Finetuning LLMs Using LoRA (Low-Rank Adaptation) https://magazine.sebastianraschka.com/p/practical-tips-for-finetuning-llms 37 comments
- [2006.16668] GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding https://arxiv.org/abs/2006.16668 35 comments
- Llama access request form - Meta AI https://ai.meta.com/resources/models-and-libraries/llama-downloads/ 17 comments
- LLM Training: RLHF and Its Alternatives https://magazine.sebastianraschka.com/p/llm-training-rlhf-and-its-alternatives 14 comments
- [1506.02640] You Only Look Once: Unified, Real-Time Object Detection http://arxiv.org/abs/1506.02640 8 comments
- [2305.18290] Direct Preference Optimization: Your Language Model is Secretly a Reward Model https://arxiv.org/abs/2305.18290 8 comments
- [2304.01373] Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling https://arxiv.org/abs/2304.01373 7 comments