- [P] Noteworthy LLM Research Papers of 2024 (Part Two): July to December https://magazine.sebastianraschka.com/p/ai-research-papers-2024-part-2 0 comments machinelearning
Linked pages
- [2407.21075] Apple Intelligence Foundation Language Models https://arxiv.org/abs/2407.21075 42 comments
- GitHub - deepseek-ai/DeepSeek-V3 https://github.com/deepseek-ai/DeepSeek-V3 40 comments
- Build a Large Language Model (From Scratch): Raschka, Sebastian: 9781633437166: Amazon.com: Books https://www.amazon.com/dp/1633437167/ 38 comments
- Noteworthy AI Research Papers of 2024 (Part One) https://magazine.sebastianraschka.com/p/ai-research-papers-2024-part-1 21 comments
- New LLM Pre-training and Post-training Paradigms https://magazine.sebastianraschka.com/p/new-llm-pre-training-and-post-training 2 comments
- DeepSeek-V3/DeepSeek_V3.pdf at main · deepseek-ai/DeepSeek-V3 · GitHub https://github.com/deepseek-ai/DeepSeek-V3/blob/main/DeepSeek_V3.pdf 2 comments
- [2406.07522] Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling https://arxiv.org/abs/2406.07522 1 comment
- [2408.03314] Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters https://arxiv.org/abs/2408.03314 1 comment
- [2203.15556] Training Compute-Optimal Large Language Models https://arxiv.org/abs/2203.15556 0 comments
- [2407.10671] Qwen2 Technical Report https://arxiv.org/abs/2407.10671 0 comments
- llama-models/models/llama3_1/MODEL_CARD.md at main · meta-llama/llama-models · GitHub https://github.com/meta-llama/llama-models/blob/main/models/llama3_1/MODEL_CARD.md 0 comments
- [2407.21783] The Llama 3 Herd of Models https://arxiv.org/abs/2407.21783 0 comments
- [2408.15237] The Mamba in the Llama: Distilling and Accelerating Hybrid Models https://arxiv.org/abs/2408.15237 0 comments
- [2409.11402] NVLM: Open Frontier-Class Multimodal LLMs https://arxiv.org/abs/2409.11402 0 comments
- Breaking Down The Numbers: How Much Data Does The World Create Daily in 2024? | Edge Delta https://edgedelta.com/company/blog/how-much-data-is-created-per-day 0 comments
- [2410.18982] O1 Replication Journey: A Strategic Progress Report -- Part 1 https://arxiv.org/abs/2410.18982 0 comments
- Understanding Multimodal LLMs - by Sebastian Raschka, PhD https://magazine.sebastianraschka.com/p/understanding-multimodal-llms 0 comments
- [2411.04330] Scaling Laws for Precision https://arxiv.org/abs/2411.04330 0 comments
- [2412.08905] Phi-4 Technical Report https://arxiv.org/abs/2412.08905 0 comments