- [P] New LLM Pre-training and Post-training Paradigms: Comparing Qwen 2, Llama 3.1, Gemma 2, and Apple's FMs https://magazine.sebastianraschka.com/p/new-llm-pre-training-and-post-training 2 comments machinelearning
Linked pages
- [2407.21075] Apple Intelligence Foundation Language Models https://arxiv.org/abs/2407.21075 42 comments
- https://www.youtube.com/watch?v=kPGTx4wcm_w 12 comments
- LLMs-from-scratch/ch07/04_preference-tuning-with-dpo/dpo-from-scratch.ipynb at main · rasbt/LLMs-from-scratch · GitHub https://github.com/rasbt/LLMs-from-scratch/blob/main/ch07/04_preference-tuning-with-dpo/dpo-from-scratch.ipynb 5 comments
- Build a Large Language Model (From Scratch) https://www.manning.com/books/build-a-large-language-model-from-scratch 0 comments
- Research Papers in January 2024 - by Sebastian Raschka, PhD https://magazine.sebastianraschka.com/p/research-papers-in-january-2024 0 comments
- [2407.10671] Qwen2 Technical Report https://arxiv.org/abs/2407.10671 0 comments
- [2407.21783] The Llama 3 Herd of Models https://arxiv.org/abs/2407.21783 0 comments