Linked pages
- Introducing the next generation of Claude \ Anthropic https://www.anthropic.com/news/claude-3-family 704 comments
- Introducing DBRX: A New State-of-the-Art Open LLM | Databricks https://www.databricks.com/blog/introducing-dbrx-new-state-art-open-llm 343 comments
- [2403.04732] How Far Are We from Intelligent Visual Deductive Reasoning? https://arxiv.org/abs/2403.04732 118 comments
- [2403.05440] Is Cosine-Similarity of Embeddings Really About Similarity? https://arxiv.org/abs/2403.05440 115 comments
- 🦅 Eagle 7B : Soaring past Transformers with 1 Trillion Tokens Across 100+ Languages https://blog.rwkv.com/p/eagle-7b-soaring-past-transformers 81 comments
- [2403.18802] Long-form factuality in large language models https://arxiv.org/abs/2403.18802 76 comments
- [2403.09611] MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training https://arxiv.org/abs/2403.09611 63 comments
- [2403.07815] Chronos: Learning the Language of Time Series https://arxiv.org/abs/2403.07815 60 comments
- [2403.06634] Stealing Part of a Production Language Model https://arxiv.org/abs/2403.06634 51 comments
- [2403.03853] ShortGPT: Layers in Large Language Models are More Redundant Than You Expect https://arxiv.org/abs/2403.03853 25 comments
- [2403.05286] LLM4Decompile: Decompiling Binary Code with Large Language Models https://arxiv.org/abs/2403.05286 16 comments
- LLM Training: RLHF and Its Alternatives https://magazine.sebastianraschka.com/p/llm-training-rlhf-and-its-alternatives 14 comments
- GitHub - xai-org/grok-1: Grok open release https://github.com/xai-org/grok-1 12 comments
- [2305.18290] Direct Preference Optimization: Your Language Model is Secretly a Reward Model https://arxiv.org/abs/2305.18290 8 comments
- GitHub - hpcaitech/Open-Sora: Open-Sora: Democratizing Efficient Video Production for All https://github.com/hpcaitech/Open-Sora 8 comments
- [2403.12173] TnT-LLM: Text Mining at Scale with Large Language Models https://arxiv.org/abs/2403.12173 7 comments
- [2403.18814] Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models https://arxiv.org/abs/2403.18814 7 comments
- [2403.05530] Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context https://arxiv.org/abs/2403.05530 1 comment
- [2403.08763] Simple and Scalable Strategies to Continually Pre-train Large Language Models https://arxiv.org/abs/2403.08763 1 comment
- GitHub - Lightning-AI/lightning-thunder: Source to source compiler for PyTorch. It makes PyTorch programs faster on single accelerators and distributed. https://github.com/Lightning-AI/lightning-thunder 1 comment