Hacker News
- Understanding Llama 2 and the New Code Llama LLMs https://magazine.sebastianraschka.com/p/ahead-of-ai-11-new-foundation-models 34 comments
Linking pages
- AI and Open Source in 2023 - by Sebastian Raschka, PhD https://magazine.sebastianraschka.com/p/ai-and-open-source-in-2023 67 comments
- How Good Are the Latest Open LLMs? And Is DPO Better Than PPO? https://magazine.sebastianraschka.com/p/how-good-are-the-latest-open-llms 1 comment
- LLMs Hallucinate? - HackerPulse Dispatch https://hackerpulse.substack.com/p/llms-hallucinate 0 comments
Linked pages
- New York Times considers legal action against OpenAI as copyright tensions swirl : NPR https://www.npr.org/2023/08/16/1194202562/new-york-times-considers-legal-action-against-openai-as-copyright-tensions-swirl 426 comments
- Nvidia GPU shortage is 'top gossip' of Silicon Valley | VentureBeat https://venturebeat.com/ai/nvidia-gpu-shortage-is-top-gossip-of-silicon-valley/ 399 comments
- GPT-3.5 Turbo fine-tuning and API updates https://openai.com/blog/gpt-3-5-turbo-fine-tuning-and-api-updates 233 comments
- [2307.09009] How is ChatGPT's behavior changing over time? https://arxiv.org/abs/2307.09009 184 comments
- [2305.13048] RWKV: Reinventing RNNs for the Transformer Era https://arxiv.org/abs/2305.13048 171 comments
- GitHub - karpathy/llama2.c: Inference Llama 2 in pure C, single file, fp32, haha https://github.com/karpathy/llama2.c 167 comments
- [2304.11062] Scaling Transformer to 1M tokens and beyond with RMT https://arxiv.org/abs/2304.11062 153 comments
- Nvidia's new A.I. chip claims it will drop the costs of running LLMs https://www.cnbc.com/2023/08/08/nvidia-reveals-new-ai-chip-says-cost-of-running-large-language-models-will-drop-significantly-.html 111 comments
- [2307.02486] LongNet: Scaling Transformers to 1,000,000,000 Tokens https://arxiv.org/abs/2307.02486 98 comments
- Revealed: The Authors Whose Pirated Books Are Powering Generative AI - The Atlantic https://www.theatlantic.com/technology/archive/2023/08/books3-ai-meta-llama-pirated-books/675063/ 86 comments
- [2101.00027] The Pile: An 800GB Dataset of Diverse Text for Language Modeling https://arxiv.org/abs/2101.00027 81 comments
- GPT-J-6B: 6B JAX-Based Transformer – Aran Komatsuzaki https://arankomatsuzaki.wordpress.com/2021/06/04/gpt-j/ 79 comments
- [2303.17564] BloombergGPT: A Large Language Model for Finance https://arxiv.org/abs/2303.17564 47 comments
- [2307.08621] Retentive Network: A Successor to Transformer for Large Language Models https://arxiv.org/abs/2307.08621 36 comments
- Llama access request form - Meta AI https://ai.meta.com/resources/models-and-libraries/llama-downloads/ 17 comments
- [2108.07258] On the Opportunities and Risks of Foundation Models https://arxiv.org/abs/2108.07258 11 comments
- [2107.03374] Evaluating Large Language Models Trained on Code https://arxiv.org/abs/2107.03374 8 comments
- [2305.18290] Direct Preference Optimization: Your Language Model is Secretly a Reward Model https://arxiv.org/abs/2305.18290 8 comments
- [2307.03718] Frontier AI Regulation: Managing Emerging Risks to Public Safety https://arxiv.org/abs/2307.03718 7 comments
- Fair use - Wikipedia http://en.wikipedia.org/wiki/Fair_use 5 comments