Hacker News
Linking pages
- Cutting-edge Chinese “reasoning” model rivals OpenAI o1—and it’s free to download - Ars Technica https://arstechnica.com/ai/2025/01/china-is-catching-up-with-americas-best-reasoning-ai-models/ 139 comments
- The 2025 AI Engineering Reading List - Latent Space https://www.latent.space/p/2025-papers 68 comments
- DeepSeek-V3 Technical Report https://arxiv.org/html/2412.19437v1 42 comments
- DeepSeek V3 and the cost of frontier AI models https://www.interconnects.ai/p/deepseek-v3-and-the-actual-cost-of 5 comments
- Is finetuning GPT4o worth it? - Latent Space https://www.latent.space/p/cosine 0 comments
- Gru.ai Ranks First in OpenAI’s Latest SWE-Bench Verified Evaluation https://gru.ai/blog/Gru-Rank-First/ 0 comments
- The new Claude 3.5 Sonnet, Computer Use, and Building SOTA Agents — with Erik Schluntz, Anthropic https://www.latent.space/p/claude-sonnet 0 comments
- AI #97: 4 - by Zvi Mowshowitz - Don't Worry About the Vase https://thezvi.substack.com/p/ai-97-4 0 comments
- Scaling Laws for LLMs: From GPT-3 to o3 https://cameronrwolfe.substack.com/p/llm-scaling-laws 0 comments
- Modal Sandboxes are generally available | Modal Blog https://modal.com/blog/sandbox-launch 0 comments