- [D] DeepSeek distillation and training costs https://arxiv.org/html/2412.19437v1 42 comments machinelearning
Linking pages
- How China’s DeepSeek AI Chatbot Became an Overnight Success - The Atlantic https://www.theatlantic.com/technology/archive/2025/01/deepseek-china-ai/681481/ 599 comments
- DeepSeek’s AI is bad for OpenAI and NVIDIA. But it might be great for you. | Vox https://www.vox.com/technology/397330/deepseek-openai-chatgpt-gemini-nvidia-china 412 comments
- Dario Amodei — On DeepSeek and Export Controls https://darioamodei.com/on-deepseek-and-export-controls 103 comments
- Chinese AI App DeepSeek Soars in Popularity, Startling Rivals | WIRED https://www.wired.com/story/deepseek-app-popular-viral/ 9 comments
- Novus Ordo Seclorum - by Dean W. Ball - Hyperdimensional https://www.hyperdimensional.co/p/novus-ordo-seclorum 1 comment
- DeepSeek AI Surpasses ChatGPT As Most Downloaded App, Challenges Nvidia With Cost-Effective Technology https://techcrawlr.com/deepseek-ai-surpasses-chatgpt-as-most-downloaded-app-challenges-nvidia-with-cost-effective-technology/ 0 comments
- What DeepSeek's AI Did That Everyone Else's Didn't https://gizmodo.com/what-deepseeks-ai-did-that-everyone-else-didnt-2000555731 0 comments
- DeepSeek vs conspiracies - Tereza Tizkova https://terezatizkova.substack.com/p/deepseek-vs-conspiracies 0 comments
- I don’t believe DeepSeek crashed Nvidia’s stock https://www.understandingai.org/p/i-dont-believe-deepseek-crashed-nvidias 0 comments
- DeepSeek: What lies under the bonnet of the new AI chatbot? https://www.bbc.com/future/article/20250131-what-does-deepseeks-new-app-mean-for-the-future-of-ai 0 comments
Linked pages
- https://openai.com/index/hello-gpt-4o/ 2481 comments
- Introducing Gemini 1.5, Google's next-generation AI model https://blog.google/technology/ai/google-gemini-next-generation-model-february-2024/ 715 comments
- Introducing Claude 3.5 Sonnet \ Anthropic https://www.anthropic.com/news/claude-3-5-sonnet 289 comments
- Cheaper, Better, Faster, Stronger | Mistral AI | Frontier AI in your hands https://mistral.ai/news/mixtral-8x22b/ 243 comments
- Codeforces http://codeforces.com/ 103 comments
- https://openai.com/index/introducing-simpleqa/ 84 comments
- GitHub - deepseek-ai/DeepSeek-V3 https://github.com/deepseek-ai/DeepSeek-V3 40 comments
- Qwen2.5: A Party of Foundation Models! | Qwen https://qwenlm.github.io/blog/qwen2.5/ 38 comments
- llama3/MODEL_CARD.md at main · meta-llama/llama3 · GitHub https://github.com/meta-llama/llama3/blob/main/MODEL_CARD.md 17 comments
- https://openai.com/index/introducing-swe-bench-verified/ 10 comments
- [2107.03374] Evaluating Large Language Models Trained on Code https://arxiv.org/abs/2107.03374 8 comments
- https://arxiv.org/abs/2101.03961 4 comments
- Home | aider https://aider.chat/ 3 comments
- Introducing Qwen1.5 | Qwen https://qwenlm.github.io/blog/qwen1.5/ 3 comments
- GitHub - NVIDIA/cutlass: CUDA Templates for Linear Algebra Subroutines https://github.com/NVIDIA/cutlass 0 comments
- Blackwell Architecture for Generative AI | NVIDIA https://www.nvidia.com/en-us/data-center/technologies/blackwell-architecture/ 0 comments
- GitHub - openai/simple-evals https://github.com/openai/simple-evals 0 comments
- llama-models/models/llama3_1/MODEL_CARD.md at main · meta-llama/llama-models · GitHub https://github.com/meta-llama/llama-models/blob/main/models/llama3_1/MODEL_CARD.md 0 comments
- openai/MMMLU · Datasets at Hugging Face https://huggingface.co/datasets/openai/MMMLU 0 comments
- GitHub - NVIDIA/TransformerEngine: A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs, to provide better performance with lower memory utilization in both training and inference. https://github.com/NVIDIA/TransformerEngine 0 comments