DeepSeek-R1's strong zero-shot performance on agentic tasks - Martin Krasser's Blog - discu.eu

Hacker News

DeepSeek-R1's strong zero-shot performance on agentic tasks http://krasserm.github.io/2025/02/05/deepseek-r1-agent/ 0 comments 5/2/2025

Linked pages

[2501.12948] DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning https://arxiv.org/abs/2501.12948 1057 comments
https://openai.com/index/introducing-deep-research/ 422 comments
o1 isn’t a chat model (and that’s the point) https://www.latent.space/p/o1-skill-issue 145 comments
Disqus – The #1 way to build your audience https://disqus.com 32 comments
deepseek-ai/DeepSeek-R1 · Hugging Face https://huggingface.co/deepseek-ai/DeepSeek-R1 6 comments
[2501.17161] SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training https://arxiv.org/abs/2501.17161 4 comments
Introducing smolagents: simple agents that write actions in code. https://huggingface.co/blog/smolagents 2 comments
GitHub - gradion-ai/freeact: freeact is a lightweight library for code-action based agents https://github.com/gradion-ai/freeact 2 comments
Open-source DeepResearch – Freeing our search agents https://huggingface.co/blog/open-deep-research 1 comment
[2501.18585] Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs https://arxiv.org/abs/2501.18585 0 comments
