Hacker News
- DeepSeek-R1's strong zero-shot performance on agentic tasks http://krasserm.github.io/2025/02/05/deepseek-r1-agent/ 0 comments
Linked pages
- [2501.12948] DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning https://arxiv.org/abs/2501.12948 1057 comments
- Introducing deep research https://openai.com/index/introducing-deep-research/ 422 comments
- o1 isn’t a chat model (and that’s the point) https://www.latent.space/p/o1-skill-issue 145 comments
- Disqus – The #1 way to build your audience https://disqus.com 32 comments
- deepseek-ai/DeepSeek-R1 · Hugging Face https://huggingface.co/deepseek-ai/DeepSeek-R1 6 comments
- [2501.17161] SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training https://arxiv.org/abs/2501.17161 4 comments
- Introducing smolagents: simple agents that write actions in code. https://huggingface.co/blog/smolagents 2 comments
- GitHub - gradion-ai/freeact: freeact is a lightweight library for code-action based agents https://github.com/gradion-ai/freeact 2 comments
- Open-source DeepResearch – Freeing our search agents https://huggingface.co/blog/open-deep-research 1 comment
- [2501.18585] Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs https://arxiv.org/abs/2501.18585 0 comments