[2501.17161] SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training - discu.eu

Hacker News

SFT Memorizes, RL Generalizes: Comparative Study of Foundation Model Post-Training https://arxiv.org/abs/2501.17161 0 comments 31/1/2025

Reddit

"SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training", Chu et al 2025 https://arxiv.org/abs/2501.17161 4 comments 1/2/2025 reinforcementlearning

Linking pages

DeepSeek-R1's strong zero-shot performance on agentic tasks - Martin Krasser's Blog http://krasserm.github.io/2025/02/05/deepseek-r1-agent/ 0 comments
