Hacker News
- Training Language Models to Self-Correct via Reinforcement Learning https://arxiv.org/abs/2409.12917 92 comments
Linking pages
- GitHub - srush/awesome-o1: A bibliography and survey of the papers surrounding o1 https://github.com/srush/awesome-o1 1 comment
- GitHub - gabrielchua/daily-ai-papers: All credits go to HuggingFace's Daily AI papers (https://huggingface.co/papers) and the research community. 🔉Audio summaries here (https://t.me/daily_ai_papers). https://github.com/gabrielchua/daily-ai-papers 0 comments