GitHub - sail-sg/understand-r1-zero: Understanding R1-Zero-Like Training: A Critical Perspective - discu.eu

Hacker News

Understanding R1-Zero-Like Training: A Critical Perspective https://github.com/sail-sg/understand-r1-zero 21 comments 22/3/2025

Linking pages

Sea AI Lab Researchers Introduce Dr. GRPO: A Bias-Free Reinforcement Learning Method that Enhances Math Reasoning Accuracy in Large Language Models Without Inflating Responses - MarkTechPost https://www.marktechpost.com/2025/03/22/sea-ai-lab-researchers-introduce-dr-grpo-a-bias-free-reinforcement-learning-method-that-enhances-math-reasoning-accuracy-in-large-language-models-without-inflating-responses/ 1 comment
