[2305.14342] Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training - discu.eu

Hacker News

Sophia: Scalable Stochastic 2nd-Order Optimizer for Language Model Pre-Training
https://arxiv.org/abs/2305.14342
2 comments, 7/4/2024

Reddit

[R] Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training
https://arxiv.org/abs/2305.14342
6 comments, 26/5/2023, machinelearning
