Hacker News
- Nvidia Outperforms GPT-4o with Open Source Model https://github.com/lmarena/arena-hard-auto 3 comments
Linking pages
- 500K+ Evaluations by Neural Magic Show Quantized LLMs Retain Accuracy https://neuralmagic.com/blog/we-ran-over-half-a-million-evaluations-on-quantized-llms-heres-what-we-found/ 2 comments
- GitHub - mlfoundations/evalchemy: Automatic evals for LLMs https://github.com/mlfoundations/evalchemy 0 comments
- GitHub - NovaSky-AI/SkyThought: Sky-T1: Train your own O1 preview model within $450 https://github.com/NovaSky-AI/SkyThought 0 comments
Linked pages
- Git Large File Storage | Git Large File Storage (LFS) replaces large files such as audio samples, videos, datasets, and graphics with text pointers inside Git, while storing the file contents on a remote server like GitHub.com or GitHub Enterprise. https://git-lfs.com/ 2 comments
- Introducing Hard Prompts Category in Chatbot Arena | LMSYS Org https://lmsys.org/blog/2024-05-17-category-hard/ 0 comments
- Does style matter? Disentangling style and substance in Chatbot Arena | LMSYS Org https://lmsys.org/blog/2024-08-28-style-control/ 0 comments