Linked pages
- Introducing the next generation of Claude \ Anthropic https://www.anthropic.com/news/claude-3-family 704 comments
- Introducing Meta Llama 3: The most capable openly available LLM to date https://ai.meta.com/blog/meta-llama-3/ 19 comments
- GitHub - openai/evals https://github.com/openai/evals 16 comments
- [2107.03374] Evaluating Large Language Models Trained on Code https://arxiv.org/abs/2107.03374 8 comments
- Gemini API Pricing | Google AI for Developers | Google for Developers https://ai.google.dev/pricing 5 comments
- [2009.03300] Measuring Massive Multitask Language Understanding https://arxiv.org/abs/2009.03300 0 comments
- [2103.03874] Measuring Mathematical Problem Solving With the MATH Dataset https://arxiv.org/abs/2103.03874 0 comments
- [1903.00161] DROP: A Reading Comprehension Benchmark Requiring Discrete Reasoning Over Paragraphs https://arxiv.org/abs/1903.00161 0 comments
- [2311.12022] GPQA: A Graduate-Level Google-Proof Q&A Benchmark https://arxiv.org/abs/2311.12022 0 comments
Related searches:
Search whole site: site:github.com
Search title: GitHub - openai/simple-evals