GitHub - openai/simple-evals - discu.eu

Linked pages

Introducing the next generation of Claude \ Anthropic https://www.anthropic.com/news/claude-3-family 704 comments
Introducing Meta Llama 3: The most capable openly available LLM to date https://ai.meta.com/blog/meta-llama-3/ 19 comments
GitHub - openai/evals https://github.com/openai/evals 16 comments
[2107.03374] Evaluating Large Language Models Trained on Code https://arxiv.org/abs/2107.03374 8 comments
Gemini API Pricing | Google AI for Developers | Google for Developers https://ai.google.dev/pricing 5 comments
[2009.03300] Measuring Massive Multitask Language Understanding https://arxiv.org/abs/2009.03300 0 comments
[2103.03874] Measuring Mathematical Problem Solving With the MATH Dataset https://arxiv.org/abs/2103.03874 0 comments
[1903.00161] DROP: A Reading Comprehension Benchmark Requiring Discrete Reasoning Over Paragraphs https://arxiv.org/abs/1903.00161 0 comments
[2311.12022] GPQA: A Graduate-Level Google-Proof Q&A Benchmark https://arxiv.org/abs/2311.12022 0 comments
