Berkeley Function Calling Leaderboard (aka Berkeley Tool Calling Leaderboard) - discu.eu

Linking pages

The 2025 AI Engineering Reading List - Latent Space https://www.latent.space/p/2025-papers 69 comments
AI leaderboards are no longer useful. It's time to switch to Pareto curves. https://www.aisnakeoil.com/p/ai-leaderboards-are-no-longer-useful 14 comments
GitHub - BoundaryML/baml: BAML is a language that helps you get structured data from LLMs, with the best DX possible. Works with all languages. Check out the promptfiddle.com playground https://github.com/BoundaryML/baml 2 comments
GitHub - qx-labs/agents-deep-research: An implementation of iterative deep research using the OpenAI Agents SDK https://github.com/qx-labs/agents-deep-research 2 comments
Top 12 Trending LLM Leaderboards: A Guide to Leading AI Models' Evaluation - MarkTechPost https://www.marktechpost.com/2024/06/02/top-12-trending-llm-leaderboards-a-guide-to-leading-ai-models-evaluation/ 1 comment
GitHub - alopatenko/LLMEvaluation: A comprehensive guide to LLM evaluation methods designed to assist in identifying the most suitable evaluation techniques for various use cases, promote the adoption of best practices in LLM assessment, and critically assess the effectiveness of these evaluation methods. https://github.com/alopatenko/LLMEvaluation 0 comments
aie-book/resources.md at main · chiphuyen/aie-book · GitHub https://github.com/chiphuyen/aie-book/blob/main/resources.md 0 comments

