Linking pages
- AI leaderboards are no longer useful. It's time to switch to Pareto curves. https://www.aisnakeoil.com/p/ai-leaderboards-are-no-longer-useful 14 comments
- GitHub - alopatenko/LLMEvaluation: A comprehensive guide to LLM evaluation methods designed to assist in identifying the most suitable evaluation techniques for various use cases, promote the adoption of best practices in LLM assessment, and critically assess the effectiveness of these evaluation methods. https://github.com/alopatenko/LLMEvaluation 0 comments
- Short Musings on AI Engineering and "Failed AI Projects" https://www.sh-reya.com/blog/ai-engineering-short/ 0 comments
- Aligning LLM-as-a-Judge with Human Preferences https://blog.langchain.dev/aligning-llm-as-a-judge-with-human-preferences/ 0 comments
- Data Flywheels for LLM Applications https://www.sh-reya.com/blog/ai-engineering-flywheel/ 0 comments
Linked page: [2404.12272] Who Validates the Validators? Aligning LLM-Assisted Evaluation of LLM Outputs with Human Preferences (arxiv.org)