Linking pages
- AI leaderboards are no longer useful. It's time to switch to Pareto curves. https://www.aisnakeoil.com/p/ai-leaderboards-are-no-longer-useful 14 comments
- Beat GPT-4o at Python by searching with 100 dumb LLaMAs | Modal Blog https://modal.com/blog/llama-human-eval 2 comments
- AutoDev: Automated AI-Driven Development https://arxiv.org/html/2403.08299v1 0 comments
- Technical Report: Building Genie https://cosine.sh/blog/genie-technical-report 0 comments
- Just-in-time programming | Riza Blog https://riza.io/blog/just-in-time-programming 0 comments