- Why most AI benchmarks tell us so little https://techcrunch.com/2024/03/07/heres-why-most-ai-benchmarks-tell-us-so-little/ 4 comments artificial
Linking pages
- Snowflake releases a flagship generative AI model of its own | TechCrunch https://techcrunch.com/2024/04/24/snowflake-releases-a-flagship-generative-ai-model-of-its-own/ 1 comment
- This Week in AI: Midjourney bets it can beat the copyright police | TechCrunch https://techcrunch.com/2024/03/16/this-week-in-ai-midjourney-bets-it-can-beat-the-copyright-police/ 0 comments
- Google Gemini unexpectedly surges to No. 1, over OpenAI, but benchmarks don't tell the whole story | VentureBeat https://venturebeat.com/ai/google-gemini-unexpectedly-surges-to-no-1-over-openai-but-benchmarks-dont-tell-the-whole-story/ 0 comments
Linked pages
- OpenAI built a text generator so good, it's considered too dangerous to release | TechCrunch https://techcrunch.com/2019/02/17/openai-text-generator-dangerous/ 59 comments
- HellaSwag or HellaBad? 36% of this popular LLM benchmark contains errors https://www.surgehq.ai/blog/hellaswag-or-hellabad-36-of-this-popular-llm-benchmark-contains-errors 14 comments
- Anthropic launches Claude, a chatbot to rival OpenAI's ChatGPT | TechCrunch https://techcrunch.com/2023/03/14/anthropic-launches-claude-a-chatbot-to-rival-openais-chatgpt/ 1 comment
- Okay, the GPT-3 hype seems pretty reasonable • TechCrunch https://techcrunch.com/2021/03/17/okay-the-gpt-3-hype-seems-pretty-reasonable/ 0 comments
- Meta releases Llama 2, a more 'helpful' set of text-generating models | TechCrunch https://techcrunch.com/2023/07/18/meta-releases-llama-2-a-more-helpful-set-of-text-generating-models/ 0 comments
- We tested Anthropic's new chatbot -- and came away a bit disappointed | TechCrunch https://techcrunch.com/2024/03/07/we-tested-anthropics-new-chatbot-and-came-away-a-bit-disappointed/ 0 comments