Hacker News
- Llama 3.1 405B now runs at 969 tokens/s on Cerebras Inference https://cerebras.ai/blog/llama-405b-inference 71 comments
Linking pages
- Cerebras video shows AI writing code 75x faster than world's fastest AI GPU cloud — world's largest chip beats AWS's fastest in head-to-head comparison | Tom's Hardware https://www.tomshardware.com/tech-industry/artificial-intelligence/cerebras-video-shows-ai-writing-code-75x-faster-than-worlds-fastest-ai-gpu-cloud-worlds-largest-chip-beats-awss-fastest-in-head-to-head-comparison 0 comments