Hacker News
- Strangely, Matrix Multiplications on GPUs Run Faster When Given "Predictable" Data! https://www.thonking.ai/p/strangely-matrix-multiplications 2 comments
Linking pages
- 100k H100 Clusters: Power, Network Topology, Ethernet vs InfiniBand, Reliability, Failures, Checkpointing https://www.semianalysis.com/p/100000-h100-clusters-power-network 3 comments
- How To Write A Fast Matrix Multiplication From Scratch With Tensor Cores | Alex Armbruster https://alexarmbr.github.io/2024/08/10/How-To-Write-A-Fast-Matrix-Multiplication-From-Scratch-With-Tensor-Cores.html 0 comments
- FireAttention V3: Enabling AMD as a Viable Alternative for GPU Inference https://fireworks.ai/blog/fireattention-v3 0 comments
- Outperforming cuBLAS on H100: a Worklog https://cudaforfun.substack.com/p/outperforming-cublas-on-h100-a-worklog 0 comments