Hacker News
- Optimizing a WebGPU Matmul Kernel for 1 TFLOP https://zanussbaum.substack.com/p/optimizing-a-webgpu-matmul-kernel 80 comments
Linked pages
- Javascript and the next decade of data programming | Ben Schmidt http://benschmidt.org/post/2020-01-15/2020-01-15-webgpu/ 60 comments
- How to Optimize a CUDA Matmul Kernel for cuBLAS-like Performance: a Worklog https://siboehm.com/articles/22/CUDA-MMM 49 comments
- GitHub - nomic-ai/deepscatter: Zoomable, animated scatterplots in the browser that scales over a billion points https://github.com/nomic-ai/deepscatter 16 comments
- WebGPU Shading Language https://www.w3.org/TR/WGSL/#sync-builtin-functions 8 comments
- Matrix multiplication - Wikipedia https://en.wikipedia.org/wiki/Matrix_multiplication#Outer_product 4 comments
- Loop unrolling - Wikipedia https://en.wikipedia.org/wiki/Loop_unrolling 1 comment
- WebGPU https://www.w3.org/TR/webgpu/ 1 comment
- GitHub - karpathy/micrograd: A tiny scalar-valued autograd engine and a neural net library on top of it with PyTorch-like API https://github.com/karpathy/micrograd 0 comments
- GitHub - 0hq/WebGPT: Run GPT model on the browser with WebGPU. An implementation of GPT inference in less than ~1500 lines of vanilla Javascript. https://github.com/0hq/WebGPT 0 comments
- Ben Schmidt https://benschmidt.org/post/2023-03-07-webGPU-day/ 0 comments
- GitHub - tinygrad/tinygrad: You like pytorch? You like micrograd? You love tinygrad! ❤️ https://github.com/tinygrad/tinygrad 0 comments