Hacker News
- DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling https://github.com/deepseek-ai/DeepGEMM 67 comments
Linking pages
- DeepGEMM: DeepSeek Unveils High-Performance Matrix Multiplication Library on Day 3 of Open Source Week https://xyzlabs.substack.com/p/deepgemm-deepseek-unveils-high-performance 0 comments
- Modular: Democratizing AI Compute, Part 6: What about AI compilers (TVM and XLA)? https://www.modular.com/blog/democratizing-ai-compute-part-6-what-about-ai-compilers 0 comments
Linked pages
- GitHub - deepseek-ai/DeepEP: DeepEP: an efficient expert-parallel communication library https://github.com/deepseek-ai/DeepEP 71 comments
- GitHub - deepseek-ai/DeepSeek-V3 https://github.com/deepseek-ai/DeepSeek-V3 40 comments
- PTX ISA :: CUDA Toolkit Documentation https://docs.nvidia.com/cuda/parallel-thread-execution/index.html 4 comments
- GitHub - NVIDIA/cutlass: CUDA Templates for Linear Algebra Subroutines https://github.com/NVIDIA/cutlass 0 comments