Hacker News
- Faster and Smaller Whisper: A Deep Dive into Quantization and Torch Compilation https://mobiusml.github.io/whisper-static-cache-blog/ 0 comments
- Towards 1-bit Machine Learning Models https://mobiusml.github.io/1bit_blog/ 157 comments
- Low-Rank Pruning of Llama2 https://mobiusml.github.io/low-rank-llama2/ 3 comments