Linked pages
- [2305.14314] QLoRA: Efficient Finetuning of Quantized LLMs https://arxiv.org/abs/2305.14314 129 comments
- [2301.00774] SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot https://arxiv.org/abs/2301.00774 128 comments
- [2305.15717] The False Promise of Imitating Proprietary LLMs https://arxiv.org/abs/2305.15717 119 comments
- [2305.02301] Distilling Step-by-Step! Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes https://arxiv.org/abs/2305.02301 56 comments
- GitHub - artidoro/qlora: QLoRA: Efficient Finetuning of Quantized LLMs https://github.com/artidoro/qlora 5 comments
- [2306.00978] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration https://arxiv.org/abs/2306.00978 2 comments
- [2306.03078] SpQR: A Sparse-Quantized Representation for Near-Lossless LLM Weight Compression https://arxiv.org/abs/2306.03078 2 comments
- [2304.14402] LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions https://arxiv.org/abs/2304.14402 1 comment
- GitHub - mbzuai-nlp/LaMini-LM: LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions https://github.com/mbzuai-nlp/LaMini-LM 1 comment
- GitHub - flexflow/FlexFlow at inference https://github.com/flexflow/FlexFlow/tree/inference 1 comment
- GitHub - Vahe1994/SpQR https://github.com/Vahe1994/SpQR 1 comment
- [2306.07629] SqueezeLLM: Dense-and-Sparse Quantization https://arxiv.org/abs/2306.07629 1 comment
- GitHub - qwopqwop200/GPTQ-for-LLaMa: 4 bits quantization of LLaMa using GPTQ https://github.com/qwopqwop200/GPTQ-for-LLaMa 0 comments
- [2210.17323] GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers https://arxiv.org/abs/2210.17323 0 comments
- [2305.16635] Impossible Distillation: from Low-Quality Model to High-Quality Dataset & Model for Summarization and Paraphrasing https://arxiv.org/abs/2305.16635 0 comments
- GitHub - mit-han-lab/llm-awq: AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration https://github.com/mit-han-lab/llm-awq 0 comments
- GitHub - SqueezeAILab/SqueezeLLM: SqueezeLLM: Dense-and-Sparse Quantization https://github.com/SqueezeAILab/SqueezeLLM 0 comments
- [2306.11695] A Simple and Effective Pruning Approach for Large Language Models https://arxiv.org/abs/2306.11695 0 comments
- GitHub - mit-han-lab/smoothquant: [ICML 2023] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models https://github.com/mit-han-lab/smoothquant 0 comments