Linked pages
- [2305.14314] QLoRA: Efficient Finetuning of Quantized LLMs https://arxiv.org/abs/2305.14314 129 comments
- [2301.00774] SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot https://arxiv.org/abs/2301.00774 128 comments
- [2305.15717] The False Promise of Imitating Proprietary LLMs https://arxiv.org/abs/2305.15717 119 comments
- [2305.02301] Distilling Step-by-Step! Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes https://arxiv.org/abs/2305.02301 56 comments
- GitHub - artidoro/qlora: QLoRA: Efficient Finetuning of Quantized LLMs https://github.com/artidoro/qlora 5 comments
- [2306.00978] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration https://arxiv.org/abs/2306.00978 2 comments
- [2306.03078] SpQR: A Sparse-Quantized Representation for Near-Lossless LLM Weight Compression https://arxiv.org/abs/2306.03078 2 comments
- [2304.14402] LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions https://arxiv.org/abs/2304.14402 1 comment
- GitHub - mbzuai-nlp/LaMini-LM: LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions https://github.com/mbzuai-nlp/LaMini-LM 1 comment
- GitHub - flexflow/FlexFlow at inference https://github.com/flexflow/FlexFlow/tree/inference 1 comment
- GitHub - Vahe1994/SpQR https://github.com/Vahe1994/SpQR 1 comment
- [2306.07629] SqueezeLLM: Dense-and-Sparse Quantization https://arxiv.org/abs/2306.07629 1 comment
- GitHub - qwopqwop200/GPTQ-for-LLaMa: 4 bits quantization of LLaMa using GPTQ https://github.com/qwopqwop200/GPTQ-for-LLaMa 0 comments
- [2210.17323] GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers https://arxiv.org/abs/2210.17323 0 comments
- [2305.16635] Impossible Distillation: from Low-Quality Model to High-Quality Dataset & Model for Summarization and Paraphrasing https://arxiv.org/abs/2305.16635 0 comments
- GitHub - mit-han-lab/llm-awq: AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration https://github.com/mit-han-lab/llm-awq 0 comments
- GitHub - SqueezeAILab/SqueezeLLM: SqueezeLLM: Dense-and-Sparse Quantization https://github.com/SqueezeAILab/SqueezeLLM 0 comments
- [2306.11695] A Simple and Effective Pruning Approach for Large Language Models https://arxiv.org/abs/2306.11695 0 comments
- GitHub - mit-han-lab/smoothquant: [ICML 2023] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models https://github.com/mit-han-lab/smoothquant 0 comments