GitHub - PanQiWei/AutoGPTQ: An easy-to-use LLMs quantization package with user-friendly apis, based on GPTQ algorithm. - discu.eu

Linking pages

GitHub - QwenLM/Qwen: The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud. https://github.com/QwenLM/Qwen 51 comments
GitHub - oobabooga/text-generation-webui: A Gradio web UI for Large Language Models. https://github.com/oobabooga/text-generation-webui 41 comments
GitHub - toverainc/willow-inference-server: Open source, local, and self-hosted highly optimized language inference server supporting ASR/STT, TTS, and LLM across WebRTC, REST, and WS https://github.com/toverainc/willow-inference-server 13 comments
HQQ quantization https://mobiusml.github.io/hqq_blog/ 2 comments
Announcing GPTQ & GGML Quantized LLM support for Huggingface Transformers - PostgresML https://postgresml.org/blog/announcing-gptq-and-ggml-quantized-llm-support-for-huggingface-transformers 1 comment
GitHub - turboderp/exllama: A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights. https://github.com/turboderp/exllama 0 comments
A practical guide to deploying Large Language Models Cheap, Good *and* Fast https://askdala.substack.com/p/a-pratical-guide-to-deploying-llms 0 comments
GitHub - EleutherAI/lm-evaluation-harness: A framework for few-shot evaluation of autoregressive language models. https://github.com/EleutherAI/lm-evaluation-harness 0 comments
