Linking pages
- GitHub - QwenLM/Qwen: The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud. https://github.com/QwenLM/Qwen 51 comments
- GitHub - oobabooga/text-generation-webui: A Gradio web UI for Large Language Models. Supports transformers, GPTQ, AWQ, EXL2, llama.cpp (GGUF), Llama models. https://github.com/oobabooga/text-generation-webui 41 comments
- GitHub - toverainc/willow-inference-server: Open source, local, and self-hosted highly optimized language inference server supporting ASR/STT, TTS, and LLM across WebRTC, REST, and WS https://github.com/toverainc/willow-inference-server 13 comments
- HQQ quantization https://mobiusml.github.io/hqq_blog/ 2 comments
- Announcing GPTQ & GGML Quantized LLM support for Huggingface Transformers - PostgresML https://postgresml.org/blog/announcing-gptq-and-ggml-quantized-llm-support-for-huggingface-transformers 1 comment
- GitHub - turboderp/exllama: A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights. https://github.com/turboderp/exllama 0 comments
- A practical guide to deploying Large Language Models Cheap, Good *and* Fast https://askdala.substack.com/p/a-pratical-guide-to-deploying-llms 0 comments
- GitHub - EleutherAI/lm-evaluation-harness: A framework for few-shot evaluation of autoregressive language models. https://github.com/EleutherAI/lm-evaluation-harness 0 comments
Linked pages
- GitHub Star History https://star-history.com/#microsoft/playwright&cypress-io/cypress&Date 78 comments
- GitHub - huggingface/transformers: 🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX. https://github.com/huggingface/transformers 26 comments
- GitHub - qwopqwop200/GPTQ-for-LLaMa: 4 bits quantization of LLaMa using GPTQ https://github.com/qwopqwop200/GPTQ-for-LLaMa 0 comments
Related searches:
Search whole site: site:github.com
Search title: GitHub - PanQiWei/AutoGPTQ: An easy-to-use LLMs quantization package with user-friendly apis, based on GPTQ algorithm.