Linking pages
- Mistral 7B | Mistral AI | Open source models https://mistral.ai/news/announcing-mistral-7b/ 618 comments
- Building a fully local AI smart home assistant | John's Website https://johnthenerd.com/blog/local-llm-assistant/ 186 comments
- 2:4 Sparse Llama: Smaller Models for Efficient GPU Inference https://neuralmagic.com/blog/24-sparse-llama-smaller-models-for-efficient-gpu-inference/ 132 comments
- Hello Qwen2 | Qwen https://qwenlm.github.io/blog/qwen2/ 130 comments
- Accelerating Generative AI with PyTorch II: GPT, Fast | PyTorch https://pytorch.org/blog/accelerating-generative-ai-2/ 69 comments
- GitHub - jzhang38/TinyLlama https://github.com/jzhang38/TinyLlama 60 comments
- vLLM: Easy, Fast, and Cheap LLM Serving with PagedAttention https://vllm.ai/ 42 comments
- What I learned from looking at 900 most popular open source AI tools https://huyenchip.com/2024/03/14/ai-oss.html 40 comments
- Qwen2.5: A Party of Foundation Models! | Qwen https://qwenlm.github.io/blog/qwen2.5/ 38 comments
- GitHub - microsoft/aici: AICI: Prompts as (Wasm) Programs https://github.com/microsoft/aici 36 comments
- GitHub - THUDM/LongWriter: LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs https://github.com/THUDM/LongWriter 29 comments
- GitHub - punica-ai/punica: Serving multiple LoRA finetuned LLM as one https://github.com/punica-ai/punica 26 comments
- GitHub - imoneoi/openchat: OpenChat: Advancing Open-source Language Models with Imperfect Data https://github.com/imoneoi/openchat 25 comments
- Announcing Pixtral 12B | Mistral AI | Frontier AI in your hands https://mistral.ai/news/pixtral-12b/ 25 comments
- GitHub - S-LoRA/S-LoRA: S-LoRA: Serving Thousands of Concurrent LoRA Adapters https://github.com/S-LoRA/S-LoRA 20 comments
- AI on Linux: A Collection of AI Models, LLMs and Chatbots for Linux https://linuxblog.io/ai-on-linux-a-collection-of-ai-models-llms-and-chatbots-for-linux/ 10 comments
- How Can I Be An AI Engineer? - Tim Kellogg https://timkellogg.me/blog/2024/12/09/ai-engineer 8 comments
- GitHub - shreyansh26/LLM-Sampling: A collection of various LLM sampling methods implemented in pure Pytorch https://github.com/shreyansh26/LLM-Sampling 7 comments
- Snowflake Arctic - LLM for Enterprise AI https://www.snowflake.com/blog/arctic-open-efficient-foundation-language-models-snowflake/ 6 comments
- GitHub - tensorchord/Awesome-LLMOps: An awesome & curated list of best LLMOps tools for developers https://github.com/tensorchord/Awesome-LLMOps 5 comments
Linked pages
- Supporting the Open Source AI Community | Andreessen Horowitz https://a16z.com/2023/08/30/supporting-the-open-source-ai-community/ 110 comments
- https://chat.lmsys.org/ 51 comments
- vLLM: Easy, Fast, and Cheap LLM Serving with PagedAttention https://vllm.ai/ 42 comments
- [2309.06180] Efficient Memory Management for Large Language Model Serving with PagedAttention https://arxiv.org/abs/2309.06180 16 comments
- [2306.00978] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration https://arxiv.org/abs/2306.00978 2 comments
- [2306.07629] SqueezeLLM: Dense-and-Sparse Quantization https://arxiv.org/abs/2306.07629 1 comment
- [2210.17323] GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers https://arxiv.org/abs/2210.17323 0 comments