GitHub - vllm-project/vllm: A high-throughput and memory-efficient inference and serving engine for LLMs

Linking pages

GitHub - deepseek-ai/DeepSeek-R1 https://github.com/deepseek-ai/DeepSeek-R1 663 comments
Mistral 7B | Mistral AI | Open source models https://mistral.ai/news/announcing-mistral-7b/ 618 comments
Building a fully local AI smart home assistant | John's Website https://johnthenerd.com/blog/local-llm-assistant/ 186 comments
2:4 Sparse Llama: Smaller Models for Efficient GPU Inference https://neuralmagic.com/blog/24-sparse-llama-smaller-models-for-efficient-gpu-inference/ 132 comments
Hello Qwen2 | Qwen https://qwenlm.github.io/blog/qwen2/ 130 comments
Qwen2.5-1M: Deploy Your Own Qwen with Context Length up to 1M Tokens | Qwen https://qwenlm.github.io/blog/qwen2.5-1m/ 108 comments
Accelerating Generative AI with PyTorch II: GPT, Fast | PyTorch https://pytorch.org/blog/accelerating-generative-ai-2/ 69 comments
open-infra-index/OpenSourcing_DeepSeek_Inference_Engine at main · deepseek-ai/open-infra-index · GitHub https://github.com/deepseek-ai/open-infra-index/tree/main/OpenSourcing_DeepSeek_Inference_Engine 63 comments
GitHub - jzhang38/TinyLlama https://github.com/jzhang38/TinyLlama 60 comments
vLLM: Easy, Fast, and Cheap LLM Serving with PagedAttention https://vllm.ai/ 42 comments
What I learned from looking at 900 most popular open source AI tools https://huyenchip.com/2024/03/14/ai-oss.html 41 comments
Qwen2.5: A Party of Foundation Models! | Qwen https://qwenlm.github.io/blog/qwen2.5/ 38 comments
GitHub - microsoft/aici: AICI: Prompts as (Wasm) Programs https://github.com/microsoft/aici 36 comments
GitHub - THUDM/LongWriter: LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs https://github.com/THUDM/LongWriter 29 comments
GitHub - punica-ai/punica: Serving multiple LoRA finetuned LLM as one https://github.com/punica-ai/punica 26 comments
GitHub - imoneoi/openchat: OpenChat: Advancing Open-source Language Models with Imperfect Data https://github.com/imoneoi/openchat 25 comments
Announcing Pixtral 12B | Mistral AI | Frontier AI in your hands https://mistral.ai/news/pixtral-12b/ 25 comments
GitHub - sail-sg/understand-r1-zero: Understanding R1-Zero-Like Training: A Critical Perspective https://github.com/sail-sg/understand-r1-zero 21 comments
GitHub - S-LoRA/S-LoRA: S-LoRA: Serving Thousands of Concurrent LoRA Adapters https://github.com/S-LoRA/S-LoRA 20 comments
Announcing the llm-d community! | llm-d https://llm-d.ai/blog/llm-d-announce 15 comments

Linked pages