Linking pages
- Serve with vLLM - Outlines 〰️ https://outlines-dev.github.io/outlines/reference/vllm/ 3 comments
- Beat GPT-4o at Python by searching with 100 dumb LLaMAs | Modal Blog https://modal.com/blog/llama-human-eval 2 comments
- GitHub - stacklok/codegate: CodeGate: CodeGen Privacy and Security https://github.com/stacklok/codegate 1 comment
- Unbowed, Unbent, Unbroken – Decoder Only https://decoderonlyblog.wordpress.com/2024/04/19/unbowed-unbent-unbroken/ 0 comments
- Training with Big Data on Any Cloud | Tigris Object Storage https://www.tigrisdata.com/blog/training-any-cloud/ 0 comments
- Red Hat Completes Acquisition of Neural Magic to Fuel Optimized Generative AI Innovation Across the Hybrid Cloud https://www.redhat.com/en/about/press-releases/red-hat-completes-acquisition-neural-magic-fuel-optimized-generative-ai-innovation-across-hybrid-cloud 0 comments
- structured decoding, a guide for the impatient https://aarnphm.xyz/posts/structured-decoding 0 comments
- AI Developer Frameworks and the evolving AI infrastructure ecosystem https://www.thelis.org/blog/ai-dev-ecosystem 0 comments
Linked pages
- vLLM: Easy, Fast, and Cheap LLM Serving with PagedAttention https://vllm.ai/ 42 comments
- How continuous batching enables 23x throughput in LLM inference while reducing p50 latency | Anyscale https://www.anyscale.com/blog/continuous-batching-llm-inference 20 comments
- [2309.06180] Efficient Memory Management for Large Language Model Serving with PagedAttention https://arxiv.org/abs/2309.06180 16 comments
- [2306.00978] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration https://arxiv.org/abs/2306.00978 2 comments
- [2306.07629] SqueezeLLM: Dense-and-Sparse Quantization https://arxiv.org/abs/2306.07629 1 comment
- [2210.17323] GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers https://arxiv.org/abs/2210.17323 0 comments
- GitHub - vllm-project/vllm: A high-throughput and memory-efficient inference and serving engine for LLMs https://github.com/vllm-project/vllm 0 comments