Hacker News
Pages linking to this article:
- Why Open Source AI Will Win - by Varun - Public Experiments https://varunshenoy.substack.com/p/why-open-source-ai-will-win 174 comments
- Serving LLM 24x Faster On the Cloud with vLLM and SkyPilot | SkyPilot Blog https://blog.skypilot.co/serving-llm-24x-faster-on-the-cloud-with-vllm-and-skypilot/ 1 comment
- Everything about Distributed Training and Efficient Finetuning | Sumanth's Personal Website https://sumanthrh.com/post/distributed-and-efficient-finetuning/ 1 comment
- GitHub - bentoml/BentoVLLM: Self-host LLMs with vLLM and BentoML https://github.com/bentoml/BentoVLLM 1 comment
- The Easiest Part of LLM Applications is the LLM https://generatingconversation.substack.com/p/the-easiest-part-of-llm-applications 0 comments
- High throughput LLM inference with vLLM and AMD: Achieving LLM inference parity with Nvidia https://embeddedllm.com/blog/vllm_rocm/ 0 comments
- Understanding how LLM inference works with llama.cpp https://www.omrimallis.com/posts/understanding-how-llm-inference-works-with-llama-cpp/ 0 comments
- Introducing SkyServe: 50% Cheaper AI Serving on Any Cloud with High Availability | SkyPilot Blog https://blog.skypilot.co/introducing-sky-serve/ 0 comments
- Welcome to vLLM! — vLLM https://docs.vllm.ai/en/latest/ 0 comments
- GitHub - xingyaoww/code-act: Official Repo for ICML 2024 paper "Executable Code Actions Elicit Better LLM Agents" by Xingyao Wang, Yangyi Chen, Lifan Yuan, Yizhe Zhang, Yunzhu Li, Hao Peng, Heng Ji. https://github.com/xingyaoww/code-act 0 comments
Linked article: vLLM: Easy, Fast, and Cheap LLM Serving with PagedAttention