Hacker News
- Break the Sequential Dependency of LLM Inference Using Lookahead Decoding https://lmsys.org/blog/2023-11-21-lookahead-decoding/ 2 comments
Linking pages
- Consistency Large Language Models: A Family of Efficient Parallel Decoders | Hao AI Lab @ UCSD https://hao-ai-lab.github.io/blogs/cllm/ 98 comments
- How to make LLMs go fast https://vgel.me/posts/faster-inference/ 54 comments
- We Are Running Out of Low-Background Tokens (Nov 2023 Recap) https://www.latent.space/i/139368545/the-concept-of-low-background-tokens 6 comments
- Transformer inference tricks - by Finbarr Timbers https://www.artfintel.com/p/transformer-inference-tricks 0 comments