Hacker News
- Consistency LLM: converting LLMs to parallel decoders accelerates inference 3.5x https://hao-ai-lab.github.io/blogs/cllm/ 97 comments
Linked pages
- Break the Sequential Dependency of LLM Inference Using Lookahead Decoding | LMSYS Org https://lmsys.org/blog/2023-11-21-lookahead-decoding/ 2 comments
- [2306.13649] GKD: Generalized Knowledge Distillation for Auto-regressive Sequence Models https://arxiv.org/abs/2306.13649#deepmind 1 comment
- [2303.01469] Consistency Models https://arxiv.org/abs/2303.01469 0 comments
Full title: Consistency Large Language Models: A Family of Efficient Parallel Decoders | Hao AI Lab @ UCSD