StreamingLLM: Efficient streaming technique enables infinite sequence lengths - discu.eu

Hacker News

StreamingLLM: Efficient streaming technique enables infinite sequence lengths https://arxiv.org/abs/2309.17453 12 comments 3/10/2023

Linking pages

GitHub - mit-han-lab/streaming-llm: Efficient Streaming Language Models with Attention Sinks https://github.com/mit-han-lab/streaming-llm 65 comments
GitHub - HazyResearch/aisys-building-blocks: Building blocks for foundation models. https://github.com/HazyResearch/aisys-building-blocks 1 comment
GitHub - intel/intel-extension-for-transformers: ⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Platforms⚡ https://github.com/intel/intel-extension-for-transformers 0 comments
GitHub - AIoT-MLSys-Lab/Efficient-LLMs-Survey: Efficient Large Language Models: A Survey https://github.com/AIoT-MLSys-Lab/Efficient-LLMs-Survey 0 comments
