Hacker News
- FlexAttention: The Flexibility of PyTorch with the Performance of FlashAttention https://pytorch.org/blog/flexattention/ 24 comments
Linking pages
- GitHub - Ligo-Biosciences/AlphaFold3: Open source implementation of AlphaFold3 https://github.com/Ligo-Biosciences/AlphaFold3 37 comments
- CUDA-Free Inference for LLMs | PyTorch https://pytorch.org/blog/cuda-free-inference-for-llms/ 0 comments
- PyTorch 2.5 Release Blog | PyTorch https://pytorch.org/blog/pytorch2-5/ 0 comments
Linked pages
- Mistral 7B | Mistral AI | Open source models https://mistral.ai/news/announcing-mistral-7b/ 618 comments
- [2310.06825] Mistral 7B https://arxiv.org/abs/2310.06825 124 comments
- [2108.12409] Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation https://arxiv.org/abs/2108.12409 17 comments
- [2309.06180] Efficient Memory Management for Large Language Model Serving with PagedAttention https://arxiv.org/abs/2309.06180 16 comments
- [1910.10683] Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer https://arxiv.org/abs/1910.10683 1 comment
- Fast and Expressive LLM Inference with RadixAttention and SGLang | LMSYS Org https://lmsys.org/blog/2024-01-17-sglang/ 0 comments
- [2407.07726] PaliGemma: A versatile 3B VLM for transfer https://arxiv.org/abs/2407.07726 0 comments