Linking pages
- My AI Timelines Have Sped Up (Again) https://www.alexirpan.com/2024/01/10/ai-timelines-2024.html 95 comments
- GitHub - jzhang38/TinyLlama https://github.com/jzhang38/TinyLlama 60 comments
- GitHub - 01-ai/Yi: A series of large language models trained from scratch by developers @01-ai https://github.com/01-ai/Yi 52 comments
- GitHub - QwenLM/Qwen: The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud. https://github.com/QwenLM/Qwen 51 comments
- GitHub - tspeterkim/flash-attention-minimal: Flash Attention in ~100 lines of CUDA (forward pass only) https://github.com/tspeterkim/flash-attention-minimal 41 comments
- GitHub - THUDM/LongWriter: LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs https://github.com/THUDM/LongWriter 29 comments
- GitHub - linkedin/Liger-Kernel: Efficient Triton Kernels for LLM Training https://github.com/linkedin/Liger-Kernel 19 comments
- Medical large language models are vulnerable to data-poisoning attacks | Nature Medicine https://www.nature.com/articles/s41591-024-03445-1 7 comments
- Llemma: An Open Language Model For Mathematics | EleutherAI Blog https://blog.eleuther.ai/llemma/ 6 comments
- GitHub - pjlab-sys4nlp/llama-moe: ⛷️ LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training https://github.com/pjlab-sys4nlp/llama-moe 6 comments
- distributed-training-guide/06-training-llama-405b at main · LambdaLabsML/distributed-training-guide · GitHub https://github.com/LambdaLabsML/distributed-training-guide/tree/main/06-training-llama-405b 4 comments
- GitHub - Alpha-VLLM/LLaMA2-Accessory: An Open-source Toolkit for LLM Development https://github.com/Alpha-VLLM/LLaMA2-Accessory 3 comments
- GitHub - QwenLM/Qwen-7B: The official repo of Qwen-7B (通义千问-7B) chat & pretrained large language model proposed by Alibaba Cloud. https://github.com/QwenLM/Qwen-7B 1 comment
- ALiBi FlashAttention - Speeding up ALiBi by 3-5x with a hardware-efficient implementation | Princeton Language and Intelligence https://pli.princeton.edu/blog/2024/alibi-flashattention-speeding-alibi-3-5x-hardware-efficient-implementation 1 comment
- GitHub - GreenBitAI/green-bit-llm: A toolkit for fine-tuning, inferencing, and evaluating GreenBitAI's LLMs. https://github.com/GreenBitAI/green-bit-llm 1 comment
- Subclassing wheel builds for fun and profit | Pierce Freeman https://freeman.vc/notes/subclassing-wheel-builds-for-fun-and-profit 0 comments
- Accelerating Generative AI with PyTorch: Segment Anything, Fast | PyTorch https://pytorch.org/blog/accelerating-generative-ai/ 0 comments
- GitHub - AIoT-MLSys-Lab/Efficient-LLMs-Survey: Efficient Large Language Models: A Survey https://github.com/AIoT-MLSys-Lab/Efficient-LLMs-Survey 0 comments
- GitHub - RichardKelley/dendron: A library for building software agents using behavior trees and language models. https://github.com/RichardKelley/dendron 0 comments
- unilm/kosmos-2.5 at master · microsoft/unilm · GitHub https://github.com/microsoft/unilm/tree/master/kosmos-2.5 0 comments
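Several of the entries above (flash-attention-minimal, the ALiBi FlashAttention post, Liger-Kernel) center on attention kernels. For orientation, here is the exact attention those kernels compute, written naively in PyTorch; this is a minimal sketch for reference, not code from any of the linked repos:

```python
# Reference (naive) scaled dot-product attention: softmax(Q K^T / sqrt(d)) V.
# FlashAttention computes exactly this result, but tiled in on-chip SRAM so
# the full (seqlen x seqlen) score matrix is never materialized in GPU memory.
import math
import torch

def naive_attention(q, k, v):
    # q, k, v: (batch, nheads, seqlen, headdim)
    d = q.shape[-1]
    scores = q @ k.transpose(-2, -1) / math.sqrt(d)  # (batch, nheads, seqlen, seqlen)
    return torch.softmax(scores, dim=-1) @ v         # (batch, nheads, seqlen, headdim)

q = k = v = torch.randn(1, 8, 256, 64)
out = naive_attention(q, k, v)
print(out.shape)  # torch.Size([1, 8, 256, 64])
```

The "forward pass only" caveat in the flash-attention-minimal entry refers to the first half of this computation; the backward pass additionally requires recomputing the score tiles during backpropagation.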
Linked pages
- Mistral 7B | Mistral AI | Open source models https://mistral.ai/news/announcing-mistral-7b/ 618 comments
- We’re Training AI Twice as Fast This Year as Last - IEEE Spectrum https://spectrum.ieee.org/mlperf-rankings-2022 35 comments
- [2309.06180] Efficient Memory Management for Large Language Model Serving with PagedAttention https://arxiv.org/abs/2309.06180 16 comments
- [2205.14135] FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness https://arxiv.org/abs/2205.14135 3 comments
- Mistral AI | Open source models https://mistral.ai/ 1 comment
- FlashAttention-3: Fast and Accurate Attention with Asynchrony and Low-precision | Tri Dao https://tridao.me/blog/2024/flash3/ 0 comments
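All of these pages link to or from the Dao-AILab/flash-attention repository. As a quick orientation, this is a minimal sketch of calling its fused kernel through the `flash_attn` Python package; the call follows the repo's documented `flash_attn_func` interface, but the shapes, dtypes, and values here are illustrative assumptions, not canonical usage:

```python
# Minimal sketch: invoking the FlashAttention fused kernel via the
# flash_attn package (pip install flash-attn). Requires a CUDA GPU;
# inputs must be fp16 or bf16 per the repo's README.
import torch
from flash_attn import flash_attn_func

batch, seqlen, nheads, headdim = 2, 1024, 16, 64  # assumed example sizes
# flash_attn_func expects (batch, seqlen, nheads, headdim) layout.
q = torch.randn(batch, seqlen, nheads, headdim, dtype=torch.float16, device="cuda")
k = torch.randn_like(q)
v = torch.randn_like(q)

# One fused kernel computes softmax(Q K^T / sqrt(headdim)) V without
# materializing the seqlen x seqlen attention matrix.
out = flash_attn_func(q, k, v, dropout_p=0.0, causal=True)
print(out.shape)  # (2, 1024, 16, 64)
```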