distributed-training-guide/06-training-llama-405b at main · LambdaLabsML/distributed-training-guide · GitHub - discu.eu

Hacker News

Show HN: How to guide on training Llama-405B using PyTorch distributed APIs https://github.com/LambdaLabsML/distributed-training-guide/tree/main/06-training-llama-405b 4 comments 15/10/2024

Linked pages

[2205.14135] FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness https://arxiv.org/abs/2205.14135 3 comments
[1604.06174] Training Deep Nets with Sublinear Memory Cost https://arxiv.org/abs/1604.06174 0 comments
GitHub - Dao-AILab/flash-attention: Fast and memory-efficient exact attention https://github.com/Dao-AILab/flash-attention 0 comments
