Linked pages in "Everything about Distributed Training and Efficient Finetuning" (sumanthrh.com)
- Mistral 7B | Mistral AI | Open source models https://mistral.ai/news/announcing-mistral-7b/ 618 comments
- GitHub - ggerganov/llama.cpp: Port of Facebook's LLaMA model in C/C++ https://github.com/ggerganov/llama.cpp 286 comments
- [2005.14165] Language Models are Few-Shot Learners https://arxiv.org/abs/2005.14165 201 comments
- [2305.14314] QLoRA: Efficient Finetuning of Quantized LLMs https://arxiv.org/abs/2305.14314 129 comments
- Google Gemini Eats The World – Gemini Smashes GPT-4 By 5X, The GPU-Poors https://www.semianalysis.com/p/google-gemini-eats-the-world-gemini 113 comments
- vLLM: Easy, Fast, and Cheap LLM Serving with PagedAttention https://vllm.ai/ 42 comments
- GitHub - stas00/ml-engineering: Machine Learning Engineering Online Book https://github.com/stas00/ml-engineering 37 comments
- [2208.07339] LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale https://arxiv.org/abs/2208.07339 33 comments
- How to Train Really Large Models on Many GPUs? | Lil'Log https://lilianweng.github.io/posts/2021-09-25-train-large/ 33 comments
- Transformer Math 101 | EleutherAI Blog https://blog.eleuther.ai/transformer-math/ 13 comments
- New – Amazon EC2 P5 Instances Powered by NVIDIA H100 Tensor Core GPUs for Accelerating Generative AI and HPC Applications | AWS News Blog https://aws.amazon.com/blogs/aws/new-amazon-ec2-p5-instances-powered-by-nvidia-h100-tensor-core-gpus-for-accelerating-generative-ai-and-hpc-applications/ 9 comments
- [2106.09685] LoRA: Low-Rank Adaptation of Large Language Models https://arxiv.org/abs/2106.09685 8 comments
- GitHub - lm-sys/FastChat: An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena. https://github.com/lm-sys/FastChat 4 comments
- cerebras/btlm-3b-8k-base · Hugging Face https://huggingface.co/cerebras/btlm-3b-8k-base 2 comments
- [1710.03740] Mixed Precision Training https://arxiv.org/abs/1710.03740 1 comment
- [2101.06840] ZeRO-Offload: Democratizing Billion-Scale Model Training https://arxiv.org/abs/2101.06840 1 comment
- bfloat16 floating-point format - Wikipedia https://en.wikipedia.org/wiki/Bfloat16_floating-point_format 1 comment
- GitHub - OpenAccess-AI-Collective/axolotl: Go ahead and axolotl questions https://github.com/OpenAccess-AI-Collective/axolotl 1 comment
- ZeRO & DeepSpeed: New system optimizations enable training models with over 100 billion parameters - Microsoft Research https://www.microsoft.com/en-us/research/blog/zero-deepspeed-new-system-optimizations-enable-training-models-with-over-100-billion-parameters/ 0 comments