Linked pages
- GitHub - BlinkDL/RWKV-LM: RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it's combining the best of RNN and transformer - great performance, fast inference, saves VRAM, fast training, "infinite" ctx_len, and free sentence embedding. https://github.com/BlinkDL/RWKV-LM 179 comments (a simplified recurrence sketch follows this list)
- [2305.13048] RWKV: Reinventing RNNs for the Transformer Era https://arxiv.org/abs/2305.13048 171 comments
- [2305.14314] QLoRA: Efficient Finetuning of Quantized LLMs https://arxiv.org/abs/2305.14314 129 comments
- From Deep to Long Learning? · Hazy Research https://hazyresearch.stanford.edu/blog/2023-03-27-long-learning 124 comments
- Paving the way to efficient architectures: StripedHyena-7B, open source models offering a glimpse into a world beyond Transformers https://www.together.ai/blog/stripedhyena-7b 72 comments
- GitHub - EleutherAI/gpt-neox: An implementation of model parallel autoregressive transformers on GPUs, based on the DeepSpeed library. https://github.com/EleutherAI/gpt-neox 67 comments
- QuIP# https://cornell-relaxml.github.io/quip-sharp/ 59 comments
- [2212.14052] Hungry Hungry Hippos: Towards Language Modeling with State Space Models https://arxiv.org/abs/2212.14052 54 comments
- [2312.00752] Mamba: Linear-Time Sequence Modeling with Selective State Spaces https://arxiv.org/abs/2312.00752 42 comments
- GitHub - srush/GPU-Puzzles: Solve puzzles. Learn CUDA. https://github.com/srush/GPU-Puzzles 40 comments
- [1410.5401] Neural Turing Machines https://arxiv.org/abs/1410.5401 40 comments
- [2112.05682] Self-attention Does Not Need $O(n^2)$ Memory https://arxiv.org/abs/2112.05682 37 comments (a chunked-attention sketch follows this list)
- [2303.06865] High-throughput Generative Inference of Large Language Models with a Single GPU https://arxiv.org/abs/2303.06865 36 comments
- [2307.08621] Retentive Network: A Successor to Transformer for Large Language Models https://arxiv.org/abs/2307.08621 36 comments
- [1803.03635] The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks https://arxiv.org/abs/1803.03635 32 comments
- Batch computing and the coming age of AI systems · Hazy Research https://hazyresearch.stanford.edu/blog/2023-04-12-batch 32 comments
- Monarch Mixer: Revisiting BERT, Without Attention or MLPs · Hazy Research https://hazyresearch.stanford.edu/blog/2023-07-25-m2-bert 32 comments
- Making Deep Learning go Brrrr From First Principles https://horace.io/brrr_intro.html 20 comments
- [2310.01889] Ring Attention with Blockwise Transformers for Near-Infinite Context https://arxiv.org/abs/2310.01889 20 comments
- [2309.06180] Efficient Memory Management for Large Language Model Serving with PagedAttention https://arxiv.org/abs/2309.06180 16 comments
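The BlinkDL/RWKV-LM entry above describes an RNN that reaches transformer-level performance yet can be trained in parallel like a GPT. The sketch below shows the kind of decaying weighted-average recurrence the RWKV README describes, in its cheap sequential (inference) form. It is a hedged simplification: the function name `wkv_recurrence`, the plain NumPy loop, and the parameter names `w` and `u` are illustrative assumptions, not the repository's actual kernel, which also uses a running-max trick for numerical stability and a parallel formulation for training.

```python
import numpy as np

def wkv_recurrence(k, v, w, u):
    """Simplified RWKV-style "WKV" mixing, sequential form (a sketch, not the real kernel).

    k, v : (T, D) key and value sequences
    w    : (D,)  positive per-channel decay rate (state is multiplied by exp(-w) each step)
    u    : (D,)  "bonus" added to the current token's key

    Each output is a softmax-like weighted average of past values, where the
    weight of a past token decays exponentially with its distance.
    """
    T, D = k.shape
    num = np.zeros(D)            # running sum of decayed exp(k_i) * v_i over past tokens
    den = np.zeros(D)            # running sum of decayed exp(k_i) over past tokens
    out = np.zeros((T, D))
    for t in range(T):
        # current token contributes with an extra bonus u; past tokens come from the state
        cur = np.exp(u + k[t])
        out[t] = (num + cur * v[t]) / (den + cur)
        # fold the current token into the state, decaying the old state by exp(-w)
        num = np.exp(-w) * num + np.exp(k[t]) * v[t]
        den = np.exp(-w) * den + np.exp(k[t])
    return out
```

The per-channel state (`num`, `den`) is all that has to be carried between tokens, which is why inference cost and memory stay constant in sequence length; the parallel, GPT-like training mode comes from evaluating the same decayed sums over all positions at once.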
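The arXiv:2112.05682 entry above claims that attention's softmax can be accumulated incrementally, so the full n × n score matrix never needs to be materialized. Below is a hedged NumPy sketch of that idea using the standard online-softmax rescaling trick; `chunked_attention` and the `chunk` parameter are illustrative names, and this is a simplified single-device version rather than the paper's exact algorithm (the same accumulation underlies blockwise schemes such as Ring Attention, also linked above).

```python
import numpy as np

def chunked_attention(q, k, v, chunk=128):
    """Attention computed over key/value chunks with a running max and sum.

    q : (nq, d) queries; k, v : (nk, d) keys and values. Returns (nq, d).
    Only an (nq, chunk) block of scores exists at any time, instead of (nq, nk).
    """
    nq, d = q.shape
    nk = k.shape[0]
    scale = 1.0 / np.sqrt(d)

    running_max = np.full((nq, 1), -np.inf)   # running max score per query
    running_sum = np.zeros((nq, 1))           # running sum of exp(score - max)
    acc = np.zeros((nq, d))                   # running sum of exp(score - max) @ v

    for start in range(0, nk, chunk):
        k_c = k[start:start + chunk]
        v_c = v[start:start + chunk]
        scores = (q @ k_c.T) * scale                      # only one chunk of scores in memory
        new_max = np.maximum(running_max, scores.max(axis=1, keepdims=True))
        correction = np.exp(running_max - new_max)        # rescale old accumulators to new max
        p = np.exp(scores - new_max)
        running_sum = running_sum * correction + p.sum(axis=1, keepdims=True)
        acc = acc * correction + p @ v_c
        running_max = new_max

    return acc / running_sum
```

For any chunk size the result matches ordinary softmax attention; only peak memory changes, from O(nq * nk) for the full score matrix to O(nq * chunk) per block.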