Linking pages
- The Rise of the AI Engineer - by swyx & Alessio https://www.latent.space/p/ai-engineer 153 comments
- The Mathematics of Training LLMs — with Quentin Anthony of Eleuther AI https://www.latent.space/p/transformers-math#details 66 comments
- RWKV: Reinventing RNNs for the Transformer Era — with Eugene Cheah of UIlicious https://www.latent.space/p/rwkv#%C2%A7the-eleuther-mafia 66 comments
- Code Interpreter == GPT 4.5 (w/ Simon Willison & Alex Volkov) https://www.latent.space/p/code-interpreter 4 comments
- The Winds of AI Winter - Latent Space https://www.latent.space/p/mar-jun-2024 2 comments
- AI Fundamentals: Datasets 101 - Latent Space https://www.latent.space/p/datasets-101 1 comment
- How to train a Million Context LLM — with Mark Huang of Gradient.ai https://www.latent.space/p/gradient 1 comment
- FlashAttention 2: making Transformers 800% faster w/o approximation - with Tri Dao of Together AI https://www.latent.space/p/flashattention 0 comments
- State of the Art: Training >70B LLMs on 10,000 H100 clusters https://www.latent.space/p/llm-training-2024 0 comments
Linked pages
- [2212.14052] Hungry Hungry Hippos: Towards Language Modeling with State Space Models https://arxiv.org/abs/2212.14052 54 comments
- Homepage | Cerebras https://www.cerebras.net/ 53 comments
- [1803.03635] The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks https://arxiv.org/abs/1803.03635 32 comments
- [2108.12409] Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation https://arxiv.org/abs/2108.12409 17 comments
- Introducing MPT-7B: A New Standard for Open-Source, Commercially Usable LLMs https://www.mosaicml.com/blog/mpt-7b 11 comments
- [2205.14135] FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness https://arxiv.org/abs/2205.14135 3 comments
- GitHub - NVIDIA/FasterTransformer: Transformer related optimization, including BERT, GPT https://github.com/NVIDIA/FasterTransformer/ 1 comment
- tatsu-lab/alpaca · Datasets at Hugging Face https://huggingface.co/datasets/tatsu-lab/alpaca 1 comment
- bfloat16 floating-point format - Wikipedia https://en.wikipedia.org/wiki/Bfloat16_floating-point_format 1 comment
- mosaicml/mpt-7b-storywriter · Hugging Face https://huggingface.co/mosaicml/mpt-7b-storywriter 0 comments
- GitHub - mosaicml/llm-foundry https://github.com/mosaicml/llm-foundry 0 comments