Hacker News
Linking pages
- RWKV: Reinventing RNNs for the Transformer Era — with Eugene Cheah of UIlicious https://www.latent.space/p/rwkv#%C2%A7the-eleuther-mafia 66 comments
- RAG is a hack - with Jerry Liu from LlamaIndex https://www.latent.space/p/llamaindex#details 36 comments
- Enabling "Maximum Enterprise Utilization" with AI https://www.alessiofanelli.com/posts/maximum-enterprise-utilization 0 comments
- RLHF 201 - with Nathan Lambert of AI2 and Interconnects https://www.latent.space/p/rlhf-201 0 comments
Linked pages
- [2305.13048] RWKV: Reinventing RNNs for the Transformer Era https://arxiv.org/abs/2305.13048 171 comments
- The Rise of the AI Engineer - by swyx & Alessio https://www.latent.space/p/ai-engineer 153 comments
- Turing-NLG: A 17-billion-parameter language model by Microsoft - Microsoft Research https://www.microsoft.com/en-us/research/blog/turing-nlg-a-17-billion-parameter-language-model-by-microsoft/ 139 comments
- Making Deep Learning go Brrrr From First Principles https://horace.io/brrr_intro.html 20 comments
- Transformer Math 101 | EleutherAI Blog https://blog.eleuther.ai/transformer-math/ 13 comments
- Summit - Oak Ridge Leadership Computing Facility https://www.olcf.ornl.gov/summit/ 8 comments
- BLOOM https://huggingface.co/docs/transformers/model_doc/bloom 3 comments
- Frontier supercomputer debuts as world’s fastest, breaking exascale barrier | ORNL https://www.ornl.gov/news/frontier-supercomputer-debuts-worlds-fastest-breaking-exascale-barrier 0 comments
- EleutherAI https://www.eleuther.ai/ 0 comments
- MPT-7B and The Beginning of Context=Infinity — with Jonathan Frankle and Abhinav Venigalla of MosaicML https://www.latent.space/p/mosaic-mpt-7b 0 comments
- FlashAttention 2: making Transformers 800% faster w/o approximation - with Tri Dao of Together AI https://www.latent.space/p/flashattention 0 comments