Hacker News
RWKV: Reinventing RNNs for the Transformer Era — with Eugene Cheah of UIlicious
Linking pages
- $2 H100s: How the GPU Bubble Burst - by Eugene Cheah https://www.latent.space/p/gpu-bubble 289 comments
- The State of Silicon and the GPU Poors - with Dylan Patel of SemiAnalysis https://www.latent.space/p/semianalysis 40 comments
- The End of Finetuning — with Jeremy Howard of Fast.ai https://www.latent.space/p/fastai#details 5 comments
- Doing it the Hard Way: Making the AI engine and language 🔥 of the future — with Chris Lattner of Modular https://www.latent.space/p/modular 2 comments
- The Four Wars of the AI Stack (Dec 2023 Recap) https://www.latent.space/p/dec-2023 0 comments
- The Four Wars of the AI Stack (Dec 2023 Recap): Mixtral sparks a GPU/Inference war https://www.latent.space/i/140396949/mixtral-sparks-a-gpuinference-war 0 comments
- Worthwhile Research for building SOTA LLMs (Jan 2024 Recap) https://www.latent.space/p/jan-2024 0 comments
Linked pages
- The Unreasonable Effectiveness of Recurrent Neural Networks https://karpathy.github.io/2015/05/21/rnn-effectiveness/ 434 comments
- The Pile http://pile.eleuther.ai/ 294 comments
- Edge AI Just Got Faster https://justine.lol/mmap/ 254 comments
- LLaMA2 isn't "Open Source" - and why it doesn't matter https://www.alessiofanelli.com/blog/llama2-isnt-open-source 228 comments
- GitHub - BlinkDL/RWKV-LM: RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it's combining the best of RNN and transformer - great performance, fast inference, saves VRAM, fast training, "infinite" ctx_len, and free sentence embedding. https://github.com/BlinkDL/RWKV-LM 179 comments
- [2305.13048] RWKV: Reinventing RNNs for the Transformer Era https://arxiv.org/abs/2305.13048 171 comments
- Google Gemini Eats The World – Gemini Smashes GPT-4 By 5X, The GPU-Poors https://www.semianalysis.com/p/google-gemini-eats-the-world-gemini 113 comments
- Revealed: The Authors Whose Pirated Books Are Powering Generative AI - The Atlantic https://www.theatlantic.com/technology/archive/2023/08/books3-ai-meta-llama-pirated-books/675063/ 86 comments
- Open challenges in LLM research https://huyenchip.com/2023/08/16/llm-research-open-challenges.html 72 comments
- Stability AI https://stability.ai 69 comments
- Neural Networks: Zero To Hero https://karpathy.ai/zero-to-hero.html 69 comments
- GitHub - EleutherAI/gpt-neox: An implementation of model parallel autoregressive transformers on GPUs, based on the DeepSpeed library. https://github.com/EleutherAI/gpt-neox 67 comments
- The Mathematics of Training LLMs — with Quentin Anthony of Eleuther AI https://www.latent.space/p/transformers-math#details 66 comments
- The RWKV language model: An RNN with the advantages of a transformer | The Good Minima https://johanwind.github.io/2023/03/23/rwkv_overview.html 45 comments
- [2307.08621] Retentive Network: A Successor to Transformer for Large Language Models https://arxiv.org/abs/2307.08621 36 comments
- Monarch Mixer: Revisiting BERT, Without Attention or MLPs · Hazy Research https://hazyresearch.stanford.edu/blog/2023-07-25-m2-bert 32 comments
- [2105.14103] An Attention Free Transformer https://arxiv.org/abs/2105.14103 15 comments
- Vicuna: An Open-Source Chatbot Impressing GPT-4 with 90%* ChatGPT Quality | LMSYS Org https://lmsys.org/blog/2023-03-30-vicuna/ 7 comments
- How the RWKV language model works | The Good Minima https://johanwind.github.io/2023/03/23/rwkv_details.html 6 comments
- UI-licious: Flexible & Intuitive Automated Web Testing Tool https://uilicious.com 5 comments