ICLR 2024 — Best Papers & Talks (ImageGen, Vision, Transformers, State Space Models) ft. Christian Szegedy, Ilya Sutskever

Linking pages

The 10,000x Yolo Researcher Metagame — with Yi Tay of Reka https://www.latent.space/p/yitay 4 comments
How to train a Million Context LLM — with Mark Huang of Gradient.ai https://www.latent.space/p/gradient 1 comment
ICLR 2024 — Best Papers & Talks (Benchmarks, Reasoning & Agents) — ft. Graham Neubig, Aman Sanger, Moritz Hardt https://www.latent.space/p/iclr-2024-benchmarks-agents 0 comments
How To Hire AI Engineers - by Adam Wiggins and James Brady https://www.latent.space/p/hiring 0 comments
State of the Art: Training >70B LLMs on 10,000 H100 clusters https://www.latent.space/p/llm-training-2024 0 comments

Linked pages

[2310.02226] Think before you speak: Training Language Models With Pause Tokens https://arxiv.org/abs/2310.02226 107 comments
[2306.11644] Textbooks Are All You Need https://arxiv.org/abs/2306.11644 106 comments
[2402.13753] LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens https://arxiv.org/abs/2402.13753 46 comments
Why Google failed to make GPT-3 + why Multimodality for Knowledge Work is the path to AGI - with David Luan of Adept https://www.latent.space/p/adept 40 comments
[2404.07143] Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention https://arxiv.org/abs/2404.07143 40 comments
[2310.01889] Ring Attention with Blockwise Transformers for Near-Infinite Context https://arxiv.org/abs/2310.01889 20 comments
ICLR 2023 https://iclr.cc 10 comments
[2309.16588] Vision Transformers Need Registers https://arxiv.org/abs/2309.16588 9 comments
WebSim, WorldSim, and The Summer of Simulative AI — with Joscha Bach of Liquid AI, Karan Malhotra of Nous Research, Rob Haisfield of WebSim.ai https://www.latent.space/p/sim-ai 7 comments
[1312.6114] Auto-Encoding Variational Bayes https://arxiv.org/abs/1312.6114 4 comments
Things I’m Learning While Training SuperHOT | kaiokendev.github.io https://kaiokendev.github.io/til 2 comments
[2401.12945] Lumiere: A Space-Time Diffusion Model for Video Generation https://arxiv.org/abs/2401.12945 2 comments
[2309.10668] Language Modeling Is Compression https://arxiv.org/abs/2309.10668 1 comment
Emulating Humans with NSFW Chatbots - with Jesse Silver https://www.latent.space/p/nsfw-chatbots 1 comment
[1312.6199] Intriguing properties of neural networks http://arxiv.org/abs/1312.6199 0 comments
Latent Space | swyx | Substack https://www.latent.space/ 0 comments
[2310.00785] BooookScore: A systematic exploration of book-length summarization in the era of LLMs https://arxiv.org/abs/2310.00785 0 comments
NeurIPS 2023 Recap — Best Papers - by swyx - Latent Space https://www.latent.space/p/neurips-2023-papers 0 comments
ICLR 2024 Outstanding Paper Awards – ICLR Blog https://blog.iclr.cc/2024/05/06/iclr-2024-outstanding-paper-awards/ 0 comments