Linking pages
- The 10,000x Yolo Researcher Metagame — with Yi Tay of Reka https://www.latent.space/p/yitay 4 comments
- ICLR 2024 — Best Papers & Talks (Benchmarks, Reasoning & Agents) — ft. Graham Neubig, Aman Sanger, Moritz Hardt https://www.latent.space/p/iclr-2024-benchmarks-agents 0 comments
- How To Hire AI Engineers - by Adam Wiggins and James Brady https://www.latent.space/p/hiring 0 comments
- State of the Art: Training >70B LLMs on 10,000 H100 clusters https://www.latent.space/p/llm-training-2024 0 comments
Linked pages
- Introducing Gemini 1.5, Google's next-generation AI model https://blog.google/technology/ai/google-gemini-next-generation-model-february-2024/ 715 comments
- GitHub - smol-ai/developer: smol developer that writes code for u https://github.com/smol-ai/developer 138 comments
- GitHub - 01-ai/Yi: A series of large language models trained from scratch by developers @01-ai https://github.com/01-ai/Yi 52 comments
- [2310.01889] Ring Attention with Blockwise Transformers for Near-Infinite Context https://arxiv.org/abs/2310.01889 20 comments
- [2106.09685] LoRA: Low-Rank Adaptation of Large Language Models https://arxiv.org/abs/2106.09685 8 comments
- Gradient https://gradient.ai/ 8 comments
- SlimPajama: A 627B token cleaned and deduplicated version of RedPajama - Cerebras https://www.cerebras.net/blog/slimpajama-a-627b-token-cleaned-and-deduplicated-version-of-redpajama 7 comments
- WebSim, WorldSim, and The Summer of Simulative AI — with Joscha Bach of Liquid AI, Karan Malhotra of Nous Research, Rob Haisfield of WebSim.ai https://www.latent.space/p/sim-ai 7 comments
- Rotary Embeddings: A Relative Revolution | EleutherAI Blog https://blog.eleuther.ai/rotary-embeddings/ 1 comment
- [2401.16380] Rephrasing the Web: A Recipe for Compute and Data-Efficient Language Modeling https://arxiv.org/abs/2401.16380 1 comment
- Emulating Humans with NSFW Chatbots - with Jesse Silver https://www.latent.space/p/nsfw-chatbots 1 comment
- Latent Space | swyx | Substack https://www.latent.space/ 0 comments
- MPT-7B and The Beginning of Context=Infinity — with Jonathan Frankle and Abhinav Venigalla of MosaicML https://www.latent.space/p/mosaic-mpt-7b 0 comments
- GitHub - FanaHOVA/smol-podcaster: smol-podcaster is your autonomous podcast production intern 🐣 https://github.com/FanaHOVA/smol-podcaster 0 comments
- GitHub - gkamradt/LLMTest_NeedleInAHaystack: Doing simple retrieval from LLM models at various context lengths to measure accuracy https://github.com/gkamradt/LLMTest_NeedleInAHaystack 0 comments
- [2401.02954] DeepSeek LLM: Scaling Open-Source Language Models with Longtermism https://arxiv.org/abs/2401.02954 0 comments
- The Four Wars of the AI Stack (Dec 2023 Recap) https://www.latent.space/p/dec-2023 0 comments
- ICLR 2024 — Best Papers & Talks (ImageGen, Vision, Transformers, State Space Models) ft. Christian Szegedy, Ilya Sutskever https://www.latent.space/p/iclr-2024-recap 0 comments