Linking pages
- A Picture is Worth 170 Tokens: How Does GPT-4o Encode Images? - OranLooney.com https://www.oranlooney.com/post/gpt-cnn/ 112 comments
- GPT-J-6B: 6B JAX-Based Transformer – Aran Komatsuzaki https://arankomatsuzaki.wordpress.com/2021/06/04/gpt-j/ 79 comments
- You could have designed state of the art positional encoding https://fleetwood.dev/posts/you-could-have-designed-SOTA-positional-encoding 46 comments
- Transformers for software engineers - Made of Bugs https://blog.nelhage.com/post/transformers-for-software-engineers/ 20 comments
- Meta quietly releases Llama 2 Long AI model | VentureBeat https://venturebeat.com/ai/meta-quietly-releases-llama-2-long-ai-that-outperforms-gpt-3-5-and-claude-2-on-some-tasks/ 12 comments
- How to convert the SalesForce CodeGen models to GPT-J · GitHub https://gist.github.com/moyix/7896575befbe1b99162ccfec8d135566 3 comments
- GitHub - Const-me/Cgml: GPU-targeted vendor-agnostic AI library for Windows, and Mistral model implementation. https://github.com/Const-me/Cgml 1 comment
- How to train a Million Context LLM — with Mark Huang of Gradient.ai https://www.latent.space/p/gradient 1 comment
- Gradient Update #1: FBI Usage of Facial Recognition and Rotary Embeddings For Large LM's https://thegradientpub.substack.com/p/update-1-fbi-usage-of-facial-recognition 0 comments
- Transformer Taxonomy (the last lit review) | kipply's blog https://kipp.ly/blog/transformer-taxonomy/ 0 comments
- LLaMA-2 from the Ground Up - by Cameron R. Wolfe, Ph.D. https://cameronrwolfe.substack.com/p/llama-2-from-the-ground-up 0 comments
- Dolma, OLMo, and the Future of Open-Source LLMs https://cameronrwolfe.substack.com/p/dolma-olmo-and-the-future-of-open 0 comments
- GitHub - likejazz/llama3.np: llama3.np is a pure NumPy implementation of the Llama 3 model. https://github.com/likejazz/llama3.np 0 comments
Linked pages
- [2005.14165] Language Models are Few-Shot Learners https://arxiv.org/abs/2005.14165 201 comments
- GitHub - kingoflolz/mesh-transformer-jax: Model parallel transformers in JAX and Haiku https://github.com/kingoflolz/mesh-transformer-jax 146 comments
- [1706.03762] Attention Is All You Need https://arxiv.org/abs/1706.03762 145 comments
- GitHub - EleutherAI/gpt-neo: An implementation of model parallel GPT-2 and GPT-3-style models using the mesh-tensorflow library. https://github.com/EleutherAI/gpt-neo/ 127 comments
- [2101.00027] The Pile: An 800GB Dataset of Diverse Text for Language Modeling https://arxiv.org/abs/2101.00027 81 comments
- GitHub - EleutherAI/gpt-neox: An implementation of model parallel autoregressive transformers on GPUs, based on the DeepSpeed library. https://github.com/EleutherAI/gpt-neox 67 comments
- GitHub - lucidrains/x-transformers: A simple but complete full-attention transformer with a set of promising experimental features from various papers https://github.com/lucidrains/x-transformers 40 comments
- [2104.09864] RoFormer: Enhanced Transformer with Rotary Position Embedding https://arxiv.org/abs/2104.09864 8 comments
- [2104.04473] Efficient Large-Scale Language Model Training on GPU Clusters Using Megatron-LM https://arxiv.org/abs/2104.04473 1 comment
- [1910.10683] Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer https://arxiv.org/abs/1910.10683 1 comment