GitHub - EleutherAI/lm-evaluation-harness: A framework for few-shot evaluation of autoregressive language models.

Linking pages

GitHub - openlm-research/open_llama: OpenLLaMA, a permissively licensed open source reproduction of Meta AI’s LLaMA 7B trained on the RedPajama dataset https://github.com/openlm-research/open_llama 183 comments
GitHub - kingoflolz/mesh-transformer-jax: Model parallel transformers in JAX and Haiku https://github.com/kingoflolz/mesh-transformer-jax 146 comments
2:4 Sparse Llama: Smaller Models for Efficient GPU Inference https://neuralmagic.com/blog/24-sparse-llama-smaller-models-for-efficient-gpu-inference/ 132 comments
GitHub - EleutherAI/gpt-neo: An implementation of model parallel GPT-2 and GPT-3-style models using the mesh-tensorflow library. https://github.com/EleutherAI/gpt-neo/ 127 comments
GitHub - EleutherAI/gpt-neox: An implementation of model parallel autoregressive transformers on GPUs, based on the DeepSpeed library. https://github.com/EleutherAI/gpt-neox 67 comments
Normcore LLM Reads · GitHub https://gist.github.com/veekaybee/be375ab33085102f9027853128dc5f0e 54 comments
GitHub - pytorch/torchchat: Run PyTorch LLMs locally on servers, desktop and mobile https://github.com/pytorch/torchchat 41 comments
Finetuning LLMs with LoRA and QLoRA: Insights from Hundreds of Experiments - Lightning AI https://lightning.ai/pages/community/lora-insights/ 39 comments
Pile-T5 | EleutherAI Blog https://blog.eleuther.ai/pile-t5/ 15 comments
GitHub - mlabonne/llm-course: Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks. https://github.com/mlabonne/llm-course 10 comments
🦅 EagleX 1.7T : Soaring past LLaMA 7B 2T in both English and Multi-lang evals (RWKV-v5) https://substack.recursal.ai/p/eaglex-17t-soaring-past-llama-7b 9 comments
All about evaluating Large language models https://explodinggradients.com/all-about-evaluating-large-language-models 8 comments
GitHub - eth-sri/language-model-arithmetic: Controlled Text Generation via Language Model Arithmetic https://github.com/eth-sri/language-model-arithmetic 8 comments
GitHub - jquesnelle/yarn: YaRN: Efficient Context Window Extension of Large Language Models https://github.com/jquesnelle/yarn 5 comments
Meltemi: A Large Language Model for Greek | by Leonvouk | Institute for Language and Speech Processing / Athena RC | Mar, 2024 | Medium https://medium.com/institute-for-language-and-speech-processing/meltemi-a-large-language-model-for-greek-9f5ef1d4a10f 5 comments
Is GPT-3 still King? Introducing GPT-J-6B https://ooshimus.com/is-gpt-3-still-king-introducing-gpt-j-6b 4 comments
Code Interpreter == GPT 4.5 (w/ Simon Willison & Alex Volkov) https://www.latent.space/p/code-interpreter 4 comments
GitHub - simplescaling/s1: s1: Simple test-time scaling https://github.com/simplescaling/s1 3 comments
GitHub - Haiyang-W/TokenFormer: Official Implementation of TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters https://github.com/Haiyang-W/TokenFormer 2 comments
GitHub - mbzuai-nlp/LaMini-LM: LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions https://github.com/mbzuai-nlp/LaMini-LM 1 comment

Linked pages