- [N] EleutherAI announces a 20 billion parameter model, GPT-NeoX-20B, with weights being publicly released next week https://github.com/EleutherAI/gpt-neox 67 comments machinelearning
Linking pages
- GitHub - BlinkDL/RWKV-LM: RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it's combining the best of RNN and transformer - great performance, fast inference, saves VRAM, fast training, "infinite" ctx_len, and free sentence embedding. https://github.com/BlinkDL/RWKV-LM 179 comments
- GitHub - kingoflolz/mesh-transformer-jax: Model parallel transformers in JAX and Haiku https://github.com/kingoflolz/mesh-transformer-jax 146 comments
- GitHub - EleutherAI/gpt-neo: An implementation of model parallel GPT-2 and GPT-3-style models using the mesh-tensorflow library. https://github.com/EleutherAI/gpt-neo/ 127 comments
- Announcing GPT-NeoX-20B | EleutherAI Blog https://blog.eleuther.ai/announcing-20b/ 70 comments
- RWKV: Reinventing RNNs for the Transformer Era — with Eugene Cheah of UIlicious https://www.latent.space/p/rwkv#%C2%A7the-eleuther-mafia 66 comments
- Transformers for software engineers - Made of Bugs https://blog.nelhage.com/post/transformers-for-software-engineers/ 20 comments
- GitHub - mlabonne/llm-course: Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks. https://github.com/mlabonne/llm-course 10 comments
- GitHub - labmlai/neox: Simple Annotated implementation of GPT-NeoX in PyTorch https://github.com/labmlai/neox 8 comments
- GitHub - taishi-i/awesome-ChatGPT-repositories: A curated list of resources dedicated to open source GitHub repositories related to ChatGPT https://github.com/taishi-i/awesome-ChatGPT-repositories 5 comments
- GitHub - tensorchord/Awesome-LLMOps: An awesome & curated list of best LLMOps tools for developers https://github.com/tensorchord/Awesome-LLMOps 5 comments
- Conditional Text Generation by Fine Tuning Gretel GPT https://gretel.ai/blog/conditional-text-generation-by-fine-tuning-gretel-gpt 4 comments
- GitHub - Haiyang-W/TokenFormer: Official Implementation of TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters https://github.com/Haiyang-W/TokenFormer 2 comments
- GitHub - microsoft/DeepSpeed: DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective. https://github.com/microsoft/DeepSpeed 1 comment
- Rotary Embeddings: A Relative Revolution | EleutherAI Blog https://blog.eleuther.ai/rotary-embeddings/ 1 comment
- GitHub - HazyResearch/aisys-building-blocks: Building blocks for foundation models. https://github.com/HazyResearch/aisys-building-blocks 1 comment
- GitHub - amrzv/awesome-colab-notebooks: Collection of google colaboratory notebooks for fast and easy experiments https://github.com/amrzv/awesome-colab-notebooks 0 comments
- Latest News - DeepSpeed https://www.deepspeed.ai/ 0 comments
- EleutherAI Open-Sources 20 Billion Parameter AI Language Model GPT-NeoX-20B https://www.infoq.com/news/2022/04/eleutherai-gpt-neox/ 0 comments
- Transformer Taxonomy (the last lit review) | kipply's blog https://kipp.ly/blog/transformer-taxonomy/ 0 comments
- GitHub - EleutherAI/pythia https://github.com/EleutherAI/pythia 0 comments
Linked pages
- The Pile http://pile.eleuther.ai/ 294 comments
- Cloud Computing Services - Amazon Web Services (AWS) https://aws.amazon.com 280 comments
- Announcing StableCode — Stability AI https://stability.ai/blog/stablecode-llm-generative-ai-coding 107 comments
- JSON Lines https://jsonlines.org/ 88 comments
- [2101.00027] The Pile: An 800GB Dataset of Diverse Text for Language Modeling https://arxiv.org/abs/2101.00027 81 comments
- Stability AI https://stability.ai 69 comments
- Hugging Face – The AI community building the future. https://huggingface.co/ 57 comments
- GitHub - huggingface/tokenizers: 💥 Fast State-of-the-Art Tokenizers optimized for Research and Production https://github.com/huggingface/tokenizers 47 comments
- [2310.10631] Llemma: An Open Language Model For Mathematics https://arxiv.org/abs/2310.10631 46 comments
- Installation Guide — NVIDIA Cloud Native Technologies documentation https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html 31 comments
- Frontier – Oak Ridge Leadership Computing Facility https://www.olcf.ornl.gov/frontier/ 26 comments
- GitHub - huggingface/transformers: 🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX. https://github.com/huggingface/transformers 26 comments
- Weights & Biases – Developer tools for ML https://wandb.ai/site 11 comments
- Summit – Oak Ridge Leadership Computing Facility https://www.olcf.ornl.gov/summit/ 8 comments
- [2202.13169] A Systematic Evaluation of Large Language Models of Code https://arxiv.org/abs/2202.13169 7 comments
- [2304.01373] Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling https://arxiv.org/abs/2304.01373 7 comments
- GitHub - Stability-AI/StableLM: StableLM: Stability AI Language Models https://github.com/Stability-AI/StableLM 4 comments
- GitHub - HazyResearch/flash-attention: Fast and memory-efficient exact attention https://github.com/HazyResearch/flash-attention 3 comments
- CoreWeave — The GPU Cloud https://www.coreweave.com/ 2 comments
- GitHub - microsoft/DeepSpeed: DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective. https://github.com/microsoft/DeepSpeed 1 comment