Hacker News
- X-Transformers: A fully-featured transformer with experimental features https://github.com/lucidrains/x-transformers 37 comments
- [D] unable to overfit transformer decoder model https://github.com/lucidrains/x-transformers#xval---continuous-and-discrete 3 comments (r/MachineLearning)
Linking pages
- GitHub - lucidrains/imagen-pytorch: Implementation of Imagen, Google's Text-to-Image Neural Network, in Pytorch https://github.com/lucidrains/imagen-pytorch 117 comments
- GitHub - borisdayma/dalle-mini: DALL·E Mini - Generate images from a text prompt https://github.com/borisdayma/dalle-mini 11 comments
- GitHub - neonsecret/stable-diffusion-webui: Stable Diffusion web UI (neonsecret fork) https://github.com/neonsecret/stable-diffusion-webui 8 comments
- GitHub - CompVis/latent-diffusion: High-Resolution Image Synthesis with Latent Diffusion Models https://github.com/CompVis/latent-diffusion 5 comments
- GitHub - CompVis/stable-diffusion: A latent text-to-image diffusion model https://github.com/CompVis/stable-diffusion 4 comments
- GitHub - lucidrains/vit-pytorch: Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in Pytorch https://github.com/lucidrains/vit-pytorch#vision-transformer-for-small-datasets 3 comments
- GitHub - conceptofmind/CaiT-Flax https://github.com/conceptofmind/CaiT-Flax 1 comment
- Rotary Embeddings: A Relative Revolution | EleutherAI Blog https://blog.eleuther.ai/rotary-embeddings/ 1 comment
- GitHub - amrzv/awesome-colab-notebooks: Collection of Google Colaboratory notebooks for fast and easy experiments https://github.com/amrzv/awesome-colab-notebooks 0 comments
- GitHub - sd-webui/stable-diffusion-webui: Stable Diffusion web UI https://github.com/hlky/stable-diffusion-webui 0 comments
- GitHub - hlky/stable-diffusion https://github.com/hlky/stable-diffusion 0 comments
- GitHub - darkhemic/stable-diffusion-cpuonly: a fork that installs and runs on PyTorch, CPU-only https://github.com/darkhemic/stable-diffusion-cpuonly 0 comments
- GitHub - Sygil-Dev/sygil-webui: Stable Diffusion web UI https://github.com/sd-webui/stable-diffusion-webui 0 comments
- ColossalAI/examples/images/diffusion at main · hpcaitech/ColossalAI · GitHub https://github.com/hpcaitech/ColossalAI/tree/main/examples/images/diffusion 0 comments
Linked pages
- GitHub - yandex/YaLM-100B: Pretrained language model with 100B parameters https://github.com/yandex/YaLM-100B 902 comments
- GitHub - deepmind/alphafold: Open source code for AlphaFold. https://github.com/deepmind/alphafold 315 comments
- Pathways Language Model (PaLM): Scaling to 540 Billion Parameters for Breakthrough Performance – Google AI Blog https://ai.googleblog.com/2022/04/pathways-language-model-palm-scaling-to.html 279 comments
- Introducing LLaMA: A foundational, 65-billion-parameter language model https://ai.facebook.com/blog/large-language-model-llama-meta-ai/ 204 comments
- Releasing Persimmon-8B https://www.adept.ai/blog/persimmon-8b 56 comments
- [2112.05682] Self-attention Does Not Need $O(n^2)$ Memory https://arxiv.org/abs/2112.05682 37 comments
- [2212.14034] Cramming: Training a Language Model on a Single GPU in One Day https://arxiv.org/abs/2212.14034 25 comments
- [2307.14995] Scaling TransNormer to 175 Billion Parameters https://arxiv.org/abs/2307.14995 22 comments
- [2109.08668] Primer: Searching for Efficient Transformers for Language Modeling https://arxiv.org/abs/2109.08668 18 comments
- Improving language models by retrieving from trillions of tokens https://deepmind.com/research/publications/2021/improving-language-models-by-retrieving-from-trillions-of-tokens 6 comments
- [2105.13290] CogView: Mastering Text-to-Image Generation via Transformers https://arxiv.org/abs/2105.13290 6 comments
- bigscience/bloom · Hugging Face https://huggingface.co/bigscience/bloom 4 comments
- [2305.19466] The Impact of Positional Encoding on Length Generalization in Transformers https://arxiv.org/abs/2305.19466 4 comments
- [2205.14135] FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness https://arxiv.org/abs/2205.14135 3 comments
- GitHub - HazyResearch/flash-attention: Fast and memory-efficient exact attention https://github.com/HazyResearch/flash-attention 3 comments
- [2305.19268] Intriguing Properties of Quantization at Scale https://arxiv.org/abs/2305.19268 2 comments
- [1911.02150] Fast Transformer Decoding: One Write-Head is All You Need https://arxiv.org/abs/1911.02150 1 comment
- [1910.10683] Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer https://arxiv.org/abs/1910.10683 1 comment
- [2204.02311] PaLM: Scaling Language Modeling with Pathways https://arxiv.org/abs/2204.02311 0 comments
- [1910.05895] Transformers without Tears: Improving the Normalization of Self-Attention https://arxiv.org/abs/1910.05895 0 comments