Linked pages
- [2306.11644] Textbooks Are All You Need https://arxiv.org/abs/2306.11644 106 comments
- [2307.02486] LongNet: Scaling Transformers to 1,000,000,000 Tokens https://arxiv.org/abs/2307.02486 98 comments
- [2307.01850] Self-Consuming Generative Models Go MAD https://arxiv.org/abs/2307.01850 53 comments
- [2104.09864] RoFormer: Enhanced Transformer with Rotary Position Embedding https://arxiv.org/abs/2104.09864 8 comments
- [2305.18290] Direct Preference Optimization: Your Language Model is Secretly a Reward Model https://arxiv.org/abs/2305.18290 8 comments
- [2302.10866] Hyena Hierarchy: Towards Larger Convolutional Language Models https://arxiv.org/abs/2302.10866 3 comments
- AI Research Highlights In 3 Sentences Or Less (May-June 2023) https://magazine.sebastianraschka.com/p/ai-research-highlights-in-3-sentences-2a1 3 comments
- [2306.15794] HyenaDNA: Long-Range Genomic Sequence Modeling at Single Nucleotide Resolution https://arxiv.org/abs/2306.15794 3 comments
- [2306.11987] Training Transformers with 4-bit Integers https://arxiv.org/abs/2306.11987 2 comments
- [2306.15595] Extending Context Window of Large Language Models via Positional Interpolation https://arxiv.org/abs/2306.15595 1 comment
- [2306.11695] A Simple and Effective Pruning Approach for Large Language Models https://arxiv.org/abs/2306.11695 0 comments
- [2307.03172] Lost in the Middle: How Language Models Use Long Contexts https://arxiv.org/abs/2307.03172 0 comments
- [2306.17563] Large Language Models are Effective Text Rankers with Pairwise Ranking Prompting https://arxiv.org/abs/2306.17563 0 comments
- [2307.01952] SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis https://arxiv.org/abs/2307.01952 0 comments
Title: AI Research Highlights In 3 Sentences Or Less (June–July 2023)