- [P] Research Paper Highlights from May to June 2023 https://magazine.sebastianraschka.com/p/ai-research-highlights-in-3-sentences-2a1 3 comments (r/machinelearning)
Linking pages
- Ahead of AI #11: New Foundation Models https://magazine.sebastianraschka.com/p/ahead-of-ai-11-new-foundation-models 34 comments
- Ahead of AI #10: State of Computer Vision 2023 https://magazine.sebastianraschka.com/p/ahead-of-ai-10-state-of-computer 3 comments
- AI Research Highlights In 3 Sentences Or Less (June-July 2023) https://magazine.sebastianraschka.com/p/ai-research-highlights-in-3-sentences-738 0 comments
- The NeurIPS 2023 LLM Efficiency Challenge Starter Guide - Lightning AI https://lightning.ai/pages/community/tutorial/neurips2023-llm-efficiency-guide/ 0 comments
Linked pages
- [2305.13048] RWKV: Reinventing RNNs for the Transformer Era https://arxiv.org/abs/2305.13048 171 comments
- [2305.14314] QLoRA: Efficient Finetuning of Quantized LLMs https://arxiv.org/abs/2305.14314 129 comments
- [2305.15717] The False Promise of Imitating Proprietary LLMs https://arxiv.org/abs/2305.15717 119 comments
- [2305.11206] LIMA: Less Is More for Alignment https://arxiv.org/abs/2305.11206 44 comments
- [2305.14342] Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training https://arxiv.org/abs/2305.14342 8 comments
- [2305.18290] Direct Preference Optimization: Your Language Model is Secretly a Reward Model https://arxiv.org/abs/2305.18290 8 comments
- AI Research Highlights In 3 Sentences Or Less (April-May 2023) https://magazine.sebastianraschka.com/p/ai-research-highlights-in-3-sentences 4 comments
- [2305.19466] The Impact of Positional Encoding on Length Generalization in Transformers https://arxiv.org/abs/2305.19466 4 comments
- [2305.03047] Principle-Driven Self-Alignment of Language Models from Scratch with Minimal Human Supervision https://arxiv.org/abs/2305.03047 3 comments
- [2306.03078] SpQR: A Sparse-Quantized Representation for Near-Lossless LLM Weight Compression https://arxiv.org/abs/2306.03078 2 comments
- [2305.14201] Goat: Fine-tuned LLaMA Outperforms GPT-4 on Arithmetic Tasks https://arxiv.org/abs/2305.14201 1 comment
- [2305.17333] Fine-Tuning Language Models with Just Forward Passes https://arxiv.org/abs/2305.17333 1 comment
- [2305.15334] Gorilla: Large Language Model Connected with Massive APIs https://arxiv.org/abs/2305.15334 0 comments
- [2305.13230] To Repeat or Not To Repeat: Insights from Scaling LLM under Token-Crisis https://arxiv.org/abs/2305.13230 0 comments
- [2305.19370] Blockwise Parallel Transformer for Long Context Large Models https://arxiv.org/abs/2305.19370 0 comments
- [2306.01116] The RefinedWeb Dataset for Falcon LLM: Outperforming Curated Corpora with Web Data, and Web Data Only https://arxiv.org/abs/2306.01116 0 comments