LLM Research Papers: The 2024 List - discu.eu

Reddit

[P] Curated list of LLM papers 2024 https://magazine.sebastianraschka.com/p/llm-research-papers-the-2024-list 11 comments 14/12/2024 machinelearning

Linking pages

Noteworthy AI Research Papers of 2024 (Part One) https://magazine.sebastianraschka.com/p/ai-research-papers-2024-part-1 21 comments

Linked pages

[2402.17764] The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits https://arxiv.org/abs/2402.17764 575 comments
[2410.05229] GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models https://arxiv.org/abs/2410.05229 267 comments
[2410.01201] Were RNNs All We Needed? https://arxiv.org/abs/2410.01201 260 comments
[2406.05587] Creativity Has Left the Chat: The Price of Debiasing Language Models https://arxiv.org/abs/2406.05587 251 comments
[2410.21333] Mind Your Step (by Step): Chain-of-Thought can Reduce Performance on Tasks where Thinking Makes Humans Worse https://arxiv.org/abs/2410.21333 250 comments
[2410.05258] Differential Transformer https://arxiv.org/abs/2410.05258 218 comments
[2402.05120] More Agents Is All You Need https://arxiv.org/abs/2402.05120 206 comments
[2407.02678] Reasoning in Large Language Models: A Geometric Perspective https://arxiv.org/abs/2407.02678 170 comments
[2402.04494] Grandmaster-Level Chess Without Search https://arxiv.org/abs/2402.04494 168 comments
[2401.04088] Mixtral of Experts https://arxiv.org/abs/2401.04088 150 comments
[2410.02707] LLMs Know More Than They Show: On the Intrinsic Representation of LLM Hallucinations https://arxiv.org/abs/2410.02707 140 comments
[2404.19737] Better & Faster Large Language Models via Multi-token Prediction https://arxiv.org/abs/2404.19737 132 comments
[2404.14219] Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone https://arxiv.org/abs/2404.14219 130 comments
[2410.00907] Addition is All You Need for Energy-efficient Language Models https://arxiv.org/abs/2410.00907 127 comments
[2403.04732] How Far Are We from Intelligent Visual Deductive Reasoning? https://arxiv.org/abs/2403.04732 118 comments
[2403.05440] Is Cosine-Similarity of Embeddings Really About Similarity? https://arxiv.org/abs/2403.05440 115 comments
[2405.04517] xLSTM: Extended Long Short-Term Memory https://arxiv.org/abs/2405.04517 114 comments
[2404.02258] Mixture-of-Depths: Dynamically allocating compute in transformer-based language models https://arxiv.org/abs/2404.02258 103 comments
[2401.12070] Spotting LLMs With Binoculars: Zero-Shot Detection of Machine-Generated Text https://arxiv.org/abs/2401.12070 99 comments
GitHub - rasbt/LLMs-from-scratch: Implement a ChatGPT-like LLM in PyTorch from scratch, step by step https://github.com/rasbt/LLMs-from-scratch 98 comments

