NLP Research in the Era of LLMs - by Sebastian Ruder - discu.eu

Hacker News

NLP Research in the Era of LLMs https://nlpnewsletter.substack.com/p/nlp-research-in-the-era-of-llms 17 comments 22/12/2023

Linked pages

In a Big Network of Computers, Evidence of Machine Learning - The New York Times http://www.nytimes.com/2012/06/26/technology/in-a-big-network-of-computers-evidence-of-machine-learning.html?_r=4&amp%3Bpagewanted=1 617 comments
The Bitter Lesson http://incompleteideas.net/IncIdeas/BitterLesson.html 373 comments
[2304.06035] Choose Your Weapon: Survival Strategies for Depressed AI Academics https://arxiv.org/abs/2304.06035 318 comments
[2305.14314] QLoRA: Efficient Finetuning of Quantized LLMs https://arxiv.org/abs/2305.14314 129 comments
Phi-2: The surprising power of small language models - Microsoft Research https://www.microsoft.com/en-us/research/blog/phi-2-the-surprising-power-of-small-language-models/ 121 comments
[2305.12544] A PhD Student's Perspective on Research in NLP in the Era of Very Large Language Models https://arxiv.org/abs/2305.12544 63 comments
[2305.10403] PaLM 2 Technical Report https://arxiv.org/abs/2305.10403 36 comments
Smerity.com: The compute and data moats are dead http://smerity.com/articles/2018/limited_compute.html 23 comments
[2106.09685] LoRA: Low-Rank Adaptation of Large Language Models https://arxiv.org/abs/2106.09685 8 comments
[2305.18290] Direct Preference Optimization: Your Language Model is Secretly a Reward Model https://arxiv.org/abs/2305.18290 8 comments
[2305.16264] Scaling Data-Constrained Language Models https://arxiv.org/abs/2305.16264 5 comments
[2205.14135] FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness https://arxiv.org/abs/2205.14135 3 comments
Fine-tune Llama 2 with DPO https://huggingface.co/blog/dpo-trl 2 comments
Holistic Evaluation of Language Models (HELM) https://crfm.stanford.edu/helm/latest/ 1 comment
[2310.20633] Defining a New NLP Playground https://arxiv.org/abs/2310.20633 1 comment
[2203.15556] Training Compute-Optimal Large Language Models https://arxiv.org/abs/2203.15556 0 comments
[2001.08361] Scaling Laws for Neural Language Models https://arxiv.org/abs/2001.08361 0 comments
[1802.03268] Efficient Neural Architecture Search via Parameter Sharing https://arxiv.org/abs/1802.03268 0 comments
Requests for Research http://ruder.io/requests-for-research/ 0 comments
[2009.03300] Measuring Massive Multitask Language Understanding https://arxiv.org/abs/2009.03300 0 comments
