Hacker News
- NLP Research in the Era of LLMs https://nlpnewsletter.substack.com/p/nlp-research-in-the-era-of-llms 17 comments
Linked pages
- In a Big Network of Computers, Evidence of Machine Learning - The New York Times http://www.nytimes.com/2012/06/26/technology/in-a-big-network-of-computers-evidence-of-machine-learning.html?_r=4&%3Bpagewanted=1 617 comments
- The Bitter Lesson http://incompleteideas.net/IncIdeas/BitterLesson.html 366 comments
- [2304.06035] Choose Your Weapon: Survival Strategies for Depressed AI Academics https://arxiv.org/abs/2304.06035 318 comments
- [2305.14314] QLoRA: Efficient Finetuning of Quantized LLMs https://arxiv.org/abs/2305.14314 129 comments
- Phi-2: The surprising power of small language models - Microsoft Research https://www.microsoft.com/en-us/research/blog/phi-2-the-surprising-power-of-small-language-models/ 121 comments
- [2305.12544] A PhD Student's Perspective on Research in NLP in the Era of Very Large Language Models https://arxiv.org/abs/2305.12544 63 comments
- [2305.10403] PaLM 2 Technical Report https://arxiv.org/abs/2305.10403 36 comments
- Smerity.com: The compute and data moats are dead http://smerity.com/articles/2018/limited_compute.html 23 comments
- [2106.09685] LoRA: Low-Rank Adaptation of Large Language Models https://arxiv.org/abs/2106.09685 8 comments
- [2305.18290] Direct Preference Optimization: Your Language Model is Secretly a Reward Model https://arxiv.org/abs/2305.18290 8 comments
- [2305.16264] Scaling Data-Constrained Language Models https://arxiv.org/abs/2305.16264 5 comments
- [2205.14135] FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness https://arxiv.org/abs/2205.14135 3 comments
- Fine-tune Llama 2 with DPO https://huggingface.co/blog/dpo-trl 2 comments
- Holistic Evaluation of Language Models (HELM) https://crfm.stanford.edu/helm/latest/ 1 comment
- [2310.20633] Defining a New NLP Playground https://arxiv.org/abs/2310.20633 1 comment
- [2203.15556] Training Compute-Optimal Large Language Models https://arxiv.org/abs/2203.15556 0 comments
- [2001.08361] Scaling Laws for Neural Language Models https://arxiv.org/abs/2001.08361 0 comments
- [1802.03268] Efficient Neural Architecture Search via Parameter Sharing https://arxiv.org/abs/1802.03268 0 comments
- Requests for Research http://ruder.io/requests-for-research/ 0 comments
- [2009.03300] Measuring Massive Multitask Language Understanding https://arxiv.org/abs/2009.03300 0 comments
Related searches:
Search whole site: site:nlpnewsletter.substack.com
Search title: NLP Research in the Era of LLMs - by Sebastian Ruder