Linking pages
- AI Canon | Andreessen Horowitz https://a16z.com/2023/05/25/ai-canon/ 219 comments
- Open challenges in LLM research https://huyenchip.com/2023/08/16/llm-research-open-challenges.html 72 comments
- Bringing LLM Fine-Tuning and RLHF to Everyone https://argilla.io/blog/argilla-for-llms/ 11 comments
- MLOps guide https://huyenchip.com/mlops/ 3 comments
- Chewing On AI Privacy Scenarios | Drew Breunig https://www.dbreunig.com/2023/05/15/ai-privacy-scenarios.html 0 comments
- Unfortunately, OpenAI and Google have moats https://www.interconnects.ai/p/openai-google-llm-moats 0 comments
- Effective ChatGPT Prompting for software developers https://boliv.substack.com/p/effective-chatgpt-prompting-for-software 0 comments
- ML pipelines for fine-tuning LLMs | Dagster Blog https://dagster.io/blog/finetuning-llms 0 comments
- Multimodality and Large Multimodal Models (LMMs) https://huyenchip.com/2023/10/10/multimodal.html 0 comments
- GitHub - nlpfromscratch/nlp-llms-resources: Master list of curated resources on NLP and LLMs https://github.com/nlpfromscratch/nlp-llms-resources 0 comments
- RLHF learning resources in 2024 - by Nathan Lambert https://www.interconnects.ai/p/rlhf-resources 0 comments
- Unbowed, Unbent, Unbroken – Decoder Only https://decoderonlyblog.wordpress.com/2024/04/19/unbowed-unbent-unbroken/ 0 comments
- Train RLHF Models with Dagster and Modal: Step-by-Step Guide https://kyrylai.com/2024/06/10/rlhf-with-dagster-and-modal/ 0 comments
- A Heuristic Proof of Practical Aligned Superintelligence https://transhumanaxiology.substack.com/p/a-heuristic-proof-of-practical-aligned 0 comments
- What is the difference between a novelist and a language model? https://walfred.substack.com/p/what-is-the-difference-between-a 0 comments
Linked pages
- [2005.14165] Language Models are Few-Shot Learners https://arxiv.org/abs/2005.14165 201 comments
- Data API Terms - Reddit https://www.redditinc.com/policies/data-api-terms 151 comments
- [2110.10819] Shaking the foundations: delusions in sequence models for interaction and control https://arxiv.org/abs/2110.10819 6 comments
- John Schulman - Reinforcement Learning from Human Feedback: Progress and Challenges - YouTube https://www.youtube.com/watch?v=hhiLw5Q_UFg 3 comments
- GitHub - tatsu-lab/stanford_alpaca https://github.com/tatsu-lab/stanford_alpaca 2 comments
- [2211.04325] Will we run out of data? Limits of LLM scaling based on human-generated data https://arxiv.org/abs/2211.04325 1 comment
- [2204.05862] Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback https://arxiv.org/abs/2204.05862 1 comment
- Aligning language models to follow instructions https://openai.com/research/instruction-following 1 comment
- [2203.02155] Training language models to follow instructions with human feedback https://arxiv.org/abs/2203.02155 0 comments
- GitHub - togethercomputer/RedPajama-Data: The RedPajama-Data repository contains code for preparing large datasets for training large language models. https://github.com/togethercomputer/RedPajama-Data 0 comments
- rl-for-llms.md · GitHub https://gist.github.com/yoavg/6bff0fecd65950898eba1bb321cfbd81 0 comments
- [2112.11446] Scaling Language Models: Methods, Analysis & Insights from Training Gopher https://arxiv.org/abs/2112.11446 0 comments
- [2302.13971] LLaMA: Open and Efficient Foundation Language Models https://arxiv.org/abs/2302.13971 0 comments