Multimodal LM roundup: Unified IO 2, inputs and outputs, Gemini, LLaVA-RLHF, and RLHF questions

Linking pages

RLHF learning resources in 2024 - by Nathan Lambert https://www.interconnects.ai/p/rlhf-resources 0 comments

Linked pages

Mobile ALOHA https://mobile-aloha.github.io/ 51 comments
[2401.02385] TinyLlama: An Open-Source Small Language Model https://arxiv.org/abs/2401.02385 44 comments
[2305.06500] InstructBLIP: Towards General-purpose Vision-Language Models with Instruction Tuning https://arxiv.org/abs/2305.06500 5 comments
Introducing CM3leon, a more efficient, state-of-the-art generative model for text and images https://ai.meta.com/blog/generative-ai-text-images-cm3leon/ 3 comments
[2311.12908] Diffusion Model Alignment Using Direct Preference Optimization https://arxiv.org/abs/2311.12908 2 comments
Tom Scott, and the formidable power of escalating streaks https://simonwillison.net/2024/Jan/2/escalating-streaks/ 2 comments
[2304.08485] Visual Instruction Tuning https://arxiv.org/abs/2304.08485 1 comment
Unified-IO 2: Scaling Autoregressive Multimodal Models with Vision, Language, Audio, and Action https://unified-io-2.allenai.org/ 1 comment
[2401.02415] LLaMA Pro: Progressive LLaMA with Block Expansion https://arxiv.org/abs/2401.02415 1 comment
Apple’s LLM gap is real. It might not last much longer. — Joan Westenberg https://joanwestenberg.com/blog/apples-llm-gap-is-real-it-might-not-last-much-longer 1 comment
[2312.00785] Sequential Modeling Enables Scalable Learning for Large Vision Models https://arxiv.org/abs/2312.00785 0 comments
SafetyPrompts.com https://safetyprompts.com/ 0 comments
The Retort AI Podcast | AI is literally the culture war, figuratively speaking https://retortai.com/episodes/ai-is-literally-the-culture-war-figuratively-speaking 0 comments
[2312.17172] Unified-IO 2: Scaling Autoregressive Multimodal Models with Vision, Language, Audio, and Action https://arxiv.org/abs/2312.17172 0 comments