[2305.06500] InstructBLIP: Towards General-purpose Vision-Language Models with Instruction Tuning - discu.eu

Reddit

InstructBLIP: Towards General-purpose Vision-Language Models with Instruction Tuning https://arxiv.org/abs/2305.06500 5 comments 14/5/2023 machinelearning

Linking pages

GitHub - WooooDyy/LLM-Agent-Paper-List: The paper list of the 86-page paper "The Rise and Potential of Large Language Model Based Agents: A Survey" by Zhiheng Xi et al. https://github.com/WooooDyy/LLM-Agent-Paper-List 28 comments
GitHub - huggingface/transformers: 🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX. https://github.com/huggingface/transformers 26 comments
GitHub - HenryHZY/Awesome-Multimodal-LLM: Research Trends in LLM-guided Multimodal Learning. https://github.com/HenryHZY/Awesome-Multimodal-LLM 7 comments
Multimodality and Large Multimodal Models (LMMs) https://huyenchip.com/2023/10/10/multimodal.html 0 comments
Multimodal LM roundup: Unified IO 2, inputs and outputs, Gemini, LLaVA-RLHF, and RLHF questions https://www.interconnects.ai/p/multimodal-rlhf 0 comments
