- [R] Research Trends in LLM-guided Multimodal Learning. https://github.com/HenryHZY/Awesome-Multimodal-LLM 7 comments machinelearning
Linked pages
- [2302.14045] Language Is Not All You Need: Aligning Perception with Language Models https://arxiv.org/abs/2302.14045 115 comments
- [2303.16199] LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attention https://arxiv.org/abs/2303.16199 52 comments
- [2305.06500] InstructBLIP: Towards General-purpose Vision-Language Models with Instruction Tuning https://arxiv.org/abs/2305.06500 5 comments
- [2302.00923] Multimodal Chain-of-Thought Reasoning in Language Models https://arxiv.org/abs/2302.00923 3 comments
- [2106.13884] Multimodal Few-Shot Learning with Frozen Language Models https://arxiv.org/abs/2106.13884 2 comments
- GitHub - ZrrSkywalker/LLaMA-Adapter: Fine-tuning LLaMA to follow Instructions within 1 Hour and 1.2M Parameters https://github.com/ZrrSkywalker/LLaMA-Adapter 1 comment
- GitHub - open-mmlab/Multimodal-GPT: Multimodal-GPT https://github.com/open-mmlab/Multimodal-GPT 1 comment
- [2304.08485] Visual Instruction Tuning https://arxiv.org/abs/2304.08485 1 comment
- [2301.12597] BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models https://arxiv.org/abs/2301.12597 0 comments
- LAVIS/projects/blip2 at main · salesforce/LAVIS · GitHub https://github.com/salesforce/LAVIS/tree/main/projects/blip2 0 comments
- GitHub - amazon-science/mm-cot: Official implementation for "Multimodal Chain-of-Thought Reasoning in Language Models" (stay tuned and more will be updated) https://github.com/amazon-science/mm-cot 0 comments
- GitHub - mlfoundations/open_flamingo: An open-source framework for training large multimodal models https://github.com/mlfoundations/open_flamingo 0 comments
- GitHub - OpenGVLab/Ask-Anything: a simple yet interesting tool for chatting about video with chatGPT, miniGPT4 and StableLM https://github.com/OpenGVLab/Ask-Anything 0 comments
- [2305.15023] Cheap and Quick: Efficient Vision-Language Instruction Tuning for Large Language Models https://arxiv.org/abs/2305.15023 0 comments
- GitHub - haotian-liu/LLaVA: Visual Instruction Tuning: Large Language-and-Vision Assistant built towards multimodal GPT-4 level capabilities. https://github.com/haotian-liu/LLaVA 0 comments