Linking pages
- Long-Context Multimodal Understanding No Longer Requires Massive Models: NVIDIA AI Introduces Eagle 2.5, a Generalist Vision-Language Model that Matches GPT-4o on Video Tasks Using Just 8B Parameters - MarkTechPost https://www.marktechpost.com/2025/04/21/long-context-multimodal-understanding-no-longer-requires-massive-models-nvidia-ai-introduces-eagle-2-5-a-generalist-vision-language-model-that-matches-gpt-4o-on-video-tasks-using-just-8b-parameters/ 1 comment
Linked pages
- Terms of use https://openai.com/policies/terms-of-use#:~:text=use%20output%20from%20the%20Services%20to%20develop%20models%20that%20compete%20with%20OpenAI 126 comments
- Cambrian-1: A Fully Open, Vision-Centric Exploration of Multimodal LLMs https://cambrian-mllm.github.io/ 1 comment
- GitHub - haotian-liu/LLaVA: Visual Instruction Tuning: Large Language-and-Vision Assistant built towards multimodal GPT-4 level capabilities. https://github.com/haotian-liu/LLaVA 0 comments
- [2408.15998] Eagle: Exploring The Design Space for Multimodal LLMs with Mixture of Encoders https://arxiv.org/abs/2408.15998 0 comments