Linking pages
- Vid2Seq: a pretrained visual language model for describing multi-event videos – Google AI Blog https://ai.googleblog.com/2023/03/vid2seq-pretrained-visual-language.html 16 comments
- Google Research, 2022 & beyond: Language, vision and generative models – Google AI Blog https://ai.googleblog.com/2023/01/google-research-2022-beyond-language.html 5 comments
- Google at ICLR 2022 – Google AI Blog https://ai.googleblog.com/2022/04/google-at-iclr-2022.html 0 comments
- Foundation Models and the Future of Multi-Modal AI https://lastweekin.ai/p/multi-modal-ai 0 comments
Linked pages
- Autoregressive model - Wikipedia http://en.wikipedia.org/wiki/Autoregressive_model#Derivation 11 comments
- ICLR 2023 https://iclr.cc 10 comments
- [2109.10852] Pix2seq: A Language Modeling Framework for Object Detection https://arxiv.org/abs/2109.10852 8 comments
- COCO - Common Objects in Context http://cocodataset.org 2 comments
- Generalized Intersection over Union https://giou.stanford.edu 0 comments
- Object detection - Wikipedia https://en.wikipedia.org/wiki/Object_detection 0 comments