A journey through vision-language models: advances, applications, and challenges

Javiera Castillo Navarro

Abstract
Artificial intelligence (AI) is revolutionizing the way we live. However, there is still a long way to go before we can consider AI an intelligence similar to our own. Humans explore the world through our five senses; our perception and learning are inherently multimodal. How can we move towards AI models that learn in a more “human-like” way? Vision-language models (VLMs) represent a big step in this direction. These are models at the intersection of computer vision and natural language processing, capable of modeling images, text, and the relationships between them. Current VLMs allow our computers to understand images, generate descriptions of them, or answer questions about them in a manner similar to humans. In this tutorial we will walk through the history and development of vision-language models and explore existing architectures, training practices, and their various applications.
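To make the description-generation capability mentioned above concrete, here is a minimal image-captioning sketch with an off-the-shelf VLM. It assumes the Hugging Face transformers, Pillow, and requests libraries and the Salesforce/blip-image-captioning-base checkpoint; none of these are prescribed by the tutorial, and any comparable captioning model would do.

```python
# Minimal image-captioning sketch with an off-the-shelf VLM (BLIP).
# Assumes: pip install transformers pillow requests
# The checkpoint and demo image are illustrative choices, not part of the tutorial.
import requests
from PIL import Image
from transformers import BlipProcessor, BlipForConditionalGeneration

# Load the processor (image/text preprocessing) and the captioning model.
processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
model = BlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-base")

# Fetch a demo image (a widely used COCO validation picture).
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw).convert("RGB")

# Encode the image and let the model generate a natural-language description.
inputs = processor(images=image, return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=30)
print(processor.decode(output_ids[0], skip_special_tokens=True))
```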
Date
Dec 14, 2023 12:00 AM
Event
Location
Facultad de Ciencias Físicas y Matemáticas, Universidad de Chile, Santiago, Chile.