Google DeepMind at ICML 2024

Latest approaches in generative AI and multimodality

Generative AI technologies and multimodal capabilities are expanding the creative possibilities of digital media.

We’ll present VideoPoet, which uses an LLM to generate state-of-the-art video and audio from multimodal inputs including images, text, audio and other video.

We'll also share Genie (generative interactive environments), which can generate a range of playable environments for training AI agents, based on text prompts, images, photos, or sketches.

Finally, we introduce MagicLens, a novel image retrieval system that uses text instructions to retrieve images with richer relations beyond visual similarity.

Supporting the AI community

We’re proud to sponsor ICML and foster a diverse community in AI and machine learning by supporting initiatives led by Disability in AI, Queer in AI, LatinX in AI and Women in Machine Learning.

If you’re at the conference, visit the Google DeepMind and Google Research booths to meet our teams, see live demos and find out more about our research.
