Google has launched Gemini Live, its artificial intelligence (AI) voice assistant. It is a paid service that rivals OpenAI’s GPT-4o ‘Advanced Voice Mode’.
Google is also adding a free overlay feature that brings up Gemini while you’re using your phone, a move to counter Apple’s upcoming Siri upgrade.
Google unveiled its new product lineup, including Gemini Live and the new Pixel phone, at the ‘Made by Google 2024’ event on the 13th (local time).
Gemini Live is an enhanced version of Gemini with advanced voice features. It is described as allowing “in-depth voice chats on your phone.”
Google claims that Gemini Live delivers more consistent, emotionally expressive, and realistic conversations thanks to its improved voice engine. Users can interrupt it mid-conversation to ask questions, and it adapts to the user’s speech patterns in real time. It also offers 10 voices to choose from.
It also runs in the Gemini app, so you can keep talking while using other apps or when your phone is locked, and you can pause and resume conversations at any time.
Regarding the model used in Gemini Live, Google said, “We have incorporated new models such as Gemini 1.5 Flash, which provide faster and higher-quality responses.”
As a result, Gemini Live has a larger memory and can hold longer conversations than GPT-4o’s Advanced Voice Mode. GPT-4o’s context window is 128,000 tokens, while Gemini 1.5 Flash’s is 1 million tokens. In theory, it could handle conversations that last several hours.
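As a rough illustration of the “several hours” estimate, the two context-window sizes can be converted into conversation length. This is a back-of-the-envelope sketch and not how either company measures capacity: the rate of 32 tokens per second of speech is an assumption chosen for illustration, and the function name is hypothetical.

```python
def conversation_hours(context_tokens: int, tokens_per_second: float) -> float:
    """Hours of speech that fit in a context window at a given token rate."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return context_tokens / tokens_per_second / 3600


# Assumption for illustration only: one second of speech costs 32 tokens.
ASSUMED_TOKENS_PER_SECOND = 32

# Context-window sizes as reported in the article.
CONTEXT_WINDOWS = {
    "GPT-4o": 128_000,
    "Gemini 1.5 Flash": 1_000_000,
}

for name, window in CONTEXT_WINDOWS.items():
    hours = conversation_hours(window, ASSUMED_TOKENS_PER_SECOND)
    print(f"{name}: about {hours:.1f} hours")
```

Under that assumed rate, the larger window holds roughly eight times as much conversation, whatever the true per-second cost turns out to be.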
On the other hand, it reportedly does not yet have multimodal capabilities, meaning that you cannot hold a conversation while filming your surroundings with the phone camera. Google said that multimodal input will be “launched later this year,” but did not provide specifics.
Gemini Live is also currently available only in English, though Google has announced plans to expand to additional languages in the coming weeks. It will also be available on iOS later this year as an upgraded Gemini app.
It is a paid service that costs $20 per month, like GPT-4o Advanced Voice Mode. The service launched the same day.
But Google will also be adding new free features to Gemini in the coming weeks.
Android users can summon Gemini while using any app and ask it questions about what is on their screen, for instance while watching YouTube. This can be done by pressing the phone’s power button or saying “Hey Google.” Gemini can then generate an image in the overlay.
Google also plans to let Gemini control many of the phone’s functions, including timers, alarms, media controls, the flashlight, volume, Wi-Fi, and Bluetooth.
The feature is a response to Apple’s integration of ‘Apple Intelligence’ into the iPhone 16, which allows the voice assistant ‘Siri’ to understand and process what is happening on the device.
Meanwhile, Google that day unveiled a lineup of new products, including the new Pixel 9 phone and Pixel Watch 3, all of which prominently feature Gemini.
“We’ve reached a tipping point where we believe the advantages of AI-powered assistants far outweigh the challenges,” said Sissie Hsiao, VP of Google Assistant. “We’re still in the early stages of discovering all the ways AI-powered assistants can help, and Gemini will only improve.”
Reporter Im Dae-jun ydj@aitimes.com