Google announces Gemini 3.5 Live Translate for instant voice-to-voice translation

Google has launched Gemini 3.5 Live Translate, a new AI model offering instant voice-to-voice translation in over 70 languages with low latency. This technology aims to make real-time translation more accessible across Google’s ecosystem and within the Google Translate app. The model, which is part of the Gemini 3.5 family, can maintain natural conversational flow, matching intonation and pacing, and will be available to developers, enterprise customers, and Google Translate users on both Android and iOS.
Google has unveiled Gemini 3.5 Live Translate, a sophisticated AI model designed to provide instant voice-to-voice translation. This new technology extends Google’s long-standing pursuit of real-time translation, which the company has described as a pioneering machine learning endeavor. It broadens availability and reduces latency compared to previous translation tools.
The Gemini 3.5 Live Translate model is specifically tuned for speech-to-speech translation, supporting over 70 languages. It is engineered to detect and translate languages seamlessly, even filtering out background noise in busy settings. Developers can start integrating the public preview through the Gemini Live API or AI Studio.
One of the key advancements is the model's ability to maintain the natural flow of a conversation. Google states that Gemini 3.5 Live Translate is fast enough to keep pace with normal speech, matching the speaker's intonation, pacing, and pitch, resulting in a more lifelike voice output.
This new translation capability is being rolled out across several parts of the Google ecosystem. Select enterprise customers will gain access to it in Google Meet starting this month. Most notably, Gemini 3.5 Live Translate will soon be integrated into the Google Translate app for both Android and iOS users.
The updated Google Translate app will allow users to experience real-time translation with any earbuds, or even without them. Android users will also have a "listening mode" where they can hold their phone to their ear for spoken translations. All audio streams generated by Gemini 3.5 Live Translate will include SynthID watermarks, indicating their AI-generated origin.
