Google’s Genie world model can now simulate real streets with Street View
Google DeepMind has integrated Street View with its Genie world model, enabling realistic simulations of real-world environments for various applications. This integration allows users to interact with simulated streets, adjust conditions like weather, and train AI agents in diverse scenarios.
The feature, launched during the Google I/O developer conference, also lets users explore hypothetical scenarios in the simulated streets. It is intended to serve AI agent and robotics applications as well as human users.
One key application of this integration is robotics training. For example, Genie can simulate rare weather events or environmental conditions, preparing robots for unpredictable real-world scenarios. Human users, meanwhile, can explore different locations under varied conditions, such as seeing New York City streets in the snow, for a dynamic and immersive experience beyond static imagery.
Google has accumulated an extensive Street View dataset over two decades, comprising over 280 billion images across 110 countries. This vast collection of real-world data, combined with Genie's ability to simulate worlds, offers powerful potential for diverse applications, from educational experiences and gaming to advanced robotics training.
While impressive, the technology is still in its experimental phase. The simulations currently offer video game-quality visuals rather than photorealistic ones, and the models are not yet fully physics-aware. However, researchers are actively working on improving accuracy and quality, anticipating significant advancements in the coming months.
