ZDNET Highlights
- Google Omni aims to do for video what Nano Banana did for images.
- Creators can create videos from text, images, audio, or video.
- AI avatars could help creators, but could raise trust concerns.
Today, Google announced a new AI video capability that will either help creatives make high-quality videos more easily, or significantly increase the amount of AI slop on YouTube. I’m betting it’ll be a mix of both.
Google announced Gemini Omni, a tool that takes AI video creation to a whole new level. The company compared this announcement to the leap in AI image generation that came with the release of Nano Banana.
Nano Banana greatly raised the bar for what was possible with image generation. Omni aims to do the same for video. Omni will be available starting today, but I didn’t get a chance to try it before the announcement.
Google describes Omni as “where Gemini’s ability to reason meets their ability to create.” Interestingly, according to the company, “With Omni, you can combine images, audio, video, and text as inputs and create high-quality videos based on Gemini’s real-world knowledge.”
Although Omni is “starting with video”, Google said the new model can “create anything from any input”, so we’ll likely see other media types generated by the tool in due course.
Omni will be available in several model tiers, starting with Gemini Omni Flash. This capability is coming to the Gemini app, Google Flow, and YouTube Shorts. It’s unclear whether the web version of Gemini will support Omni, or whether you’ll need to use the Flow interface through your browser.
Omni has several standout features that make it an interesting offering.
Clone yourself
I honestly can’t decide if this would be a standout feature, a huge concern for privacy, or an untethered slop generator. The company said you can create videos with your voice “using an avatar, which creates a digital version of you so you can create videos that look and sound just like you.”
As a regular creator of YouTube videos for my channel, I am interested in this. There have been times when I wanted to put up a video, but I was having a bad hair day, my voice was off, or I was in a bad mood, and I didn’t want that to come across in the video.
Can I feed a script to my digital twin avatar and have RoboDev deliver it? Will my audience notice? Will they care? Will they hate it? Will I? Obviously, this is an area worth experimenting with, but it’s probably not something I’ll use often.
I run my YouTube channel, in part, to keep my speaking and presenting skills sharp. Offloading that work to a digital avatar might reduce my workload, but it would also cut into my practice.
Google is careful to say that it is embedding its SynthID digital watermarking technology in these videos, so that they can be verified as having been produced with Omni. Google also said, “Beyond the avatar feature, in terms of editing video to replace audio and speech, we are still working to test it and better understand how we can responsibly bring this capability to users.”
Physics model
Some of you may remember the early days of video games, when characters behaved more like dolls than objects in the physical world. As games got better, they started incorporating physics models, so if something is shot, knocked back, or dropped, it moves the way a real object would.
Omni now incorporates physics into the videos it creates. Google said it has “a better intuitive understanding of forces like gravity, kinetic energy and fluid dynamics.” It uses Gemini’s knowledge to “link language, imagination, and meaning in ways that go far beyond pattern matching.”
The company said Omni can create detailed videos from short prompts, including explainers that break down fairly complex ideas. I have no doubt about it. NotebookLM’s Audio Overview and Video Overview features are remarkably good at analyzing source material and turning it into explainers. If some of that technology makes its way into Omni, things could get interesting fast.
I actually put marketing documents and spec sheets into NotebookLM and it produced explainer videos for various features of my security product that were better than anything I could have done by hand, especially given the time it took. The visuals weren’t great at the time, but having complex features explained in a clean video in less than 30 minutes was a force-multiplier for my product release schedule.
Input diversity
One of the early standout features of Nano Banana was its ability to re-contextualize an image. For example, I took a photo of myself walking in the park and altered it so I was wearing something similar to an admiral’s uniform on the bridge of an aircraft carrier. Although it didn’t get the uniform’s fruit salad and brass exactly right, it managed to reproduce my body and face accurately.
Omni aims to bring this to video by turning image, text, video, or audio inputs into a “coherent output.” Right now, the only audio it will accept is voice recordings, but the company said it will “soon release other types of audio inputs.”
The company also said you can create scenes, match styles, describe what you want in natural language, and keep characters consistent throughout the video.
Conversational editing
One aspect of making videos that I don’t like is the editing process, which is often extremely tiring. Google’s pitch for Omni addresses that directly: “Gemini Omni gives you an easier way to edit video – with natural language. Each instruction builds on the last. Your characters stay consistent, physics hold up and the scene remembers what happened before.”
Google also said that you can change elements in the video. I can see a huge benefit if it’s possible to import a video and have the editor remove obstructions or replace objects and backgrounds. It’s not clear how long a clip can be, or how much editing you can do with Omni on a given plan, but these possibilities are exciting.
The company said Omni can make two other kinds of changes:
- Change specific things, or change everything. Your video becomes the starting point for something you might never have filmed yourself.
- Take a video you shot and ask Omni to change what’s happening. Edit the action, add new characters or objects, or transform a moment into something unexpected.
Additionally, Google has not yet specified the video format or resolution. Will this be a professional tool that can handle 16:9 videos in 4K or 8K resolution, or is it meant to be a tool for generating YouTube Shorts?
When OpenAI introduced Sora, it was a novelty. While users abused it (we gave Sam Altman blue hair and made him sing ZDNET’s praises), it never managed to become a tool that fits into a professional’s workflow.
While creating AI avatar clones and changing objects can be fun, I hope this capability is expanded so that it is usable inside Final Cut, Premiere Pro, and DaVinci Resolve, or at least integrated enough that those tools can use the edits created by Omni.
That is possible. Omni’s features will be made available to enterprise customers and developers through Google APIs.
I’m also curious to know if Omni will embed the little diamond watermark in the corner of its videos, as it does with images generated by Nano Banana. Although it’s nice to know that a clip was produced by AI, this kind of visible watermarking gets in the way of using AI as a professional tool.
Will we see licensing tiers where watermarks can be removed? Or will we see third-party tools that remove watermarks, whether Google wants them removed or not? Only time will tell.
Would you use Google Omni to create your own digital avatar for videos you didn’t want to record in person? Let us know in the comments below.
You can follow my daily project updates on social media. Be sure to subscribe to my weekly update newsletter, and follow me on Twitter/X at @DavidGewirtz, on Facebook at Facebook.com/DavidGewirtz, on Instagram at Instagram.com/DavidGewirtz, on Bluesky at @DavidGewirtz.com, and on YouTube at YouTube.com/DavidGewirtzTV.
