
ZDNET’s key takeaways

  • Gemini Omni aims to do for video what Nano Banana did for images.
  • Creators can build videos from text, images, audio, or video.
  • AI avatars could help creators, but raise trust concerns.

Today, Google announced a new AI video capability that will either help creatives produce higher-quality videos more easily, or vastly increase the amount of AI slop on YouTube. I’m betting it’ll be a mix of both.

The capability is called Gemini Omni, a tool that Google says raises AI video creation to an entirely new level. The company compared the leap to the improvement in AI image generation that came with the release of Nano Banana.

Nano Banana raised the bar considerably on what was possible with image generation. Omni purports to do the same with video. Omni will be rolling out starting today, but I didn’t have a chance to play with it prior to the announcement.

Google described Omni as “where Gemini’s ability to reason meets the ability to create.” Interestingly, according to the company, “With Omni, you can combine images, audio, video and text as input and generate high-quality videos grounded in Gemini’s real-world knowledge.”

Although Omni is “starting with video,” Google said the new model can “create anything from any input,” so presumably we’ll see the tool generate other media types in due time.

Omni will also be available in model tiers, starting now with Gemini Omni Flash. The capability is coming to the Gemini app, Google Flow, and YouTube Shorts. It’s not clear whether the web version of Gemini will support Omni, or whether you’ll need to use the Flow interface via your browser.

There are some standout features that make this a very interesting offering.

Clone yourself

I honestly can’t decide if this is going to be a standout feature, a very big concern for privacy, or an untethered slop generator. The company said you can create videos “with your own voice by using Avatars, which create a digital version of yourself so you can generate videos that look and sound like you.”

As a regular producer of YouTube videos for my channel, I’m intrigued. There have been times when I wanted to put out a video, but was having a bad hair day, a bad voice day, or a bad attitude day, and I just didn’t want that to come across in video.

Could I just feed a script into my digital twin avatar and have RoboDave do the talking? Would my audience notice? Would they care? Would they hate it? Would I? Clearly that’s an area worthy of experimentation, but it’s probably not something I’ll use often.

I run my YouTube channel, in part, to keep my speaking and presentation chops up. Foisting that work onto a digital avatar might reduce my workload, but it would also reduce my training and practice.

Google is very careful to say that it’s incorporating its SynthID digital fingerprinting technology in these videos, so they can be verified as having been produced with Omni. Google also said, “Beyond the avatar feature, in terms of editing videos to change audio and speech, we are still working to test this and better understand how we can bring this capability to users responsibly.”

Physics model

Some of you may remember the early days of video games, when characters behaved more like ragdolls than objects in the physical world. As games got better, they began to incorporate physics models, so if something got shot, knocked back, or dropped, it moved in a manner consistent with the physics of the object.

Omni now incorporates physics into the videos it creates. Google said it has “an improved intuitive understanding of forces like gravity, kinetic energy, and fluid dynamics.” It also uses Gemini’s knowledge to “connect language, imagery, and meaning in ways that go far beyond pattern matching.”

The company said Omni can build detailed videos from short prompts and can generate videos for things like explainers that break down fairly complex ideas. I don’t doubt this. The analytical capabilities that NotebookLM’s audio overview and video overview bring to creating explainers are astonishing. If some of that technology found its way into Omni, things could get interesting quickly.

I actually fed marketing documents and spec sheets into NotebookLM and it produced explainer videos for various features of my security product that were better than anything I could have done by hand, especially in the time it took. The visuals at the time weren’t great, but having complex features explained in a clean video in under 30 minutes was a force-multiplier for my product release schedule.

Input variety

One of Nano Banana’s early standout features was its ability to recontextualize an image. For example, I had it take a picture of me walking in a park and change it so I was wearing something close to an admiral’s uniform on the bridge of an aircraft carrier. While it didn’t get the uniform fruit salad and brass quite right, it did manage to accurately reproduce my body and face.

Omni proposes to take that to video, turning image, text, video, or audio into a “cohesive output.” Right now, the only audio it will accept is voice recordings, but the company said it’ll “roll out other types of audio inputs soon.”

The company also said you can create scenes, match styles, describe what you want in natural language, and get character consistency throughout the video.

Conversational editing

One aspect of producing videos I do not enjoy is the editing process. It’s often enormously tedious. But, according to Google, “Gemini Omni gives you an easier way to edit video – with natural language. Every instruction builds on the last. Your characters stay consistent, the physics hold up and the scene remembers what came before.”

Google also said you can change elements in the video. I can see a huge benefit if it’s possible to import a video and have the editor remove obstructions or change objects and backgrounds. It’s not clear how long a clip can be, or exactly how much editing you can do with Omni on a given plan, but those possibilities are exciting.

The company described two other transformations Omni can perform:

  • Change specific things, or change everything. Your video becomes the starting point for something you never could have filmed yourself.
  • Take a video you shot and just ask Omni to change what’s happening. Edit the action, add in new characters or objects, or transform a moment into something unexpected.

Additionally, Google hasn’t yet specified video format or resolution. Will this be a professional tool that can handle 16:9 videos in 4K or 8K resolution, or is it meant to be a tool for the YouTube Shorts generation?

When OpenAI introduced Sora, it was a novelty. While users abused it (we gave Sam Altman blue hair and made him sing ZDNET’s praises), it never managed to become a tool that helped a professional’s workflow.

While producing AI avatar clones and replacing objects might be fun, I’m hoping this capability is extended so that it’s usable either inside tools such as Final Cut, Premiere Pro, and DaVinci Resolve, or is at least integrated enough that those tools can use edits created by Omni.

It’s possible. Omni’s features will be rolling out to enterprise customers and developers via a Google API.

I’m also curious whether Omni will embed the little diamond watermark in the corner of its videos, as Google does with Nano Banana’s generated images. While it’s nice to know a clip was generated by AI, such watermarking gets in the way of using the AI as a professional tool.

Will we see licensing tiers where the watermark can be removed? Or will we see third-party tools crop up that remove the watermark, whether Google wants you to or not? Time will tell.

Would you use Gemini Omni to create a digital avatar of yourself for videos you didn’t want to record in person? Let us know in the comments below.


You can follow my day-to-day project updates on social media. Be sure to subscribe to my weekly update newsletter, and follow me on Twitter/X at @DavidGewirtz, on Facebook at Facebook.com/DavidGewirtz, on Instagram at Instagram.com/DavidGewirtz, on Bluesky at @DavidGewirtz.com, and on YouTube at YouTube.com/DavidGewirtzTV.




