Google Gemini AI Video: Google staged parts of the viral duck demo video for Gemini, its GPT-4 competitor. Google admitted that the video, titled "Hands-on with Gemini: Interacting with multimodal AI," was edited to speed up the outputs (which was declared in the video description).
However, there was no voice interaction between the human and the AI.
Rather than having Gemini respond to or predict a drawing or a change in the objects on the table in real time, the demo was made by "using still images from the footage and prompting via text." The video therefore gives the audience a misleading impression of Gemini's capabilities, and it is questionable because it carries no disclaimers about how the inputs were actually made.
“Really happy to see the interest around our ‘Hands-on with Gemini’ video. In our developer blog yesterday, we broke down how Gemini was used to create it,” said Oriol Vinyals, VP of Research & Deep Learning Lead at Google DeepMind and Gemini co-lead, in a post on X.
“We gave Gemini sequences of different modalities — image and text in this case — and had it respond by predicting what might come next. Devs can try similar things when access to Pro opens on 12/13 🚀. The knitting demo used Ultra,” Vinyals added.
“All the user prompts and outputs in the video are real, shortened for brevity. The video illustrates what the multimodal user experiences built with Gemini could look like. We made it to inspire developers,” said Vinyals.
"When you’re building an app, you can get similar results (there’s always some variability with LLMs) by prompting Gemini with an instruction that allows the user to "configure" the behavior of the model, like inputting “you are an expert in science …” before a user can engage in the same kind of back and forth dialogue. Here’s a clip of what this looks like in AI Studio with Gemini Pro. We’ve come a long way since Flamingo 🦩 & PALI, looking forward to seeing what people build with it," the VP further added.
The original viral video shows Gemini narrating an evolving sketch of a duck, from a squiggle to a completed drawing. Gemini remarks that the duck is an unrealistic colour, then exhibits surprise (“What the quack!”) when it sees a blue toy duck. It then answers voice queries about the toy, and the demo moves on to other show-off moves, such as tracking a ball in a cup-switching game, identifying shadow puppet gestures, and reordering sketches of planets.
However, the viral demo wasn’t carried out in real time or by voice. Speaking with Bloomberg Opinion, a Google spokesperson said it was made by “using still image frames from the footage, and prompting via text,” and pointed to a site showing how others could interact with Gemini using photos of their hands, drawings, or other objects. That makes Gemini very different from what Google is trying to suggest, namely “that a person could have a smooth voice conversation with Gemini as it watched and responded in real time to the world around it,” wrote Parmy Olson in an opinion piece for Bloomberg.
It's not the first time Google's demo videos have been questioned. In the past, the tech giant faced doubts about the legitimacy of its Duplex demo, which featured an AI assistant making reservations at hair salons and restaurants.
In that demo, Google Duplex was shown to be able to make restaurant reservations, book hair appointments, and even book travel. After the demonstration, several journalists and experts concluded that it was staged rather than authentic. According to various media reports, the calls and tasks carried out by Google Duplex were considered fake.
The suspicions arose partly from questions about the background noise during the calls, among other things.