- Google admits it faked the live demonstration video showcasing the prowess of its generative AI model Gemini.
- The company edited the clip to speed up Gemini’s responses and added a conversation between the AI and its human user that did not take place.
- Experts are now questioning whether Gemini is ready for public use.
Just days after unveiling Gemini, its most powerful generative AI model to date, Google has been accused of lying about its performance to users.
According to a Bloomberg opinion piece, the tech giant admitted to manipulating a promotional video showing the strengths and capabilities of its latest large language model (LLM).
Google Misrepresented Gemini’s Multimodal Abilities By Manipulating Demo Video
The six-minute video titled “Hands-on with Gemini: Interacting with multimodal AI” shown during the Gemini announcement earlier this week displayed the remarkable abilities of the AI.
Google touted the LLM’s multimodal capabilities, which allow it to understand spoken conversational prompts and recognize images. Gemini seemed to quickly recognize images, respond to related queries in seconds, and play a cup-and-ball game in which it tracked a wad of paper in real time.
The video also showed Gemini was able to recognize a rubber duck held in a person’s hand within seconds and could tell what a duck is called in different languages.
The demo was impressive because the model appeared to answer by recognizing what was happening and predicting what could happen next. The speed at which Gemini seemed to comprehend what the human was doing, and to keep updating its answers on the fly, was mind-blowing.
But it turns out all that was faked, which is disappointing. The video’s YouTube description read, “For this demo, latency has been reduced, and Gemini outputs have been shortened for brevity.”
Google not only edited the clip to speed up Gemini’s outputs; the voice interaction between the AI and the human user never happened either. The demonstration shown in the video was made using still image frames from the footage, and Gemini was prompted via text rather than responding to, or predicting, drawings or changes of objects on the table in real time.
To make things worse, the video itself carried no disclaimer telling viewers that it had been edited. The episode now raises questions about Gemini’s readiness for public use.
Google Denies Any Wrongdoing and Claims the Video was Made to Inspire Developers
Google denied any wrongdoing and referred to an X post by Oriol Vinyals, the VP of research and deep learning lead at Google DeepMind, the co-creator of Gemini. The post said that all the user prompts and outputs shown in the video are real but shortened for brevity.
He added that the video shows what multimodal user experiences “built with Gemini” could look like, suggesting it was made to “inspire developers”. Vinyals also noted that the team gave Gemini images and text and asked it to respond by predicting what would happen next.
Is Gemini Ready To Compete Against OpenAI’s ChatGPT?
The Bloomberg op-ed said Google was simply showboating to mislead people about the fact that Gemini still lagged behind its arch-rival ChatGPT from OpenAI.
This is not the first time Google has doctored a presentation. At I/O 2018, the company showed a working demo of Google Duplex, its AI voice assistant that mimics a human voice and can make phone calls on the user’s behalf.
While presenting on stage, Alphabet and Google CEO Sundar Pichai used Duplex to make reservations at a salon and restaurant. The demo showed how seamless the model was but it raised further questions about its efficiency.
A few days after I/O, it was found that the Duplex demo was staged, as the calls were pre-recorded. At the event, Pichai implied that the calls were real and that no Google representative had contacted the businesses beforehand to alert them about automated calls from an AI.
It is not out of line to edit demo videos, and many brands do so to build interest and excitement around a new product. The problem here is that Google misled users by misrepresenting Gemini’s capabilities, which is a shame considering the hype around the model and its importance for the industry.
The wisest thing for Google to have done would have been to at least add a disclaimer stating how the video had been manipulated, or to allow journalists and developers to try Gemini in a public beta and test its so-called multimodal performance.