Google has unveiled Imagen 2, a much-improved second generation of its image-generating AI model that can create and edit images from a text prompt.
The tool, a rival to OpenAI’s DALL-E and to Midjourney, is now available to Google Cloud customers who use Vertex AI, a cloud-based machine learning platform that provides a workflow for building, training, and deploying machine learning models.
Google Unveils Imagen 2 With Ability to Render Logos and Text in Seven Languages
According to Google, compared to the first-gen Imagen, the latest model has “significantly” improved in terms of image quality and has added new capabilities, such as the ability to render text and logos.
Text and logo generation puts Imagen in the same bracket as other leading image-generation models in the industry, such as DALL-E, Midjourney, and Amazon’s recently launched Titan Image Generator.
However, what makes Imagen stand out from its competitors is that it can render text in multiple languages, including English, Chinese, Hindi, Japanese, Korean, Portuguese, and Spanish. Google plans to add support for more global languages in 2024.
Another feature unique to Imagen 2 is that it can overlay logos onto existing images.
Vishnu Tirumalasetty, the head of generative media products at Google, explained that the image-generating AI model can generate emblems, letter marks, and abstract logos, and also overlay them onto products like clothing, business cards, and other surfaces.
Imagen 2 also has stronger multilingual understanding, which the company says allows the AI model to take a prompt in one language and produce output in another. The model can also translate text contained in logos.
Imagen 2 also makes use of SynthID, a mechanism developed by DeepMind that applies invisible watermarks to images created by the AI. Google says these watermarks can only be detected using a special company-provided tool that is not available to third parties. The tech behemoth claims Imagen 2’s watermarks are resilient to image edits such as compression, filters, and color adjustments.
Google Offers Indemnification Policy to Protect Imagen Users From Copyright Claims
Interestingly, Google has not revealed the data it used to train Imagen 2. The company is playing it safe by staying quiet on the matter, a departure from the first-gen Imagen, for which it disclosed using a version of the public LAION dataset to train the model.
The trouble with LAION is that it could contain objectionable content such as private medical images, copyrighted artwork, and photoshopped celebrity pornography.
Regurgitation, a phenomenon in which a generative AI model reproduces a near-exact copy of an image it was trained on, has become a major concern for corporate customers and creators.
A recent study showed that the first-gen Imagen was not immune to this issue and generated pictures of real people and copyrighted artwork when prompted in a specific manner.
Other developers of image-generating AI models, like Stability AI and OpenAI, allow creators to opt out of training datasets if they wish to do so. Meanwhile, Adobe and Getty Images have established compensation schemes for creators whose work is used to train their AI models.