New Google AI tool Whisk prompts with images

Леонидас

View attachment 34546

You can remix source photos through the Labs experiment, but it struggles with complexity.


Google has added another artificial intelligence tool to its collection. Whisk is an image generator from Google Labs that lets you use an existing image as your prompt. However, its output only preserves the "essence" of your starting image rather than reproducing it with new details. That makes it better suited to brainstorming and rapid-fire visualizations than to editing the source image.

The company describes Whisk as "a new type of creative tool." The input screen opens with a basic interface that has inputs for style and subject. In this simple starting interface you can only choose from three predefined styles: stickers, enamel pins, and plushies. In its current form the experimental tool works best for rough-outline outputs, and I suspect Google found that these three styles lend themselves to that.

As the image above shows, it produced a clean rendering of a Wilford Brimley plushie. (Although Google's terms of service prohibit images of celebrities, Wilford slipped through the gates, Quaker Oats in hand, without setting off the alarm.)

Whisk also includes a more advanced editor, which you open by selecting "Start from scratch" on the main screen. In this mode you can supply text or a source image for each of three categories: subject, scene, and style. There is also a text input field for adding further detail as a finishing touch. However, in their current form the advanced controls did not produce results that were even remotely close to what I had asked for.

For example, here is my attempt to create a lightbox scene depicting the late Mr. Brimley in the style of a walrus plushie image I found online:

View attachment 34545

Whisk spit out what appears to be an actor with a vague resemblance to Wilford Brimley eating oats inside a lightbox frame. As far as I can tell, that person is not a plushie. So it is easy to see why Google recommends the tool for "rapid visual exploration" rather than production-ready content.

Google acknowledges that Whisk will only pull from "a few key characteristics" of your source image. For instance, the company warns that the generated person may have a different height, weight, hairstyle, or skin tone.

To understand why, look no further than Google's explanation of how Whisk works behind the scenes. It uses the Gemini language model to write a detailed caption of the source image you supply. It then feeds that description into the Imagen 3 image generator. As a result, the generated image is based not on the source image itself but on Gemini's description of it.
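The caption-then-generate flow described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not Google's implementation: `remix`, `fake_captioner`, and `fake_generator` are hypothetical names, the stand-in models are plain functions rather than real Gemini or Imagen 3 calls, and appending the style as text is a guess made for the example.

```python
from typing import Callable, Optional


def remix(
    source_image: bytes,
    caption_image: Callable[[bytes], str],
    generate_image: Callable[[str], bytes],
    style: Optional[str] = None,
) -> bytes:
    """Two-stage remix: caption the source, then generate from the caption only."""
    # Stage 1: a captioning model (Gemini, per the post) turns pixels into text.
    caption = caption_image(source_image)
    # Optional style hint appended as plain text (an assumption for illustration).
    prompt = caption if style is None else f"{caption}, in the style of {style}"
    # Stage 2: the generator (Imagen 3, per the post) receives text, never the pixels.
    return generate_image(prompt)


# Hypothetical stand-in models so the sketch runs without any network access.
def fake_captioner(image: bytes) -> str:
    return "an older man with a white mustache holding a bowl of oats"


def fake_generator(prompt: str) -> bytes:
    return f"<image generated from: {prompt}>".encode("utf-8")


if __name__ == "__main__":
    result = remix(b"raw image bytes", fake_captioner, fake_generator, style="plushie")
    print(result.decode("utf-8"))
```

Because the generator only ever sees the caption, any detail the caption leaves out (exact height, hairstyle, skin tone) is lost, which fits the limitation Google describes.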

For now, Whisk is only available in the United States. You can try it on the project's Google Labs website.