Google’s new AI tool Whisk uses images as prompts

Google has yet another artificial intelligence tool to add to the pile. Whisk is a Google Labs image generator that lets you use an existing image as your prompt. But its output only captures the “essence” of the initial image rather than recreating it with new details. So, it is better suited to brainstorming and quick visualizations than to editing the source image.

The company describes Whisk as “a new kind of creative tool.” The input screen starts with a simple interface containing inputs for style and subject. This introductory interface lets you choose from three preset styles: sticker, enamel pin, and plush. I suspect Google found that those three allow for the kind of approximate output the demo tool is best suited to in its current form.

As you can see in the image above, I generated a solid image of a Wilford Brimley plush. (Google’s terms prohibit pictures of celebrities, but Wilford snuck through the gates, carrying Quaker Oats, without alerting the guards.)

Whisk also includes a more advanced editor (found by clicking “Start from scratch” on the home screen). In this mode, you can use text or a source image in three categories: subject, scene, and style. There is also an input bar for adding more text as finishing touches. However, in its current form, the advanced controls did not produce results that looked anything like my queries.

For example, check out my attempt to create the late Mr. Brimley in a stylized scene in the style of a stuffed walrus portrait I found online:

Screenshot of an AI generating tool producing images of a man who somewhat resembles Wilford Brimley. — Google/Screenshot by Will Shanklin for Engadget

Whisk spat out what looks like a vaguely Wilford Brimley-like actor eating oatmeal inside a streamlined box frame. As far as I can tell, the result isn’t plushie-like, either. So, it’s clear why Google recommends using the tool more for “quick visual exploration” and less for production-ready content.

Google acknowledges that Whisk will only draw from “some basic properties” of your source image. “For example, the created subject may have a different height, weight, hairstyle, or skin tone,” the company warns.

To understand why, look no further than Google’s description of how Whisk works under the hood. It uses the Gemini language model to write a detailed caption of the source image you upload. It then feeds that description into the Imagen 3 image generator. So, the result is an image based on Gemini’s words about your image, not on the source image itself.
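
To make that caption-then-generate flow concrete, here is a minimal Python sketch. It is not Google’s code and calls no real Gemini or Imagen API: `SourceImage`, `caption_image`, `generate_image`, and `whisk_like_pipeline` are hypothetical stand-ins, and the caption logic is deliberately simplistic. It only illustrates why details that never make it into the caption cannot show up in the output.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceImage:
    """Hypothetical stand-in for an uploaded image."""
    subject: str      # what a captioner would put into words
    hairstyle: str    # a detail this toy captioner leaves out
    height_cm: int    # another detail it leaves out
    pixels: bytes     # raw image data; never passed to the generator


def caption_image(image: SourceImage) -> str:
    """Hypothetical captioning step (the role Gemini plays in Whisk)."""
    # Assumption: the caption keeps only the subject, nothing else.
    return f"a {image.subject}"


def generate_image(prompt: str) -> str:
    """Hypothetical text-to-image step (the role Imagen 3 plays in Whisk)."""
    # Returns a placeholder string instead of real image data.
    return f"<image generated from prompt: {prompt!r}>"


def whisk_like_pipeline(image: SourceImage, style: str) -> str:
    """Caption the source image, then generate from the caption alone."""
    caption = caption_image(image)
    prompt = f"{caption}, in the style of {style}"
    return generate_image(prompt)


if __name__ == "__main__":
    photo_a = SourceImage("man with a mustache", "short", 175, b"\x00\x01")
    photo_b = SourceImage("man with a mustache", "long", 190, b"\xff\xfe")
    # Both photos yield the same prompt, so the generator cannot tell them apart.
    print(whisk_like_pipeline(photo_a, "plush toy"))
    print(whisk_like_pipeline(photo_a, "plush toy")
          == whisk_like_pipeline(photo_b, "plush toy"))
```

In this toy version, two different photos of the same subject produce identical prompts, which mirrors Google’s warning that the generated subject may differ in height, hairstyle, or other traits.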

Whisk is only available in the US, at least for now. You can try it on the project’s Google Labs site.
