Microsoft Releases Visual ChatGPT
Generating and editing images using ChatGPT
Microsoft has introduced a new system, named Visual ChatGPT, which combines ChatGPT with visual foundation models such as Visual Transformers, ControlNet, and Stable Diffusion.
Not only can you import images and generate new ones, but you can also edit them.
The purpose of Visual ChatGPT is simple: you can generate and modify images in a chat format, which creates a different kind of user experience for working with AI-generated images and art.
While we’ve used platforms and apps that are specific to images and art, this merges the concept of chat + image prompting, something that hasn’t really been explored yet.
Check out this demo…
Right now, you have to download Visual ChatGPT from its GitHub repository, but I suspect we will see it in other user interfaces soon, perhaps as a new feature in ChatGPT that is also exposed through its API.
There is also a Hugging Face option if you want to test it.
Here is the system architecture diagram…
For a deep dive, this is the paper that Microsoft released with the model.
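To make the chat + image idea a bit more concrete, here is a minimal sketch of how a chat session might route a message to an image tool. This is not Microsoft's code. The `ImageTool` and `ChatSession` names, the keyword lists, and the file names are all hypothetical, and the two tool functions are stubs standing in for real visual foundation models. Simple keyword matching is used here purely as a placeholder for the tool selection step, which in Visual ChatGPT is handled by ChatGPT itself.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ImageTool:
    """Hypothetical stand-in for one visual foundation model."""
    name: str
    keywords: List[str]
    run: Callable[[str, str], str]  # (current image name, instruction) -> new image name


def generate(image: str, instruction: str) -> str:
    # Stub: a real system would call a text-to-image model here.
    return "generated.png"


def edit(image: str, instruction: str) -> str:
    # Stub: a real system would call an image-editing model here.
    return image.replace(".png", "_edited.png")


TOOLS = [
    ImageTool("generate", ["generate", "draw", "create"], generate),
    ImageTool("edit", ["edit", "replace", "remove", "change"], edit),
]


@dataclass
class ChatSession:
    """Keeps the chat history and the most recent image across turns."""
    current_image: str = ""
    history: List[str] = field(default_factory=list)

    def send(self, message: str) -> str:
        self.history.append(message)
        lowered = message.lower()
        # Assumption: keyword matching replaces the language model's tool choice.
        for tool in TOOLS:
            if any(word in lowered for word in tool.keywords):
                self.current_image = tool.run(self.current_image, message)
                return f"[{tool.name}] -> {self.current_image}"
        return "No image tool matched; answering as plain chat."


if __name__ == "__main__":
    session = ChatSession()
    print(session.send("Please generate a picture of a cat"))
    print(session.send("Now change the cat to a dog"))
```

The point of the sketch is the shape of the loop: the session remembers the latest image, so a follow-up message like "now change the cat to a dog" can act on the result of the previous turn.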
With this feature, I can imagine ChatGPT taking on many more use cases that involve images.
It will be interesting to see if this type of user experience becomes the preferred option for mainstream users… using chat to generate all forms of media.