DALL·E 2 vs Midjourney vs Stable Diffusion
A comparison of the most popular AI art generation tools
Text-to-image generation has been around for quite some time. It started with the development of generative models such as Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs). You can read more about GANs here.
More broadly, text-to-image models combine two subdomains: Computer Vision (CV) and Natural Language Processing (NLP).
Looking at these models closely, DALL·E 2 is not open to the public, but you can request access to the program from here. Midjourney, on the other hand, offers its service through its Discord channel. Neither of the two is open source, and both are expected to stay that way. Stable Diffusion describes itself as an open-source model, and you can use it through an online workspace as well as Google Colab notebooks. All of these models were trained on a considerable amount of images and text; their inner workings will be discussed in another article. This time, we will give each model the same prompts and compare how it responds.
The prompts below help illustrate the capabilities of each model.
1. Epic style of katsuhiro otomo, wide view, filmed in amazing cinematic light, epic cyberpunk night background, 8k, high resolution, ultrarealistic, photorealistic, intricate, insanely detailed, octane render, unreal engine 5
2. A very detailed surreal photo of a ninja fighting a dragon spitting fire
3. Rainy train station, noir style, 3dsmax + vray render, extremly detailed, ultra realistic, unreal engine 5
4. London, zombie apocalypse, extreme detail, horror
5. A beautiful Sri lankan woman wearing traditional clothes half immerged in the Ganges river looking at the camera with an hypnotizing glare. CANON Eos C300, ƒ4, 15mm, natural lights
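A side-by-side comparison like this one amounts to sending every prompt to every tool and collecting the outputs per prompt. The sketch below shows that loop using two of the prompts from the list. It is only an illustration: the `make_placeholder_backend` helper, the tool keys, and the file-name scheme are hypothetical stand-ins, and no real DALL·E 2, Midjourney, or Stable Diffusion API is called.

```python
from typing import Callable, Dict, List

# A backend takes a prompt and returns a reference to its result (here, a file name).
Backend = Callable[[str], str]


def make_placeholder_backend(tool_name: str) -> Backend:
    """Hypothetical stand-in: returns a file name instead of generating an image."""
    def generate(prompt: str) -> str:
        # Build a short slug from the first four words of the prompt.
        slug = "-".join(prompt.lower().replace(",", "").split()[:4])
        return f"{tool_name}_{slug}.png"
    return generate


def compare(prompts: List[str], backends: Dict[str, Backend]) -> Dict[str, Dict[str, str]]:
    """Run every prompt through every backend and group the results by prompt."""
    return {p: {name: gen(p) for name, gen in backends.items()} for p in prompts}


# Two prompts from the list above, copied verbatim.
PROMPTS = [
    "A very detailed surreal photo of a ninja fighting a dragon spitting fire",
    "London, zombie apocalypse, extreme detail, horror",
]

# Assumed tool keys; swap each placeholder for a real client to run the comparison.
BACKENDS = {
    name: make_placeholder_backend(name)
    for name in ("dalle2", "midjourney", "stable_diffusion")
}

results = compare(PROMPTS, BACKENDS)
for prompt, outputs in results.items():
    print(prompt)
    for tool, path in outputs.items():
        print(f"  {tool}: {path}")
```

Keeping the prompt text identical across tools is the important design choice here: any difference in the results then comes from the model, not from the wording.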
After playing with these three tools, my observation is that DALL·E 2 works well on natural human images, while Midjourney produced rich colors and realistic images in all attempts. Keep in mind, however, that neither of these two models is free. Stable Diffusion has large community support thanks to its open-source nature, so we can expect to see more advances in it in the coming days.
Even today, with the popularity of ChatGPT and similar tools, we can see artists using image-generation tools alongside them to create wonders. Going forward, these kinds of conversational AI models will also challenge popular search engines such as Google.
Digital comics and digital art are turning over a new leaf thanks to these tools, and I am looking forward to seeing what comes next.