Stable Cascade, Stability AI's newest image-generating model, is expected to outperform its industry-leading predecessor, Stable Diffusion, which forms the foundation of numerous other text-to-image AI products, in terms of speed and power.
Stable Cascade can create images, produce variations of an image it has already generated, or attempt to increase the resolution of an existing image. Additional text-to-image editing tools include Canny edge detection, which lets users create a new image from only the edges of an existing one, and inpainting and outpainting, which let the model fill in or extend only a specified portion of an image.
The new model adds more possibilities even as companies like Google and even Apple develop their own image-generating models. It is available on GitHub for researchers, but not for commercial use.
Unlike Stability's flagship Stable Diffusion models, which run as a single large model, Stable Cascade is made up of three separate models built on the Würstchen architecture. Stage C first compresses text prompts into latents, smaller compressed representations of the request. These latents are then passed to stages A and B, which decode them into the final image.
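The staged flow described above can be sketched as a toy pipeline. Everything here is an illustrative assumption, not the real model: the function bodies, channel counts, and upsampling factors are placeholders chosen only to show the shape of the cascade (a tiny stage-C latent that stages B and A progressively decode into pixels).

```python
import numpy as np

# Toy sketch of a Würstchen-style three-stage cascade.
# All shapes and scale factors below are illustrative assumptions,
# not the real Stable Cascade dimensions.

LATENT = 24  # stage C operates on a very small latent grid

def stage_c(prompt: str) -> np.ndarray:
    """Compress the text prompt into a small latent (stand-in for the real prior)."""
    rng = np.random.default_rng(abs(hash(prompt)) % (2**32))
    return rng.standard_normal((16, LATENT, LATENT))  # 16 x 24 x 24

def stage_b(latent: np.ndarray) -> np.ndarray:
    """Decode the compressed latent into an intermediate, larger latent."""
    up = latent.repeat(4, axis=1).repeat(4, axis=2)  # toy 4x spatial upsampling
    return up[:4]  # 4 x 96 x 96

def stage_a(latent: np.ndarray) -> np.ndarray:
    """Decode the intermediate latent into full-resolution pixels."""
    rgb = latent[:3].repeat(8, axis=1).repeat(8, axis=2)  # toy 8x upsampling
    return rgb  # 3 x 768 x 768

image = stage_a(stage_b(stage_c("a photo of a cat")))
print(image.shape)  # (3, 768, 768)
```

The point of the structure is that the expensive, prompt-conditioned work happens in the tiny stage-C space, while stages B and A only have to learn decoding.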
Breaking requests into smaller compressed pieces lets them execute faster and use less memory (as well as fewer hours of training on those elusive GPUs), while exhibiting improvements in "both prompt alignment and aesthetic quality." In Stability's comparison, Stable Cascade generated an image in roughly 10 seconds, versus 22 seconds for the current SDXL model.
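A rough back-of-the-envelope calculation shows why diffusing in a compressed latent is cheaper. The shapes below are assumptions for illustration (a 1024x1024 RGB image versus a hypothetical 24x24x16 latent); the real model's dimensions may differ, but the principle holds: per-step compute and memory scale with the number of elements the denoiser touches.

```python
# Assumed shapes, for illustration only: a full-resolution RGB image
# versus a small 24x24 latent with 16 channels.
pixel_elements = 1024 * 1024 * 3   # elements per denoising step in pixel space
latent_elements = 24 * 24 * 16     # elements per step in the compressed space

ratio = pixel_elements / latent_elements
print(round(ratio))  # ~341x fewer elements per step in the latent space
```

Fewer elements per denoising step translates directly into faster sampling and lower memory use, which is consistent with the reported 10-second versus 22-second generation times.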
Beyond popularizing the diffusion approach to image generation, Stability AI has been the target of multiple lawsuits alleging that Stable Diffusion was trained on copyrighted material without the consent of the rights holders. Getty Images has filed a claim against Stability AI in the UK, which is set for trial in December. In December, Stability also began charging a subscription fee for commercial licences, which the company justified as necessary to support its research.