
Stability AI’s new image generation model: Stable Cascade

Stability AI's new image-generating AI model, Stable Cascade, promises faster and more powerful photo generation. With features like inpainting, outpainting, and canny edge, it's a game-changer in the AI media

Javier Rodriguez
Published February 15, 2024

Stability AI has unveiled its latest image generation model, Stable Cascade, which is expected to outperform its predecessor, Stable Diffusion, in both speed and power. The new model can generate images, produce variations of an existing image, and increase the resolution of existing pictures. It also supports text-to-image editing features: inpainting and outpainting, which let users change only a specific part of an image, and canny edge, which lets users create a new image from the edges of an existing one.

Image generated with Stable Cascade of penguin in cafe
Prompt “Cinematic photo of an anthropomorphic penguin sitting in a cafe reading a book and having a coffee.” Image: Stability AI

This new model is currently available on GitHub for researchers, although it is not yet approved for commercial use. Despite the release of image generation models by tech giants like Google and Apple, Stability AI’s Stable Cascade offers a unique set of features and capabilities.

Unlike Stability’s flagship Stable Diffusion models, Stable Cascade is not a single large model. Instead, it consists of three distinct models built on the Würstchen architecture. The first stage, stage C, compresses text prompts into latents (compact internal representations of the image), which stages A and B then decode into the final image.
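The flow through the three stages can be pictured with a toy sketch. This is not Stability AI's code: the function names, the `Latent` class, and every size and scale factor below are hypothetical placeholders chosen only to show the order in which a prompt moves through stage C and then the two decoding stages. Running stage B before stage A is an assumption based on the upstream project's description, not something stated in this article.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Latent:
    """Stand-in for a latent tensor; only its spatial size is tracked."""
    height: int
    width: int
    source: str  # name of the stage that produced this latent


def stage_c(prompt: str, size: int = 24) -> Latent:
    """Stage C: turn a text prompt into a small, compressed latent.

    The default size is illustrative, not a documented model parameter.
    """
    if not prompt:
        raise ValueError("prompt must not be empty")
    return Latent(size, size, "C")


def stage_b(latent: Latent, scale: int = 4) -> Latent:
    """Stage B: first decoding step, expanding the compact latent."""
    return Latent(latent.height * scale, latent.width * scale, "B")


def stage_a(latent: Latent, scale: int = 4) -> Tuple[int, int]:
    """Stage A: final decoding step, returning pixel dimensions."""
    return (latent.height * scale, latent.width * scale)


def generate(prompt: str) -> Tuple[int, int]:
    # Stage C runs once on the prompt; its output feeds the two decoders.
    return stage_a(stage_b(stage_c(prompt)))


image_size = generate("a penguin reading a book in a cafe")
```

Because the prompt is handled entirely in the small latent produced by the first stage, the more expensive pixel-level work is deferred to the decoders, which is the general idea behind the memory savings described in this article.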

Graphs of inference times for Stable Cascade
Comparison of inference times: Stable Cascade vs. other models. Image: Stability AI

By breaking requests down into smaller components, the model needs less memory and runs faster, while also performing better on prompt alignment and aesthetic quality. It takes only about 10 seconds to create an image, compared with the 22 seconds required by the current SDXL model.
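To put those two timings in perspective, here is a short back-of-the-envelope calculation. It uses only the approximate figures quoted above (about 10 seconds versus 22 seconds); the helper name `speedup` is made up for this illustration, and actual timings will depend on hardware and settings.

```python
def speedup(baseline_seconds: float, new_seconds: float) -> float:
    """Return how many times faster the new timing is than the baseline."""
    if baseline_seconds <= 0 or new_seconds <= 0:
        raise ValueError("timings must be positive")
    return baseline_seconds / new_seconds


sdxl_seconds = 22.0     # approximate SDXL figure quoted in the article
cascade_seconds = 10.0  # approximate Stable Cascade figure quoted in the article

ratio = speedup(sdxl_seconds, cascade_seconds)
time_saved = 1 - cascade_seconds / sdxl_seconds

print(f"{ratio:.1f}x speedup, {time_saved:.0%} less time per image")
# prints "2.2x speedup, 55% less time per image"
```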

Written By
Javier Rodriguez

Javier Rodriguez is a distinguished Spanish journalist renowned for his profound interest in technology and artificial intelligence. With a career spanning several years, Rodriguez has established himself as a leading voice in the tech journalism landscape.