Stable Diffusion Prompt Interpolation

Stable Diffusion is a powerful AI model that can generate high-quality images from text prompts. One of its most exciting capabilities is prompt interpolation, which allows smooth transitions between different prompts over multiple frames. This opens up creative possibilities for generating seamless AI animations and videos.

In this article, we will showcase prompt interpolation on Stable Diffusion through various examples and code snippets. We will cover:

  • What prompt interpolation is and how it works
  • Setting up the environment to use prompt interpolation
  • Basic interpolation between two prompts
  • Multi-prompt interpolation
  • Advanced techniques like circular walks

What Is Prompt Interpolation?

Prompt interpolation refers to linearly blending multiple text embeddings before feeding them into Stable Diffusion to generate images. By incrementally changing the mixing ratios of different prompts over successive frames, we can create smooth transitions between the corresponding images.

For example, given two prompts A and B, we can instruct Stable Diffusion to:

  • Encode prompt A and prompt B into text embeddings
  • Calculate intermediate embeddings between A and B in small steps
  • Generate an image for each embedding, endpoints included

When these images are stitched together into a video, it results in a morphing effect between the A and B images.
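The stitching step can be done with any video tool. As one illustration, here is a minimal sketch that writes frames to an animated GIF, assuming Pillow is installed and that frames arrive as HxWx3 uint8 NumPy arrays; the red-to-green fade below is a dummy stand-in for generated images:

```python
import numpy as np
from PIL import Image  # Pillow, assumed installed

def frames_to_gif(frames, path, ms_per_frame=100):
    """Stitch a list of HxWx3 uint8 arrays into a looping animated GIF."""
    images = [Image.fromarray(f) for f in frames]
    images[0].save(path, save_all=True, append_images=images[1:],
                   duration=ms_per_frame, loop=0)

# Dummy frames fading from red to green stand in for generated images
frames = [
    np.full((64, 64, 3), [int((1 - r) * 255), int(r * 255), 0], dtype=np.uint8)
    for r in np.linspace(0, 1, 10)
]
frames_to_gif(frames, "morph.gif")
```

For higher-quality output you would typically hand the frames to ffmpeg instead, but a GIF is enough to preview the morph.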

Setting Up the Environment

To follow along with the examples below, you need:

  • The Stable Diffusion model
  • An environment with a GPU to run it, such as Google Colab
  • A library that wraps Stable Diffusion, such as keras-cv

Here is some sample setup code:

# Enable mixed precision for faster inference on supported GPUs
import numpy as np
from tensorflow import keras
import keras_cv

keras.mixed_precision.set_global_policy("mixed_float16")

# Load the Stable Diffusion model (weights download on first use)
model = keras_cv.models.StableDiffusion(img_width=512, img_height=512)

Now we are ready to try prompt interpolation.

Basic Prompt Interpolation

Let’s interpolate between two simple prompts:

prompt1 = "A red flower"
prompt2 = "A green apple"

# Number of interpolation steps
steps = 10

# Encode the prompts into text embeddings
emb1 = model.encode_text(prompt1)
emb2 = model.encode_text(prompt2)

# Linearly interpolate between the embeddings
ratios = np.linspace(0.0, 1.0, steps)
embeds = [(1 - r) * emb1 + r * emb2 for r in ratios]

# Generate one image per blended embedding; a fixed seed keeps the diffusion
# noise constant across frames so the transition stays smooth
images = [model.generate_image(e, batch_size=1, seed=42) for e in embeds]

This generates one image per interpolation step, slowly morphing a red flower into a green apple!
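Plain linear interpolation cuts a straight chord through the embedding space, so midway frames can sit at a lower norm than either endpoint. A common refinement is spherical linear interpolation (slerp), sketched here on small stand-in vectors; note this is an alternative, not what the snippet above does:

```python
import numpy as np

def slerp(a, b, r):
    """Spherical interpolation between vectors a and b for r in [0, 1]."""
    a_n = a / np.linalg.norm(a)
    b_n = b / np.linalg.norm(b)
    omega = np.arccos(np.clip(np.dot(a_n, b_n), -1.0, 1.0))
    if np.isclose(omega, 0.0):
        return (1 - r) * a + r * b  # nearly parallel: fall back to lerp
    return (np.sin((1 - r) * omega) * a + np.sin(r * omega) * b) / np.sin(omega)

a = np.array([1.0, 0.0])
b = np.array([0.0, 1.0])
mid = slerp(a, b, 0.5)
print(round(float(np.linalg.norm(mid)), 6))  # 1.0 (lerp midpoint would be ~0.707)
```

Keeping intermediate embeddings at a consistent norm tends to avoid washed-out middle frames.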

[Image: flower to apple interpolation]

Multi-Prompt Interpolation

We can extend this technique to chain more than two prompts, morphing through them in sequence:

prompts = ["A fantasy landscape", "A fantasy landscape with a castle",
           "A futuristic city"]

# Encode every prompt
embs = [model.encode_text(p) for p in prompts]

# Evenly spaced ratios per segment
steps = 10
ratios = np.linspace(0.0, 1.0, steps)

# Interpolate each consecutive pair of embeddings
interp_embeds = []
for i in range(len(embs) - 1):
    for r in ratios:
        interp_embeds.append((1 - r) * embs[i] + r * embs[i + 1])

# Generate a frame for each blended embedding
images = [model.generate_image(e, batch_size=1, seed=42) for e in interp_embeds]

This morphs through all three prompts in sequence: from an open fantasy landscape, past a castle, into a futuristic city!
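One subtlety in the pairwise loop: each segment's final frame (r = 1) duplicates the next segment's first frame (r = 0). A hypothetical helper, sketched here with small stand-in vectors rather than real 77×768 embeddings, drops the duplicates:

```python
import numpy as np

def chain_interpolate(embeddings, steps_per_segment):
    """Blend each consecutive pair of embeddings; only the final segment
    includes r = 1, so segment boundaries are not emitted twice."""
    blended = []
    for i in range(len(embeddings) - 1):
        is_last = i == len(embeddings) - 2
        for r in np.linspace(0.0, 1.0, steps_per_segment, endpoint=is_last):
            blended.append((1 - r) * embeddings[i] + r * embeddings[i + 1])
    return blended

toy_embs = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([1.0, 1.0])]
toy_frames = chain_interpolate(toy_embs, steps_per_segment=5)
print(len(toy_frames))  # 10 frames: 5 per segment, no duplicated boundary
```

Since each generated frame costs a full diffusion run, skipping the duplicate boundary frames saves real time on longer prompt chains.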

Advanced Technique: Circular Walk

We can also hold a single prompt fixed and traverse the latent space in a circular path, discovering the range of images the model associates with that one prompt:

import math
import tensorflow as tf

prompt = "A large galaxy in space"

# Encode the prompt once; it stays fixed for the whole walk
emb = model.encode_text(prompt)

# Two fixed noise tensors span a plane in the diffusion noise space
noise_x = tf.random.normal((512 // 8, 512 // 8, 4), seed=1)
noise_y = tf.random.normal((512 // 8, 512 // 8, 4), seed=2)

# Walk a full circle through that plane
steps = 100
images = []
for t in np.linspace(0.0, 2 * math.pi, steps):
    noise = math.cos(t) * noise_x + math.sin(t) * noise_y
    images.append(model.generate_image(emb, diffusion_noise=noise, batch_size=1))

This circular walk reveals novel interpretations of the initial galaxy prompt!
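The geometry behind such a walk can be sanity-checked in plain NumPy: a point cos(t)·x + sin(t)·y sweeps a closed loop through the plane spanned by x and y, returning exactly to its start at t = 2π. The vectors below are small stand-ins for real noise tensors:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=256)  # stand-ins for the two fixed noise tensors
y = rng.normal(size=256)

def circle_point(t):
    """Point at angle t on the circle spanned by x and y."""
    return np.cos(t) * x + np.sin(t) * y

start = circle_point(0.0)
end = circle_point(2 * np.pi)
print(np.allclose(start, end))  # True: the walk closes on itself
```

Because the path is closed, a video built from the 100 frames loops seamlessly, which is exactly what makes circular walks attractive for animations.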

Conclusion

In this article, we explored several prompt interpolation techniques with Stable Diffusion code examples:

  • Basic linear interpolation between two prompts
  • Multi-prompt morphing across several concepts
  • Circular walks to discover new directions

Prompt interpolation unlocks new creative possibilities for AI video generation. You can build on these foundations to make interpolated music videos, dynamic concept transitions and more. The possibilities are endless!

Let me know in the comments if you have any other questions. Happy prompting!