Archive for the month: February 2023

Image tricks with Stable Diffusion and clip-interrogator

When I got Stable Diffusion to work, I really wanted to see what the AI would do on its own: let it describe a given picture and then re-create the picture from that description. How detailed can or should such a description be?

First install the "clip-interrogator":

python3 -m pip install --upgrade pip setuptools wheel
sudo apt install -y rustc cargo
pip install clip-interrogator
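
To check that the package installed cleanly, a short import test like the following should work (this is just a sanity check, not part of the setup itself):

from importlib.metadata import version
from clip_interrogator import Config, Interrogator  # only imported to confirm the package loads

print("clip-interrogator", version("clip-interrogator"), "is ready")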

And here’s the sample Python file. I got an error when I didn’t release the "pipe" before initializing the Interrogator, most likely because both models don’t fit into GPU memory at the same time.

import torch
from torch import autocast
from diffusers import StableDiffusionPipeline
from PIL import Image
from clip_interrogator import Config, Interrogator

print("### Starting Stable Diffusion Pipeline")
pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4")
pipe.to("cuda")  # move the model to the GPU

prompt = "a dark environment, two warriors standing on a chess field with swords drawn"
# prompt = "a laptop sitting on top of a wooden table"
steps = 50
width = 512
height = 512

print("### Creating images with Stable Diffusion")
with autocast("cuda"):
  for i in range(1):
    output = pipe(prompt, width=width, height=height, num_inference_steps=steps)
    image = output["images"][0]
    file = prompt.replace(" ", "_").replace(",", "")  # file name derived from the prompt
    image.save(f"{file}-{i}.png")

pipe = "" # Destroy?
print("### Initializing Interrogator")
ci = Interrogator(Config(clip_model_name="ViT-L-14/openai"))
file = prompt.replace(" ", "_").replace(",", "")  # re-derive the file name used above
for i in range(1):
  print("Loading file ", i)
  image = Image.open(f"{file}-{i}.png").convert('RGB')
  print(ci.interrogate(image))
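
To close the loop from the introduction, the caption that clip-interrogator produces can be fed straight back into Stable Diffusion. Roughly like this (a sketch, assuming the image file generated by the script above exists and there is enough VRAM for one model at a time):

import torch
from torch import autocast
from diffusers import StableDiffusionPipeline
from PIL import Image
from clip_interrogator import Config, Interrogator

# 1) Describe the picture that was generated by the script above.
source = Image.open("a_dark_environment_two_warriors_standing_on_a_chess_field_with_swords_drawn-0.png").convert("RGB")
ci = Interrogator(Config(clip_model_name="ViT-L-14/openai"))
caption = ci.interrogate(source)
print("Caption:", caption)

# 2) Release the interrogator so the diffusion pipeline fits into GPU memory.
del ci
torch.cuda.empty_cache()

# 3) Re-create the picture from its own description.
pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4")
pipe.to("cuda")
with autocast("cuda"):
  recreated = pipe(caption, width=512, height=512, num_inference_steps=50)["images"][0]
recreated.save("recreated.png")

Comparing recreated.png with the original picture gives a feeling for how much detail the description actually preserves.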

Stable Diffusion on WSL on Windows

First, install WSL2 as suggested by the many guides out there. I installed it like this:

wsl --install -d Ubuntu-22.04

Open the installed Ubuntu and install the packages needed for Stable Diffusion itself:

sudo apt update && sudo apt -y upgrade && sudo apt -y install git-lfs python3-pip
pip install torch --extra-index-url https://download.pytorch.org/whl/cu117
pip install diffusers transformers==4.26 scipy ftfy accelerate

Add CUDA support for WSL (the commands below are from NVIDIA's CUDA-on-WSL instructions):

wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.0-1_all.deb
sudo dpkg -i cuda-keyring_1.0-1_all.deb
sudo apt-get update
sudo apt-get -y install cuda
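
Before writing any scripts, it is worth checking that PyTorch can actually see the GPU inside WSL, for example with a few lines like these:

import torch

# Should print True and the name of the NVIDIA GPU if the CUDA setup in WSL worked.
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
  print("Device:", torch.cuda.get_device_name(0))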

Write yourself a small Python file (e.g. main.py) that generates the images and saves them to files named after the prompt:

import torch
from torch import autocast
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4")
pipe.to("cuda")

prompt = "a dark environment, two warriors standing on a chess field with swords drawn"
steps = 50
width = 512
height = 512

with autocast("cuda"):
  for i in range(4):  # generate four images for the same prompt
    output = pipe(prompt, width=width, height=height, num_inference_steps=steps)
    image = output["images"][0]
    file = prompt.replace(" ", "_").replace(",", "")  # file name derived from the prompt
    image.save(f"{file}-{i}.png")
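
Run it with python3 main.py; after a while you should find four PNG files named after the prompt in the current directory.

If the results should be reproducible between runs, the pipeline also accepts a seeded generator. A small variation of the loop above (the seed value 42 is just an example):

# Same loop as above, but seeded: re-running it reproduces the same four images.
generator = torch.Generator(device="cuda").manual_seed(42)
file = prompt.replace(" ", "_").replace(",", "")

with autocast("cuda"):
  for i in range(4):
    output = pipe(prompt, width=width, height=height,
                  num_inference_steps=steps, generator=generator)
    image = output["images"][0]
    image.save(f"{file}-seed42-{i}.png")

Reusing one generator across the four calls still gives four different images, but the whole run becomes repeatable.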