一歩のはまりから/ipponohamari

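Generating images from text

More AI topics. This time I try stable-diffusion, which generates images from text. It can be used online, but it also runs on a local machine once the environment is set up.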

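Setup

Running git clone on the relevant files from the reference link¹ is all the preparation needed. Any libraries missing in your particular environment have to be installed with pip. For dealing with the various errors, someone has already compiled fixes here², which is worth consulting.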

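Configuration

Rendering is reportedly fast with a GPU, but I do not have one, so the web UI has to be configured to run without it. Edit webui-user.sh: add the options below to COMMANDLINE_ARGS and leave that export line uncommented. The variant with --xformers stays commented out.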

#!/bin/bash
#########################################################
# Uncomment and change the variables below to your need:#
#########################################################

# Install directory without trailing slash
#install_dir="/home/$(whoami)"

# Name of the subdirectory
#clone_dir="stable-diffusion-webui"

# Commandline arguments for webui.py, for example: export COMMANDLINE_ARGS="--medvram --opt-split-attention"
#export COMMANDLINE_ARGS="--skip-torch-cuda-test --precision full --no-half --xformers"
export COMMANDLINE_ARGS="--skip-torch-cuda-test --precision full --no-half"

# python3 executable
#python_cmd="python3"

# git executable
#export GIT="git"
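
As a rough sketch of the same idea, the choice of arguments could be made automatically instead of being edited by hand. The helper name choose_args and the use of nvidia-smi as a GPU check are assumptions for illustration, not part of the web UI itself; having nvidia-smi on PATH does not guarantee a working driver, and the empty argument string for the GPU case is also an assumption. The CPU flags are the ones used in this article.

```shell
# Hypothetical helper (not part of stable-diffusion-webui):
# choose_args <yes|no> prints the argument string for the GPU / no-GPU case.
choose_args() {
    if [ "$1" = "yes" ]; then
        # Assumption: with a usable GPU, no extra flags are required.
        printf '%s\n' ""
    else
        # Flags from this article for running without a GPU.
        printf '%s\n' "--skip-torch-cuda-test --precision full --no-half"
    fi
}

# Assumption: treat the presence of nvidia-smi on PATH as "has an NVIDIA GPU".
if command -v nvidia-smi >/dev/null 2>&1; then
    has_gpu=yes
else
    has_gpu=no
fi

COMMANDLINE_ARGS="$(choose_args "$has_gpu")"
export COMMANDLINE_ARGS
echo "COMMANDLINE_ARGS=$COMMANDLINE_ARGS"
```

Keeping the decision in a small function makes it easy to check both cases without actually having a GPU.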

Running it

When run, the script pulls in the relevant files and the web UI starts listening. Open http://127.0.0.1:7860 in a browser and the UI screen appears.

kaji@trigkey:~/stable-diffusion-webui$ ./webui.sh 

################################################################
Install script for stable-diffusion + Web UI
Tested on Debian 11 (Bullseye)
################################################################

################################################################
Running on kaji user
################################################################

################################################################
Repo already cloned, using it as install directory
################################################################

################################################################
Create and activate python venv
################################################################

################################################################
Launching launch.py...
################################################################
Python 3.9.2 (default, Feb 28 2021, 17:03:44) 
[GCC 10.2.1 20210110]
Commit hash: a9fed7c364061ae6efb37f797b6b522cb3cf7aa2
Installing requirements for Web UI
Launching Web UI with arguments: --precision full --no-half
Warning: caught exception 'Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx', memory monitor disabled
No module 'xformers'. Proceeding without it.
Loading weights [6ce0161689] from /home/kaji/stable-diffusion-webui/models/Stable-diffusion/v1-5-pruned-emaonly.safetensors
Creating model from config: /home/kaji/stable-diffusion-webui/configs/v1-inference.yaml
LatentDiffusion: Running in eps-prediction mode
DiffusionWrapper has 859.52 M params.
Applying cross attention optimization (InvokeAI).
Textual inversion embeddings loaded(0): 
Model loaded in 14.6s (load weights from disk: 0.2s, create model: 1.8s, apply weights to model: 11.8s, move model to device: 0.1s, hijack: 0.2s, load textual inversion embeddings: 0.2s).
Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.
Startup time: 33.1s (import gradio: 3.8s, import ldm: 2.0s, other imports: 3.5s, setup codeformer: 0.1s, load scripts: 0.6s, load SD checkpoint: 14.7s, create ui: 8.1s, gradio launch: 0.2s).
 65%|█████████████████████████████████████████████████████████████████████████▍                                       | 13/20 [09:12<04:54, 42.07s/it]
Total progress:  65%|███████████████████████████████████████████████████████████████                                  | 13/20 [08:23<04:52, 41.79s/it]

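This is the result of entering "chair in the forest" as the prompt and pressing generate.

The environment was built on Win11 + WSL2 (Debian) on the following hardware:

Processor: Intel(R) Celeron(R) N5100 @ 1.10GHz 1.10 GHz
Installed RAM: 16.0 GB (15.8 GB usable)

With no GPU, generating a single image takes about 15 minutes.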
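
The 15-minute figure can be cross-checked against the progress bar in the log, which shows 20 steps at roughly 42 seconds per step (the elapsed 09:12 plus the remaining 04:54 also comes to about 14 minutes). A minimal sketch of that arithmetic follows; the function name estimate_minutes is hypothetical, and the per-step time is rounded to whole seconds because plain shell arithmetic is integer-only.

```shell
# Hypothetical helper: estimate_minutes <steps> <seconds_per_step>
# Prints the estimated sampling time, rounded to the nearest whole minute.
estimate_minutes() {
    steps=$1
    sec_per_step=$2
    total_sec=$(( steps * sec_per_step ))
    # Adding 30 before the integer division rounds to the nearest minute.
    echo $(( (total_sec + 30) / 60 ))
}

# 20 sampling steps at about 42 s/it, as seen in the log above.
estimate_minutes 20 42
```

This covers only the sampling steps; startup and model loading add to the total.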

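Summary

I set up stable-diffusion locally and tried generating an image from text. There seem to be many ways to write the prompt text. With no GPU, one image took about 15 minutes to draw, so I have not gotten as far as exploring how to write prompts. I plan to keep trying out AI environments.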
