
[Stable Diffusion] Prompt Sharing and Learning Thread

fr34ky

Active Member
Oct 29, 2017
810
2,161
Hey guys, does anybody know why once you get a CUDA out of memory error, you start to get it every time, even with lower resolutions, etc.?

I mean, sometimes I can create a picture in 1024x1024, and after a couple of generations I start to get that CUDA error and cannot go back to generate pictures in that resolution.

Is there some kind of VRAM reset or something that can be done?


edit:

Sooo... ComfyUI doesn't have a "restore face" feature. For the time being one can either use an external face restorer or grab an equivalent from GitHub. Naturally, rather messy.

Here we have before and after:


The prompt is inside the original file.

Have you tried sending the txt2img result to inpaint and correcting the face at a higher resolution? I mean, instead of face restoration.
 
Last edited:
  • Like
Reactions: Sepheyer and Mr-Fox

Mr-Fox

Well-Known Member
Jan 24, 2020
1,401
3,791
Hey guys, does anybody know why once you get a CUDA out of memory error, you start to get it every time, even with lower resolutions, etc.?

I mean, sometimes I can create a picture in 1024x1024, and after a couple of generations I start to get that CUDA error and cannot go back to generate pictures in that resolution.

Is there some kind of VRAM reset or something that can be done?


Have you tried sending the txt2img result to inpaint and correcting the face at a higher resolution? I mean, instead of face restoration.
I have noticed this also, it seems to be something like a caching issue. I have also been wondering if we can clear this cache somehow. I have not found anything about this yet. There's an article with tips and ideas about fixing the CUDA memory error. One thing is to use the low-VRAM version of SD. I haven't tried it.
Another is to add
"set PYTORCH_CUDA_ALLOC_CONF=garbage_collection_threshold:0.6,max_split_size_mb:64" to webui-user.bat.
I have tried this one and it extends your absolute limit for resolution and hi res multiplier a bit.
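If it helps, this is roughly where that line goes; a sketch of a typical webui-user.bat (your other lines may differ):
Code:
@echo off

set PYTHON=
set GIT=
set VENV_DIR=
set COMMANDLINE_ARGS=
REM added line for less fragmented VRAM allocation
set PYTORCH_CUDA_ALLOC_CONF=garbage_collection_threshold:0.6,max_split_size_mb:64

call webui.bat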
Restarting the webui may also help. You can do this from Settings / Reload UI. Then you can use PNG Info to load the prompt and settings back. Running other things that are demanding on the GPU of course adds to the issue, I'm sure you already understand this. Photoshop is one example, as is other similarly heavy software.
I don't understand why it leads to an error if the vram is "full" though. Why can't it just continue generating and only take longer? :unsure:
 
  • Like
Reactions: Sepheyer and fr34ky

fr34ky

Active Member
Oct 29, 2017
810
2,161
I have noticed this also, it seems to be something like a caching issue. I have also been wondering if we can clear this cache somehow. I have not found anything about this yet. There's an article with tips and ideas about fixing the CUDA memory error. One thing is to use the low-VRAM version of SD. I haven't tried it.
Another is to add
"set PYTORCH_CUDA_ALLOC_CONF=garbage_collection_threshold:0.6,max_split_size_mb:64" to webui-user.bat.
I have tried this one and it extends your absolute limit for resolution and hi res multiplier a bit.
Restarting the webui may also help. You can do this from Settings / Reload UI. Then you can use PNG Info to load the prompt and settings back. Running other things that are demanding on the GPU of course adds to the issue, I'm sure you already understand this. Photoshop is one example, as is other similarly heavy software.
I don't understand why it leads to an error if the vram is "full" though. Why can't it just continue generating and only take longer? :unsure:
Yeah, usually the only solution is to restart. I've read about that parameter you posted before and was thinking it could be related, but I couldn't try it yet.

All this stuff is in a very experimental phase IMO, so there's probably a lot of inefficient use of memory and resources.
 
  • Like
Reactions: Sepheyer and Mr-Fox

Schlongborn

Member
May 4, 2019
350
1,236
I posted in the AI art thread about training a LORA. I used this project: which is like webui, but only for training. I found it easier to use than dreambooth (which also constantly broke my webui installation, so I uninstalled it). It is a dedicated thing just for training, so you can have webui for the art and kohya for the training.

For installing kohya_ss, follow the instructions in the readme: I assume most here will run Windows; there are instructions for Ubuntu too.
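Roughly, the install boils down to something like the following. This is a sketch from memory assuming the bmaltais/kohya_ss repo, and the script names may differ between versions, so trust the readme over this:
Code:
REM sketch only - follow the repo readme for the real steps
git clone https://github.com/bmaltais/kohya_ss.git
cd kohya_ss
REM run the setup script to create the venv and install the requirements
.\setup.bat
REM then launch the GUI and open the local URL it prints (see the log further down)
.\gui.bat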

The most difficult thing was putting together a training set. I just used a series of Daz renders I have: (I am actually too lazy to render more, so I just keep reusing that one for my AI shenanigans since it's kind of complete).

And then I cropped the images into "AI friendly" resolutions, so 512x512, 768x512, 1024x512, etc. I used mostly 512x512 pictures, which is recommended, but it should be OK to use other resolutions, even completely different ones from these "AI friendly" ones.

Kohya expects a certain directory structure for training, like this:
dirstructure.PNG
and then inside of "image" you need to create a directory like so:
sophiadir.PNG

That folder name is important: the 100 means how many times each picture is sampled. So in my case I have 21 pictures and 100 samples (or iterations) per picture, so the entire training will run for 2100 iterations. You can put higher values there if you have fewer images; YouTube said around 1000-2000 iterations is enough.

and in there you can put all the images like so:
sophiaimages.PNG
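In plain text, the layout ends up looking roughly like this (the base path and the image/log/model folders match the training command further down; the picture filenames are just made-up examples):
Code:
D:/AI Training/SophiaCollegeLectureFace/
    image/
        100_SophiaCollegeLectureFace/    <- "100" = repeats per image, the rest is the instance name
            picture01.png                <- 21 images in my case (filenames here are hypothetical)
            picture02.png
            ...
    log/
    model/                               <- the trained LORA (.safetensors) lands here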

And then in kohya, you can caption those images using BLIP like so:
sopiablip.PNG

The prefix there is optional; that string is just inserted at the beginning of each generated .txt file. This is useful because the LORA is then trained to recognize that word, so when I put "SophiaCollegeLectureFace" in my prompt, or negative prompt, I can control my LORA that way. I am not actually sure this really worked, I just used the LORA through webui's "additional networks", which adds something to the prompt anyway. But it didn't hurt either.

OK, once you've run BLIP, the folder should look like this:
sophiacaptions.PNG

I edited all of these files because for my training data they all came out pretty much the same. To be honest, I think my training data was kinda shitty, but it is just an example anyway. I'll attach a zip with my training data and the LORA, so you can take a look at what I used; most of these .txt files contain something like: "SophiaCollegeLectureFace a woman with red hair with glasses and a necklace on her neck".
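So a single caption file (e.g. picture01.txt, a hypothetical name that just has to match its image) ends up containing one line like:
Code:
SophiaCollegeLectureFace a woman with red hair with glasses and a necklace on her neck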

OK, now that everything is set up, you can go into the "Dreambooth LORA" tab and set up the folders like so:
kohyafolders.PNG

and also pick a source model like so:
kohyamodel.PNG

I just picked a v1.5 model. If you want to train a LORA for v2.0 or v2.1, you need to tick the v2 checkbox, and if you want to train for v2.1-768, you also need to tick v_parameterization. Since I use a custom checkpoint, the model quick pick is "custom"; otherwise you can select the base 1.5, 2.0 or 2.1 models there. I just save as safetensors because that's the default.

And then finally you can fiddle with the training parameters:
koyhatrain1.PNG
koyhatrain2.PNG
This is what I used for the basics; the advanced stuff I pretty much left at the defaults.

One important thing is probably the resolution. I don't think I messed with it (I made these screenshots after the fact, so I don't remember exactly), but I wouldn't change it too much either way. 512x512 or 768x768 is probably the way to go with a 1.5 model; maybe 768x768 only really works with v2.1-768, even. It might also depend on your training data: if you only have 768x768 images, then maybe that's the better choice.

The other important one is the "Enable buckets" checkbox. It makes it so your training data is sorted into resolution "buckets". I'm not entirely sure how that works, but it is meant to let you use images with differing resolutions, or resolutions that don't "fit" the model, in your training data without distorting the result. So I'd make sure it is enabled (I believe it costs VRAM though).

And now, you can press the big orange button!
trainbutton.PNG

And your training should start.

Code:
(venv) → D:\AI\kohya_ss [master ≡ +1 ~0 -0 !]› .\webui-user.bat
Already up to date.
Validating that requirements are satisfied.
All requirements satisfied.
Load CSS...
Running on local URL:  http://127.0.0.1:7862

To create a public link, set `share=True` in `launch()`.
Folder 100_SophiaCollegeLectureFace: 2100 steps
max_train_steps = 2100
stop_text_encoder_training = 0
lr_warmup_steps = 210
accelerate launch --num_cpu_threads_per_process=2 "train_network.py" --enable_bucket --pretrained_model_name_or_path="D:/AI Models/comfyui/checkpoints/ProtoGen_X3.4.safetensors" --train_data_dir="D:/AI Training/SophiaCollegeLectureFace/image" --resolution=512,512 --output_dir="D:/AI Training/SophiaCollegeLectureFace/model" --logging_dir="D:/AI Training/SophiaCollegeLectureFace/log" --network_alpha="128" --save_model_as=safetensors --network_module=networks.lora --text_encoder_lr=5e-5 --unet_lr=0.0001 --network_dim=128 --output_name="last" --lr_scheduler_num_cycles="1" --learning_rate="0.0001" --lr_scheduler="constant" --lr_warmup_steps="210" --train_batch_size="1" --max_train_steps="2100" --save_every_n_epochs="1" --mixed_precision="bf16" --save_precision="bf16" --caption_extension=".txt" --cache_latents --optimizer_type="AdamW8bit" --bucket_reso_steps=64 --xformers --bucket_no_upscale
And, once it is all over, you have your model in the "model" folder. Here is my LORA + training data as an example:

You can just copy the .safetensors file into webui/models/lora, and then you can load it in webui with the "additional networks" button. It should insert something into your prompt like <SophiaCollegeLectureFace:1>, where the 1 is a weight. I found 1 is way too much; maybe 0.8 at most, otherwise it resulted in kind of ugly pictures.
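For reference, a prompt using the trained LORA could look something like the line below. This is a hypothetical example using webui's built-in <lora:filename:weight> syntax, where "last" is the file name from --output_name in the command above; the rest of the prompt is made up:
Code:
SophiaCollegeLectureFace, portrait of a woman with red hair and glasses, necklace <lora:last:0.8>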

And if you want a YouTube video instead, use this: that's where I got most of my information.
 
Last edited:

Mr-Fox

Well-Known Member
Jan 24, 2020
1,401
3,791
Yeah, usually the only solution is to restart. I've read about that parameter you posted before and was thinking it could be related, but I couldn't try it yet.

All this stuff is in a very experimental phase IMO, so there's probably a lot of inefficient use of memory and resources.
I'm not actually sure what it's doing. I understand, though, that it sets something related to how memory is used and handled.
It does work, but it only gives a small boost, at least as far as I have seen. I'm a bit "ham fisted" though with my vram usage.
I run the hi res multiplier as high as my GPU will allow atm. If I get an error I usually adjust down by 0.05-0.1 and that fixes it most of the time. It's funny though, sometimes I can reach a higher resolution than other times, meaning I can run the hi res multiplier higher than usual. No idea why.
 
  • Like
Reactions: fr34ky

fr34ky

Active Member
Oct 29, 2017
810
2,161
I'm not actually sure what it's doing. I understand, though, that it sets something related to how memory is used and handled.
It does work, but it only gives a small boost, at least as far as I have seen. I'm a bit "ham fisted" though with my vram usage.
I run the hi res multiplier as high as my GPU will allow atm. If I get an error I usually adjust down by 0.05-0.1 and that fixes it most of the time. It's funny though, sometimes I can reach a higher resolution than other times, meaning I can run the hi res multiplier higher than usual. No idea why.
It's quite random; sometimes, if the picture needs that extra resolution, you just have to insist until it comes out right :ROFLMAO:

I'm curious why you use hi-res fix instead of sending the picture to img2img and then using a higher resolution there.

Is there anything I'm missing with that? (I've read your explanation and tested it.)


I posted in the AI art thread about training a LORA. I used this project: which is like webui, but only for training. I found it easier to use than dreambooth (which also constantly broke my webui installation, so I uninstalled it). It is a dedicated thing just for training, so you can have webui for the art and kohya for the training. [...]
That was a lot of work on that post, man. I've used Kohya to train my LORAs too; it's the only way I could train a LORA, and it works great.
 
  • Like
Reactions: Mr-Fox

Sepheyer

Well-Known Member
Dec 21, 2020
1,520
3,581
Have you tried sending the txt2img result to inpaint and correcting the face at a higher resolution? I mean, instead of face restoration.
I think I came close to it, but not entirely. This girl link is a 2x upscale of a 512x768 original, produced using this link pipeline. The denoise setting was 0.5; I trust this is close enough to inpaint. So, if I didn't know better, the mere 2x upscale result would be acceptable, although GFPGAN takes it to the next level link.
Also, CUI doesn't really support inpaint as easily as WUIA1111 does. In CUI you need to do the mask in a graphics editor, connect it via a node, etc., so the workflow is high touch.

And finally, I have this strongly held belief, although it might be laughably incorrect, that face restoration algos are superior to the alternative workflows, including rendering at a higher resolution. That said, a higher resolution render does make GFPGAN produce an even better result, as the comparison linked above illustrates.


Hence I wanted to explore how I can get proper face restoration, because CUI would be a non-starter without a way to do face restore.
 
  • Like
Reactions: fr34ky and Mr-Fox

Sepheyer

Well-Known Member
Dec 21, 2020
1,520
3,581
Guys, if you want me to pin something into the guides section, let me know. This is an open offer with no expiration. If there are useful posts that are not in the Guides yet, it is only because I either overlooked them or forgot about them. Do ping me and let me know what should be added. I know I am missing a bunch on ControlNet, but I'm out of energy right now to search the thread, my bad!

Untitled.png
 

Sepheyer

Well-Known Member
Dec 21, 2020
1,520
3,581
GodOfPandas got this image link. Great image; my post here is not criticism, god forbid.

Up until about 24 hours ago I didn't know that the grain/staircase artefacts in the girl's face are actually produced by CodeFormer, the face restoration algo that A1111 uses by default.

There are ways to cure it - using GFPGAN.

Option 1
This option is online and is not limited by your card's VRAM. "Option 2" below uses A1111 and is simpler, but it is limited by your card's VRAM.

A super quick tutorial on how to: . And the artefacts are gone once you run GFPGAN on the original image.

00052rs.png

Option 2
As simple as a Sunday morning, but limited by your card's VRAM. The spidergirl blew out my VRAM in A1111, so here it is using a 768x512 image instead:

Untitled.png

Option 3
Use the actual desktop GFPGAN. You merely need to "git clone" it, same as we did with A1111. The major benefit is that it has a light memory footprint, so you can do larger files on your desktop in mere seconds; it chews on much larger files than A1111. I guess that's because on startup A1111 reserves all the memory it needs for the model, leaving very little for anything else. The dedicated face restorers and upscalers do not take the same approach, and thus your memory goes much further. I.e. I kept running out of memory when processing GodOfPandas's image in A1111, but had no issue with the desktop GFPGAN.
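For what it's worth, the steps look roughly like this. This is a sketch assuming the TencentARC/GFPGAN repo; its readme lists a couple of extra dependency installs (basicsr, facexlib) and the model download, so follow that for the exact commands:
Code:
git clone https://github.com/TencentARC/GFPGAN.git
cd GFPGAN
pip install -r requirements.txt
REM -i input folder, -o output folder, -v model version, -s upscale factor
python inference_gfpgan.py -i inputs/whole_imgs -o results -v 1.3 -s 2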

Here is the link:

And here is what the GFPGAN-processed image looks like:

00052_GFPGAN_14.png
 
Last edited:

Mr-Fox

Well-Known Member
Jan 24, 2020
1,401
3,791
It's quite random; sometimes, if the picture needs that extra resolution, you just have to insist until it comes out right :ROFLMAO:

I'm curious why you use hi-res fix instead of sending the picture to img2img and then using a higher resolution there.

Is there anything I'm missing with that? (I've read your explanation and tested it.)




That was a lot of work on that post, man. I've used Kohya to train my LORAs too; it's the only way I could train a LORA, and it works great.
A while back I noticed that a beautiful image had lost all texture and detail after using the upscaler in the extras tab. Since then I have been using hi res fix because it doesn't remove the details. SD is indeed in an experimental state, so it could have been something else that caused it, but for now I'm using hi res fix until I see a better solution.
 
  • Red Heart
Reactions: Sepheyer

Sepheyer

Well-Known Member
Dec 21, 2020
1,520
3,581
Whoa, the power of pipeline-based workflow. The workspace looks like:
Here is the write-up: . Takes no time to run.

full_00036_.png
 
  • Like
Reactions: Mr-Fox

Mr-Fox

Well-Known Member
Jan 24, 2020
1,401
3,791
Original generated image vs. upscaled in the extras tab: the upscale has higher resolution, yes, but the skin is all smoothed out. The texture and fidelity have been lost.
00440-1449331821.png 00085.png

I had a similar issue with the pinup Betty image.
 
  • Like
  • Red Heart
Reactions: Mark17 and Sepheyer

PandaRepublic

Member
May 18, 2018
194
1,952
I need help. I followed the guide in this thread on how to train a LORA, but it does not train. This is what I get when I click "Train Model", and it just stops there. I'm new to this, so I don't know if this screenshot helps at all. Screenshot 2023-03-16 134253.png
 

Schlongborn

Member
May 4, 2019
350
1,236
I need help. I followed the guide in this thread on how to train a LORA, but it does not train. This is what I get when I click "Train Model", and it just stops there. I'm new to this, so I don't know if this screenshot helps at all.
Screenshot 2023-03-16 134253.png
The screenshot helps, it contains the error:
ValueError: bf16 mixed precision requires PyTorch >= 1.10 and a supported device.

So that means you either don't have PyTorch >= 1.10 installed (you can check with: pip show torch), or your GPU does not support bf16 mixed precision. What GPU do you have?
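A quick way to check both from the same venv (assuming torch.cuda.is_bf16_supported() is available in your PyTorch build):
Code:
pip show torch
REM prints the torch version, whether CUDA is visible, and whether the GPU supports bf16
python -c "import torch; print(torch.__version__, torch.cuda.is_available(), torch.cuda.is_bf16_supported())"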
 
Last edited:

PandaRepublic

Member
May 18, 2018
194
1,952
The screenshot helps, it contains the error:
ValueError: bf16 mixed precision requires PyTorch >= 1.10 and a supported device.

So that means you either don't have PyTorch >= 1.10 installed (you can check with: pip show torch), or your GPU does not support bf16 mixed precision. What GPU do you have?

See also:
I have a 2060 Super. It's probably because I don't have PyTorch.
 
Last edited:
  • Like
Reactions: Jimwalrus