[Stable Diffusion] Prompt Sharing and Learning Thread

Jimwalrus

Active Member
Sep 15, 2021
902
3,341
Whoops! :LOL: I never actually noticed it's spelled differently. But I also tried "hairy pussy" anyway, and SD really seems to struggle with that.

Why would adding a block of color help the program more than painting the suggested area with inpaint?
It gives SD something to modify, so it doesn't need to conjure it out of nowhere, and you get results closer to what you want. You don't have to be any good at drawing; just a triangle in the colour you want is usually enough. You can also get away with a lower denoising strength, which reduces the risk of eldritch horrors.
 
  • Like
Reactions: Mr-Fox and Sepheyer

Jimwalrus

Active Member
Sep 15, 2021
902
3,341
Whoops! :LOL: I never actually noticed it's spelled differently. But I also tried "hairy pussy" anyway, and SD really seems to struggle with that.

Why would adding a block of color help the program more than painting the suggested area with inpaint?
Careful with "hairy pussy" though - almost all cats are hairy after all!
 
  • Haha
  • Like
Reactions: Sepheyer and Mr-Fox

devilkkw

Member
Mar 17, 2021
305
1,040
After some tries, and some months :oops:, I switched my training entirely to kohya_ss.
I'm experimenting now and the power of LoRA training is really good, and faster: I reach about 2.8 it/s during training, about 3x faster than A1111 textual inversion.

By the way, I read all the tutorials Mr-Fox suggested and started experimenting.
So after some experiments I want to share a few of them and discuss some training techniques.

But first, here is my process for training a shirt, this shirt:
[spoiler: attached image]


1) Dataset

For this LoRA I have 7 images:

5 images are only the shirt in different colours (to make the LoRA more varied)

2 images have a subject wearing the shirt.



2) Caption

After many tests: the caption text is really important, and keeping it similar for every image gives better results. Reusing the caption text in your prompt later in SD also helps you get the result you want.

In my case the caption text is:

cloth1 blue shirt with a cat eating ramen on white background

and the same text with only the shirt colour changed for the other 4 images.

And for the images with a subject:

cloth1 a man wearing black shirt with a cat eating ramen

cloth1 is the word I use to identify this shirt. It may seem useless, but I need to test more.
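For anyone who wants to reproduce this: kohya_ss normally expects the repeat count in the image folder name and one caption .txt per image. A sketch of what that layout could look like for a dataset like this (folder names, file names and the repeat count are example values, not the actual ones used above):

```
train/
  img/
    100_cloth1/          # "100" = repeats per image (example value)
      shirt_blue.png
      shirt_blue.txt     # "cloth1 blue shirt with a cat eating ramen on white background"
      shirt_red.png
      shirt_red.txt      # same caption, only the colour changed
      man_black.png
      man_black.txt      # "cloth1 a man wearing black shirt with a cat eating ramen"
```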

3) Training

100 training steps, 1 epoch, adam8bit optimizer, constant scheduler.
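For context, settings like those map onto kohya's train_network.py roughly as below. This is only a sketch of a typical invocation; the paths, learning rate, network_dim/alpha and resolution are assumed placeholder values, not the ones used for this LoRA:

```bash
accelerate launch train_network.py \
  --pretrained_model_name_or_path "/path/to/personal_checkpoint.safetensors" \
  --train_data_dir "train/img" \
  --output_dir "output" --output_name "cloth1_shirt" \
  --network_module networks.lora --network_dim 32 --network_alpha 16 \
  --resolution "512,512" --train_batch_size 1 \
  --max_train_epochs 1 \
  --optimizer_type AdamW8bit --lr_scheduler constant --learning_rate 1e-4 \
  --caption_extension ".txt" --save_model_as safetensors --mixed_precision fp16
```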

And some test results (prompt: cloth1 a subject wearing black shirt with a cat eating ramen <lora:mylora:1>):
[spoiler: attached images]

4) Error

My mistake was not cleaning the watermark off the original images; something to keep in mind for next time.

Why train clothes? I decided to train clothes because, reading around, I saw people say it's really difficult, so for me starting from something difficult is a way to understand training better.
And what you see here is the best LoRA I've achieved; before this I made some really bad test LoRAs that sometimes ignored my prompt and sometimes did nothing. After many experiments I found better settings that give me the result I want, but doing even better is possible.
Just experimenting and growing my training knowledge.

5) Conclusion and questions

Training on the standard SD v1.5 model is what many tutorials recommend, but it didn't give me good results. I trained all my LoRAs on my personal checkpoint and got better results. From what I've seen, pruned models and models with layer errors are not good for training.
Also about steps and epochs: 100 steps / 1 epoch seems enough and gives me balanced results; 2 epochs came out overtrained.
50 steps / 2 epochs seems the same as 100 steps / 1 epoch. (Am I right?)
I also tested these settings on people:
[spoiler: attached images]

Does she look similar to the original?
Do you have any suggestions for better training on people?
And any suggestions for training a style?
 
  • Red Heart
Reactions: Mr-Fox and Sepheyer

hkennereth

Member
Mar 3, 2019
229
742
Yeah, but objectively: what can A1111 do that ComfyUI can't? (What would I be missing out on?)
I must know that before I can make a decision :D
Since A1111 was the standard UI for all experimental work done with SD for "so long" (those long months, I mean; the whole tech is barely over a year old), some cutting-edge plugins and processes may still come out for A1111 first. But even that is starting to change since the official SD team publicly adopted ComfyUI when SDXL came out.

But as someone who's rarely happy with the results I get from the original prompt generation, and always wants to do a few more steps to get to what I'd call a final image, working with ComfyUI is just a dream come true. Comfy is faster, lighter, and more versatile, and if your process requires taking each image through a dozen steps of improvements, detailing, upscaling, etc. before you're finally happy with it, Comfy will let you set up all those steps ONCE and then just generate good images every time, instead of manually moving through tabs for upscaling, inpainting, more upscaling, and on and on...
 

me3

Member
Dec 31, 2016
316
708
After some tries, and some months :oops:, I switched my training entirely to kohya_ss.
I'm experimenting now and the power of LoRA training is really good, and faster: I reach about 2.8 it/s during training, about 3x faster than A1111 textual inversion.

By the way, I read all the tutorials Mr-Fox suggested and started experimenting.
So after some experiments I want to share a few of them and discuss some training techniques.

But first, here is my process for training a shirt, this shirt:
[spoiler: attached image]


1) Dataset

For this LoRA I have 7 images:

5 images are only the shirt in different colours (to make the LoRA more varied)

2 images have a subject wearing the shirt.



2) Caption

After many tests: the caption text is really important, and keeping it similar for every image gives better results. Reusing the caption text in your prompt later in SD also helps you get the result you want.

In my case the caption text is:

cloth1 blue shirt with a cat eating ramen on white background

and the same text with only the shirt colour changed for the other 4 images.

And for the images with a subject:

cloth1 a man wearing black shirt with a cat eating ramen

cloth1 is the word I use to identify this shirt. It may seem useless, but I need to test more.

3) Training

100 training steps, 1 epoch, adam8bit optimizer, constant scheduler.

And some test results (prompt: cloth1 a subject wearing black shirt with a cat eating ramen <lora:mylora:1>):
[spoiler: attached images]
Your captioning is wrong for the images where the shirt is being worn.
You need to tell the AI that the person is wearing the clothing you're training, not just "a shirt" - i.e. the caption should say the person is wearing your trigger word.
It's probably much better to just use images of people wearing the clothing as well, since it's very unlikely you'll use it in any other situation.
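For example (an illustrative wording, not me3's exact suggestion), the subject images could be captioned along the lines of:

cloth1 a man wearing cloth1 with a cat eating ramen

so the trigger word, rather than the generic "shirt", is what carries the garment.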

4) Error

My mistake was not cleaning the watermark off the original images; something to keep in mind for next time.

Why train clothes? I decided to train clothes because, reading around, I saw people say it's really difficult, so for me starting from something difficult is a way to understand training better.
And what you see here is the best LoRA I've achieved; before this I made some really bad test LoRAs that sometimes ignored my prompt and sometimes did nothing. After many experiments I found better settings that give me the result I want, but doing even better is possible.
Just experimenting and growing my training knowledge.
Clothing shouldn't be hard to train, as it's just a matter of having good images of something that stays the same. The tricky part comes in if there's a TYPE of clothing you want to train that has a lot of variation. In your case it would be very simple, as you want it to learn just one simple t-shirt.
What would be trickier is to train something like Xmas jumpers, where you'd have multiple variations of patterns you'd want the AI to both learn and be able to create variations of by itself.

5) Conclusion and questions

Training on the standard SD v1.5 model is what many tutorials recommend, but it didn't give me good results. I trained all my LoRAs on my personal checkpoint and got better results. From what I've seen, pruned models and models with layer errors are not good for training.
Also about steps and epochs: 100 steps / 1 epoch seems enough and gives me balanced results; 2 epochs came out overtrained.
50 steps / 2 epochs seems the same as 100 steps / 1 epoch. (Am I right?)
Either your LoRA or the model used for the images has a horrible issue with oversaturation. It's very obvious in the image of the girl, but it's pretty clear in the other ones as well. Given the images below as well, it seems to be the model that's performing very badly.

And no, steps (as in image count x number of repeats) and epoch values aren't interchangeable in that way.
50 steps x 10 epochs is in no way the same training as 100 steps x 5 epochs.

There are a lot of things that happen at the end of each epoch: things get "reset/restarted", things get "written", etc.
Not sure how best to explain this...
The more you cycle through your images per epoch, the more detail gets picked up, and that "learned detail" gets used for further learning and more "finetuning" in the next epoch, to either correct wrongs or improve.
While it might not be 100% accurate, "loss" can be considered a representation of the difference between what the AI expected and the actual value.
So lower loss suggests the AI's predictions were closer to what actually happened.

So for most things you want the majority of your total step count to come from images x repeats and a relatively low epoch count. You mostly shouldn't need more than 5-10 epochs, the exception generally being styles.
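To make the step arithmetic concrete, here is a small illustration (a sketch using the usual kohya folder-repeat convention; the numbers are examples, not anyone's actual settings):

```python
# How total optimization steps are usually composed in a kohya-style LoRA run.
num_images = 7      # images in the dataset
repeats = 100       # per-image repeats (the number prefixed to the folder name)
epochs = 1
batch_size = 1

steps_per_epoch = (num_images * repeats) // batch_size
total_steps = steps_per_epoch * epochs
print(steps_per_epoch, total_steps)   # 700 700

# 50 repeats x 2 epochs gives the same *total* (7 * 50 * 2 = 700),
# but not the same training: epoch boundaries are where per-epoch state
# resets and checkpoints get written, so the two runs behave differently.
```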

I also tested these settings on people:
[spoiler: attached images]

Does she look similar to the original?
Do you have any suggestions for better training on people?
And any suggestions for training a style?
Read above about oversaturation.
 

Sharinel

Active Member
Dec 23, 2018
508
2,103
Guys, I have this picture and I wanted to change the private part with inpaint and add public hair. But SD goes totally nuts and does weird stuff:

View attachment 2995125

This is what it generated.. I just added "public hair".

View attachment 2995127

Is it something with my settings?

View attachment 2995129
Yes, there is something wrong with your settings. I know this because I made exactly the same mistake when I started out as well :)

The "Inpaint Area" section needs to have 'whole picture' selected, not 'only masked'. I couldn't get my head round why it didn't work for a while, but basically the prompt at the top of the page applies to the whole picture, and you have then told it that it only needs to apply to the masked section (in the Mask Mode area). That's why you are getting a tiny picture-within-a-picture: it is trying to create a picture of "purple haired girl with huge boobs sitting on the road flashing her vagina" in just the few pixels you have masked.
Actually, what I find works quite well is to delete the prompt entirely and type in what you want to see in the masked area - in this case "shaved vagina" or something similar. You don't need the full prompt in inpaint, just the parts that apply to the section you are inpainting.
 

Microtom

Well-Known Member
Sep 5, 2017
1,072
3,683
Oh man, I'd really love it if someone made a Simpsons SDXL LoRA. SDXL has some knowledge, but not enough. Croc has some nice high-res work that could be used.

ComfyUI_00023_.png
 
  • Like
Reactions: devilkkw and Mr-Fox

Fuchsschweif

Active Member
Sep 24, 2019
961
1,515
Yes, there is something wrong with your settings. I know this because I made exactly the same mistake when I started out as well :)

The "Inpaint Area" section needs to have 'whole picture' selected, not 'only masked'. I couldn't get my head round why it didn't work for a while, but basically the prompt at the top of the page applies to the whole picture, and you have then told it that it only needs to apply to the masked section (in the Mask Mode area). That's why you are getting a tiny picture-within-a-picture: it is trying to create a picture of "purple haired girl with huge boobs sitting on the road flashing her vagina" in just the few pixels you have masked.
Actually, what I find works quite well is to delete the prompt entirely and type in what you want to see in the masked area - in this case "shaved vagina" or something similar. You don't need the full prompt in inpaint, just the parts that apply to the section you are inpainting.

As far as I know, "whole picture" changes the whole picture, while "only masked" restricts the change to the chosen area. I tried taking all the prompts away and writing only a single prompt, but it still never did what I wanted...
 
  • Like
Reactions: Mr-Fox

Sharinel

Active Member
Dec 23, 2018
508
2,103
As far as I know, "whole picture" changes the whole picture, while "only masked" restricts the change to the chosen area. I tried taking all the prompts away and writing only a single prompt, but it still never did what I wanted...
Whole picture does change the entire picture, but since you have selected 'inpaint masked', it only applies the change to the part you have selected.
 

Nano999

Member
Jun 4, 2022
154
69
You can either choose Doggettx or install xformers. I personally prefer xformers. The A1111 wiki isn't consistent with info on this so I'm not 100% sure, but if you want to install xformers you need to edit your webui-user.bat and add --xformers as shown in this screen View attachment 2936666.
It should install xformers automatically, and then you can select xformers in the cross attention optimization menu :)
And as far as I can see you have an earlier version of torch. I have torch 2.0.1 View attachment 2936676. But this topic I'll leave to someone more capable than me; I'm just a noob.
This actually helped!
At first I got a Python error on launch, but the solution was to delete the venv folder and let it redownload.

Now the speed of hires rendering is very fast, maybe even faster than before :ROFLMAO:?
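For anyone following along, the edit described in the quote boils down to adding the flag to the COMMANDLINE_ARGS line in webui-user.bat. A sketch of what the stock Windows launcher file looks like with that change (your other settings may differ):

```
@echo off

set PYTHON=
set GIT=
set VENV_DIR=
set COMMANDLINE_ARGS=--xformers

call webui.bat
```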
 

Jimwalrus

Active Member
Sep 15, 2021
902
3,341
Whole picture does change the entire picture, but since you have selected 'inpaint masked', it only applies the change to the part you have selected.
Exactly. In fact, if you have sufficiently frequent previews showing, you will actually see the changes it wants to make to the whole picture, which are then thrown out right at the end in favour of the original pixels (I panicked the first time, thought it was going to fuck the whole thing up!)

Or, if you have it set to "Only Masked", it will use the prompts for the masked area - so trim the prompts hugely, i.e. just have "(dense pubic hair:1.5)" or similar as your entire prompt.
 

Fuchsschweif

Active Member
Sep 24, 2019
961
1,515
I only want to refine the picture on the left (upscale + more details), but SD not only keeps adding stuff, it also crams many weird pictures into one. Any idea why?

1697123995690.png

1697124011234.png

Here is the PNG for analyzing the settings..

00004-1803002598.png
 

Sharinel

Active Member
Dec 23, 2018
508
2,103
I only want to refine the picture on the left (upscale + more details), but SD not only keeps adding stuff, it also crams many weird pictures into one. Any idea why?

View attachment 3000238

View attachment 3000239

Here is the PNG for analyzing the settings..

View attachment 3000240
Denoising strength. Try putting it down to 0.1 instead of 0.65. The closer to 1 you have the denoising strength, the more it changes the picture. So when you are trying to change something that you have masked, you want 0.65 because you want it to change, but when you want to upscale and not change much, 0.1 is much better.

TBH I don't even use that tab if I want to upscale; I use the Extras tab - much quicker for me
 

hkennereth

Member
Mar 3, 2019
229
742
I only want to refine the picture on the left (upscale + more details), but SD not only keeps adding stuff, it also crams many weird pictures into one. Any idea why?

View attachment 3000238

View attachment 3000239

Here is the PNG for analyzing the settings..

View attachment 3000240
Well, here are a few obvious things to change there. You can't really modify an image with img2img without a prompt, which you don't have. All the negative prompt does is tell Stable Diffusion what NOT to include on the new generated image, but it won't remove stuff from an existing source image, if that was your goal. You need to provide a very clear prompt describing exactly what you want the image to have. For upscaling I would usually just use the same prompt as you had to create the original image, but it can be adjusted if necessary.

As to why you're getting crazy new stuff in the image... it's the lack of a prompt, and also the high Denoising Strength value you have. The higher the denoising, the more you're allowing SD to ignore the source image and pay attention only to the prompt. It does more than that in reality, but for simplicity's sake you can think of denoising in img2img like this: zero will completely ignore the prompt and just copy the original image, 1 will ignore the source image and just follow the prompt. If you want something similar to the source you want to stick with values lower than 0.5. For general upscaling I would recommend staying below 0.3, but with the SD Upscale script it might be necessary to go even lower, like 0.15 to 0.2, to help avoid each chunk of the image generating a new version of the complete image.
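One rough way to picture what that slider does in img2img, in numbers (a simplified sketch of the usual behaviour, not A1111's exact code):

```python
# Simplified view of img2img denoising strength: the source image is noised
# and only the tail end of the sampling schedule is actually re-run.
sampling_steps = 30

for strength in (0.1, 0.3, 0.65, 1.0):
    steps_run = round(sampling_steps * strength)
    print(f"strength {strength}: ~{steps_run}/{sampling_steps} sampling steps are re-run")

# Low strength (e.g. 0.1) keeps the result very close to the source image;
# strength 1.0 is effectively a fresh generation driven only by the prompt.
```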
 

hkennereth

Member
Mar 3, 2019
229
742
TBH I don't even use that tab if I want to upscale; I use the Extras tab - much quicker for me
The Extras tab is good if all you want is a bigger image, but it won't do much more than increase resolution, which is why it is so fast; if, for example, the original image didn't draw the eyes of a small character, the new one won't have them either. If you want more detail in the image you need to re-render it, and img2img is the way to do that.
 
  • Like
Reactions: Mr-Fox

Fuchsschweif

Active Member
Sep 24, 2019
961
1,515
TBH I don't even use that tab if I want to upscale; I use the Extras tab - much quicker for me
Someone wrote here that the Extras tab only upscales without adding generative details, but I first create the pictures at low detail in bulk (for speed) and then upscale the ones I like. I just wanted SD to "finish" the given picture.


You can't really modify an image with img2img without a prompt, which you don't have.
Ah pardon, that was my 2nd attempt. I tried it without because with the prompts I got even wilder results, so I tried to erase them as I only wanted SD to carve the given image out, not do new stuff on top of it.

I will try again later with less denoise, but previously it often worked well with 0.65; SD only changed minor stuff.
 
  • Like
Reactions: Mr-Fox

Jimwalrus

Active Member
Sep 15, 2021
902
3,341
Someone wrote here that the Extras tab only upscales without adding generative details, but I first create the pictures at low detail in bulk (for speed) and then upscale the ones I like. I just wanted SD to "finish" the given picture.




Ah pardon, that was my 2nd attempt. I tried it without because with the prompts I got even wilder results, so I tried to erase them as I only wanted SD to carve the given image out, not do new stuff on top of it.

I will try again later with less denoise, but previously it often worked well with 0.65; SD only changed minor stuff.
Running off lots of non-upscaled images first to get somewhere near (and refine prompts), then upscaling the best, is a good idea. However, the trick is to regenerate them from scratch in txt2img (i.e. same seed, prompts etc.) as well, not just run what you've previously created through Extras or img2img.
It shouldn't take long - it only adds a few seconds to redo that bit of work - but it means it does the whole 'generative within upscaling' thing.

Re your second point, 0.65 is waaay too high for anything in img2img other than maybe converting anime to photorealistic (or vice versa).
0.5 is probably about the limit; 0.2-0.35 is a better recommendation. You seem to have got lucky previously!
 
  • Like
Reactions: Mr-Fox and Sepheyer

Fuchsschweif

Active Member
Sep 24, 2019
961
1,515
Running off lots of non-upscaled images first to get somewhere near (and refine prompts), then upscaling the best, is a good idea. However, the trick is to regenerate them from scratch in txt2img (i.e. same seed, prompts etc.) as well, not just run what you've previously created through Extras or img2img.
It shouldn't take long - it only adds a few seconds to redo that bit of work - but it means it does the whole 'generative within upscaling' thing.

Re your second point, 0.65 is waaay too high for anything in img2img other than maybe converting anime to photorealistic (or vice versa).
0.5 is probably about the limit; 0.2-0.35 is a better recommendation. You seem to have got lucky previously!

This doesn't seem to work for me..

Here's the first pic I got:

00023-27356484.png

Now I wanted to upscale it the way you suggested, still in txt2img. I set denoise to 0.

This is the result:

00029-27356484.png

So suddenly there are extra hands in the background and a lot of details have changed. (Also the quality is ass, given that I chose 4x-UltraSharp :unsure: )

1697137835198.png
 

Fuchsschweif

Active Member
Sep 24, 2019
961
1,515
And again suuuuper weird stuff. Got this one:
00045-1162633955.png

Now I wanted to upscale it. Denoise is set to 0.2 and the same seed of course; this is what's currently in the making:

1697141045593.png


Why do I get so wildly different stuff out?