[Stable Diffusion] Prompt Sharing and Learning Thread

Jimwalrus

Well-Known Member
Sep 15, 2021
1,045
3,994
'Whole picture' does regenerate the entire picture, but as you have selected 'inpaint masked', it only applies the change to the part that you have selected
Exactly. In fact, if you have sufficiently frequent previews showing, you will actually see the changes it wants to make to the whole picture, which are then thrown out right at the end in favour of the original pixels (I panicked the first time, thought it was going to fuck the whole thing up!)

Or, if you have it set to "Only Masked", it will use the prompts for the masked area only - so trim the prompt right down, e.g. just have "(dense pubic hair:1.5)" or similar as your entire prompt.
 

Fuchsschweif

Well-Known Member
Sep 24, 2019
1,143
1,954
I only want to refine the picture on the left (upscale + more details), but SD not only keeps adding stuff, it also merges many weird pictures into one. Any idea why?

1697123995690.png

1697124011234.png

Here is the PNG for analyzing the settings..

00004-1803002598.png
 

Sharinel

Active Member
Dec 23, 2018
598
2,509
I only want to refine the picture on the left (upscale + more details), but SD not only keeps adding stuff, it also merges many weird pictures into one. Any idea why?

View attachment 3000238

View attachment 3000239

Here is the PNG for analyzing the settings..

View attachment 3000240
Denoising strength. Try putting it down to 0.1 instead of 0.65. The closer to 1 you set the denoising strength, the more it changes the picture. So when you are trying to change something that you have masked, you want 0.65 because you want it to change, but when you want to upscale and not change much, 0.1 is much better.

TBH I don't even use that tab if I want to upscale, I use the extras tab - much quicker for me
 

hkennereth

Member
Mar 3, 2019
237
775
I only want to refine the picture on the left (upscale + more details), but SD not only keeps adding stuff, it also merges many weird pictures into one. Any idea why?

View attachment 3000238

View attachment 3000239

Here is the PNG for analyzing the settings..

View attachment 3000240
Well, here are a few obvious things to change there. You can't really modify an image with img2img without a prompt, which you don't have. All the negative prompt does is tell Stable Diffusion what NOT to include in the newly generated image, but it won't remove stuff from an existing source image, if that was your goal. You need to provide a very clear prompt describing exactly what you want the image to contain. For upscaling I would usually just use the same prompt you used to create the original image, but it can be adjusted if necessary.

As to why you're getting crazy new stuff in the image... it's the lack of a prompt, and also the high Denoising Strength value you have. The higher the denoising, the more you're allowing SD to ignore the source image and pay attention only to the prompt. It does more than that in reality, but for simplicity's sake you can think of denoising in img2img like that: zero will completely ignore the prompt and just copy the original image, 1 will ignore the source image and just follow the prompt. If you want something similar to the source, stick with values lower than 0.5. For general upscaling I would recommend staying below 0.3, but with the SD Upscale script it might be necessary to go even lower, like 0.15 to 0.2, to help avoid each chunk of the image generating a new version of the complete image.
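To make that strength trade-off concrete, here's a minimal img2img sketch using the diffusers library rather than the web UI (the model name and file paths are placeholders, not anyone's actual setup); the `strength` argument plays the same role as the Denoising Strength slider:

```python
# Minimal img2img sketch with diffusers (not the A1111 UI); model name and
# file paths are placeholders - adjust to whatever checkpoint you actually use.
import torch
from diffusers import StableDiffusionImg2ImgPipeline
from PIL import Image

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

source = Image.open("original.png").convert("RGB")
prompt = "the same prompt used to create the original image"

# strength ~0.1-0.3 stays close to the source (good for refining/upscaling);
# strength ~0.65+ lets the prompt dominate and the composition drift.
result = pipe(prompt=prompt, image=source, strength=0.2).images[0]
result.save("refined.png")
```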
 

hkennereth

Member
Mar 3, 2019
237
775
TBH I don't even use that tab if I want to upscale, I use the extras tab - much quicker for me
The Extras tab is good if all you want is a bigger image, but it won't do much more than increase resolution, which is why it is so fast; if, for example, the original image didn't draw the eyes of a small character, the upscaled one won't have them either. If you want more detail in the image you need to re-render it, and img2img is the way to do that.
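As a rough illustration of the difference (a sketch only - the Extras tab uses trained upscaler models such as ESRGAN rather than a plain resize): a pixel-only upscale can't invent detail the original render never drew, whereas an img2img pass like the one sketched above can.

```python
# Pixel-only upscale: more pixels, no new detail. The Extras tab uses trained
# upscaler models (ESRGAN etc.) rather than Lanczos, but the principle is the
# same - it works on the existing pixels instead of re-rendering the image.
from PIL import Image

img = Image.open("original.png")
big = img.resize((img.width * 2, img.height * 2), Image.LANCZOS)
big.save("bigger_but_same_detail.png")
```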
 
  • Like
Reactions: Mr-Fox

Fuchsschweif

Well-Known Member
Sep 24, 2019
1,143
1,954
TBH I don't even use that tab if I want to upscale, I use the extras tab - much quicker for me
Someone wrote here that the Extras tab only upscales without adding generative detail, but my workflow is to first create the pictures at low detail in bulk (for speed) and then upscale the ones I like. I just wanted SD to "finish" the given picture.


You can't really modify an image with img2img without a prompt, which you don't have.
Ah, pardon, that was my 2nd attempt. I tried it without a prompt because with the prompts I got even wilder results, so I erased them, as I only wanted SD to refine the given image, not add new stuff on top of it.

I will try it again later with less denoise, but previously it often worked well with 0.65; SD only changed minor stuff.
 
  • Like
Reactions: Mr-Fox

Jimwalrus

Well-Known Member
Sep 15, 2021
1,045
3,994
Someone wrote here that the Extras tab only upscales without adding generative detail, but my workflow is to first create the pictures at low detail in bulk (for speed) and then upscale the ones I like. I just wanted SD to "finish" the given picture.




Ah, pardon, that was my 2nd attempt. I tried it without a prompt because with the prompts I got even wilder results, so I erased them, as I only wanted SD to refine the given image, not add new stuff on top of it.

I will try it again later with less denoise, but previously it often worked well with 0.65; SD only changed minor stuff.
Running off lots of non-upscaled images first to get somewhere near (and refine prompts), then upscaling the best, is a good approach. However, the trick is to regenerate them from scratch in txt2img (i.e. same seed, prompts etc.), not just to run what you've previously created through Extras or img2img.
It shouldn't take long, it's only going to add a few seconds to redo that bit of work, but it means you get the whole 'generative upscaling' benefit.

Re your second point, 0.65 is waaay too high for anything in img2img other than maybe converting anime to photorealistic (or vice versa).
0.5 is probably about the limit; 0.2-0.35 is a better recommendation. You seem to have just got lucky previously!
 
  • Like
Reactions: Mr-Fox and Sepheyer

Fuchsschweif

Well-Known Member
Sep 24, 2019
1,143
1,954
Running off lots of non-upscaled images first to get somewhere near (and refine prompts), then upscaling the best, is a good approach. However, the trick is to regenerate them from scratch in txt2img (i.e. same seed, prompts etc.), not just to run what you've previously created through Extras or img2img.
It shouldn't take long, it's only going to add a few seconds to redo that bit of work, but it means you get the whole 'generative upscaling' benefit.

Re your second point, 0.65 is waaay too high for anything in img2img other than maybe converting anime to photorealistic (or vice versa).
0.5 is probably about the limit; 0.2-0.35 is a better recommendation. You seem to have just got lucky previously!

This doesn't seem to work for me..

Here's the first pic I got:

00023-27356484.png

Now I wanted to upscale it the way you suggested, still in txt2img. I set denoise to 0.

This is the result:

00029-27356484.png

So suddenly there are extra hands in the back and a lot of details have changed. (also the quality is ass, given that I said 4x ultrasharp :unsure: )

1697137835198.png
 

Fuchsschweif

Well-Known Member
Sep 24, 2019
1,143
1,954
And again suuuuper weird stuff. Got this one:
00045-1162633955.png

Now I wanted to upscale it; denoising is set to 0.2 and same seed of course. This is what's currently in the making:

1697141045593.png


Why do I get so wildly different stuff out?
 

Jimwalrus

Well-Known Member
Sep 15, 2021
1,045
3,994
And again suuuuper weird stuff. Got this one:
View attachment 3001083

Now I wanted to upscale it; denoising is set to 0.2 and same seed of course. This is what's currently in the making:

View attachment 3001084


Why do I get so wildly different stuff out?
Very odd.
Your first one should have only given you the same image, simply upscaled (denoising was set to 0)

The second should only have made very small changes as your denoising was set to 0.02, not the 0.2 you stated.

I'll take a look at the gen data in PNGInfo later this afternoon when I get a chance to fire up my PC, maybe run some tests, see what's going on.
 
  • Like
Reactions: Fuchsschweif

Jimwalrus

Well-Known Member
Sep 15, 2021
1,045
3,994
And again suuuuper weird stuff. Got this one:
View attachment 3001083

Now I wanted to upscale it; denoising is set to 0.2 and same seed of course. This is what's currently in the making:

View attachment 3001084


Why do I get so wildly different stuff out?
OK, for the second one first:

The original image was already upscaled to 1024x1024, but using None as the Upscaler, with a Denoising strength of 0.7. That produced the first image, but when you reran it you used completely different settings. For that re-run you were generating at 1024x1024 to begin with, then upscaling to 2048x2048 using 4xUltrasharp.
SD effectively treats each 512x512 block (or part thereof) as its own image and then stitches them together. So your second image is actually four generations at once, stuck together as best it could. No wonder it's an eldritch horror!

What you want to do is generate at 512x512 (or 512xwhatever to give a more appropriate aspect ratio for the subject) without ANY HiResFix at all.

Once again: Run a load of initial gens without HiResFix, select the seeds you like the best (and prompts if you're tweaking those too), then rerun them from scratch with the settings as previously recommended: i.e. Original resolution (512xWhatever), ESRGAN_4x as your Upscaler, Denoising strength at 0.2 to 0.35, HiRes steps at least 1.5x the number of generation steps.
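For anyone driving this from a script instead of the UI, here's a hedged sketch of that re-run via the web UI's local API (assuming it was launched with --api; the field names follow the /sdapi/v1/txt2img schema, and the upscaler name is an assumption, so verify both against your own install):

```python
# Hedged sketch: re-run a liked seed with HiResFix enabled via the A1111 API.
# Assumes the web UI is running locally with --api; verify field names and the
# upscaler name against your install.
import base64
import requests

payload = {
    "prompt": "your refined prompt",
    "negative_prompt": "your negative prompt",
    "seed": 123456789,            # the seed you picked from the initial runs
    "steps": 30,
    "cfg_scale": 7,
    "width": 512,                 # original, non-upscaled resolution
    "height": 512,
    "enable_hr": True,            # HiResFix on for this re-run only
    "hr_upscaler": "ESRGAN_4x",
    "hr_scale": 2,
    "denoising_strength": 0.3,    # within the 0.2-0.35 range above
    "hr_second_pass_steps": 45,   # ~1.5x the generation steps
}

resp = requests.post("http://127.0.0.1:7860/sdapi/v1/txt2img", json=payload)
with open("hiresfix_rerun.png", "wb") as f:
    f.write(base64.b64decode(resp.json()["images"][0]))
```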

N.B. You do NOT need to get the image to the final desired resolution at this stage, you can make it bigger and sharper much more quickly using the Upscalers in Extras.

Once you've got the basics, then it's time to experiment to find the settings you like the best.

Also, Denoising strength greatly affects how much vRAM you use - with a 1070 I'll pretty much guarantee you won't be able to HiResFix images to 2048x2048 in a one-er with a worthwhile denoising strength (and you shouldn't even need to try). Make them at 512x512 then HiResFix upscale to 832x832 or similar - keep an eye on Task Manager to see your vRAM usage.

Take great care with the parameters you choose - you seem to be putting some weird settings in some places.
 
Last edited:

Jimwalrus

Well-Known Member
Sep 15, 2021
1,045
3,994
For the first one (sorry to do these in reverse order):

Your denoising strength was 0 - basically a waste of time, but it should have just given you the original image without changes other than it being four times the number of pixels.
Looks to me like it's one of those odd things that SD sometimes does.

Also, for me, 4xUltrasharp is not the best for HiResFix. It's good for upscaling in Extras, but I've always been disappointed with it in HiResFix
 

Mr-Fox

Well-Known Member
Jan 24, 2020
1,401
3,802
You need to do things in a methodical and step by step way.

I have been skim-reading through all the posts and you are shotgunning things too much, jumping around from topic to topic. Settle down and take a deep breath. Now start from the beginning. If the image you get in txt2img is distorted or deformed, there's no point in moving forward. It would require way too much fixing with inpaint and is simply not worth the hassle or time. Instead you need to figure out why you get a bad image in the first place, and adjust the prompt and settings.

Yes, do normal generations first, meaning no hiresfix. Then, when you get an image that is not distorted, re-use the same seed and prompt but activate hiresfix; this will do the upscaling and add more detail. This is enough most of the time. If you want or need to fix a minor detail after this, you go to inpaint and fix it. If, after it's fixed, it doesn't have a nice transition, meaning it looks copy/pasted, you do an img2img generation with a very low denoising strength setting. This will make the hand (for instance) that you fixed look natural and not copy/pasted.
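For the inpaint-then-blend step, here's a minimal sketch using diffusers (the inpaint tab in the UI does the equivalent; model names, prompts and file paths are placeholders):

```python
# Minimal sketch of "inpaint the detail, then smooth the seam" with diffusers.
# Model names and file paths are placeholders, not the poster's actual setup.
import torch
from diffusers import StableDiffusionImg2ImgPipeline, StableDiffusionInpaintPipeline
from PIL import Image

inpaint = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting", torch_dtype=torch.float16
).to("cuda")

image = Image.open("hiresfix_result.png").convert("RGB")
mask = Image.open("hand_mask.png").convert("L")   # white = region to redraw

fixed = inpaint(prompt="a natural relaxed hand", image=image, mask_image=mask).images[0]

# Whole-image pass at very low denoising strength so the patch blends in
# instead of looking copy/pasted.
img2img = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
final = img2img(prompt="the same prompt as the original image",
                image=fixed, strength=0.1).images[0]
final.save("final.png")
```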

When you ask for help, it's no use posting the distorted image that you got after a botched img2img or inpaint generation. Give us the image from txt2img instead. It's impossible to fix that image with the floating head, etc.
 

Mr-Fox

Well-Known Member
Jan 24, 2020
1,401
3,802
For your first project with SD, start way simpler. Don't do a complicated pose with fingering and/or other interactions; these are very complicated and difficult even for an experienced creator.
Start with the most basic: "A beautiful woman standing". That's it. If you can't get the simplest thing right, how are you going to do anything more complicated?
Generate the first image. What's wrong with it? Write those things, the things you don't want, in the negative prompt.
Typically: distorted hands, fused fingers, extra fingers/hands/legs etc.
Then generate again. Adjust the settings, the number of steps, CFG etc. Generate again... Be methodical. Step by step.

When you have an image that looks good, you activate hiresfix. Press the green recycle button to re-use the seed. Set the upscaler to 4x UltraSharp and the denoising to 0.2-0.4; I would recommend 0.3. Then generate. Now hopefully you have something decent. Post it and be happy. Next time try something slightly more challenging, maybe a specific piece of clothing or a slightly more complicated pose, still standing. Lying down or sitting is more complicated and difficult to get right, and not something you should attempt until you have at least a few more projects completed.
 

Fuchsschweif

Well-Known Member
Sep 24, 2019
1,143
1,954
For your first project with SD, start way simpler. Don't do a complicated pose with fingering and/or other interactions; these are very complicated and difficult even for an experienced creator.
Start with the most basic: "A beautiful woman standing". That's it. If you can't get the simplest thing right, how are you going to do anything more complicated?
Who said I can't get the most simple? I can easily generate "a beautiful woman standing". I am posting the things here that I actually have problems with, because that's the next step I want to reach. I already told you that I'm experienced when it comes to prompt engineering, since I've been using MJ for a long time. My current struggle is to understand how to get the best out of the upscalers, so that I get good-looking, sharp and detailed results!

If the image you get in txt2img is distorted or deformed, there's no point in moving forward. It would require way too much fixing with inpaint and is simply not worth the hassle or time.
What if I like everything in the picture except a single thing? That's where inpaint comes into play... that's the whole appeal of it, isn't it? I've got a nice pose, nice face, angle, everything, but let's say a hand is off, or a boob or a foot. Then I can fix that little detail with inpaint instead of rolling the dice 10 more times hoping SD doesn't mess anything up.

I think that's a much more precise way of working than just re-generating until something good comes out by luck.


Generate the first image. What's wrong with it? Write those things, the things you don't want, in the negative prompt.
Typically: distorted hands, fused fingers, extra fingers/hands/legs etc.
Then generate again. Adjust the settings, the number of steps, CFG etc. Generate again... Be methodical. Step by step.
SD sometimes simply ignores these negative prompts. My negative prompts contain things like:

"crooked fingers, weird hands, ugly hands, unproportional hands, more than 5 fingers per hand" and so on, I gave them weight, braces, but SD sometimes still messes these things up. I know how to use negative prompts, that was something that I used a lot on MJ too.

But when SD still messes up the hands and I've got a picture that's close to perfect, inpaint should provide the little fix I'm looking for... that's why I've been asking about the inpaint settings. I can already easily generate normal stuff; my next steps are to

a) learn proper use of inpaint

b) learn how SD processes upscaling


According to a post above, SD allegedly already upscaled my picture during generation even though I had the upscaler set to "none". So I have to check whether I need to turn other sliders down to 0 despite having specified that I don't want to upscale...
 
Last edited:

Jimwalrus

Well-Known Member
Sep 15, 2021
1,045
3,994
Who said I can't get the most simple? I can easily generate "a beautiful woman standing". I am posting the things here that I actually have problems with, because that's the next step I want to reach. I already told you that I'm experienced when it comes to prompt engineering, since I've been using MJ for a long time. My current struggle is to understand how to get the best out of the upscalers, so that I get good-looking, sharp and detailed results!



What if I like everything in the picture except a single thing? That's where inpaint comes into play... that's the whole appeal of it, isn't it? I've got a nice pose, nice face, angle, everything, but let's say a hand is off, or a boob or a foot. Then I can fix that little detail with inpaint instead of rolling the dice 10 more times hoping SD doesn't mess anything up.

I think that's a much more precise way of working than just re-generating until something good comes out by luck.




SD sometimes simply ignores these negative prompts. My negative prompts contain things like:

"crooked fingers, weird hands, ugly hands, unproportional hands, more than 5 fingers per hand" and so on, I gave them weight, braces, but SD sometimes still messes these things up. I know how to use negative prompts, that was something that I used a lot on MJ too.

But when SD still messes up the hands and I've got a picture that's close to perfect, inpaint should provide the little fix I'm looking for... that's why I've been asking about the inpaint settings. I can already easily generate normal stuff; my next steps are to

a) learn proper use of inpaint

b) learn how SD processes upscaling


According to a post above, SD allegedly already upscaled my picture during generation even though I had the upscaler set to "none". So I have to check whether I need to turn other sliders down to 0 despite having specified that I don't want to upscale...
Nothing "alleged" about it, it did! This showed in the PNGInfo. Also, the image was 1024x1024. Had it not been upscaled it would have been 512x512.
The Upscaler used was None, but it still upscaled it. To switch the HiResFix off, you need to set the ratio to 1, Upscaler to None and the Denoising strength to 0. It used to be a tick box, but it seemed to untick/retick itself randomly, so I guess it was junked around 1.5.0.

We don't mean to be condescending; none of us here are experts - the mistakes you're making are very basic ones that we all made during the early stages. Experience with MJ doesn't help much there, sometimes not even with prompting, as SD works differently under the hood.

Hands & feet are SD's real weakness, negative prompts are not really able to fix much with them. There are some negative embeddings available on Civitai that can help, as can ADetailer. Unfortunately though, some inpainting may be required for an image that is otherwise perfect but has an extra finger etc.
 

devilkkw

Member
Mar 17, 2021
323
1,093
Your captioning is wrong for the images where the person is wearing the t-shirt.
You need to tell the AI that the person is wearing the clothing, not a t-shirt. So you need to say that the person is wearing your trigger word.
It's probably much better to just use images of people wearing the clothing as well, since it's very unlikely that you'll use it in any other situation.
Thanks for your answer. So you think, for example, a good caption would be:
"man wearing blue cloth1"?

Either your LoRA or the model used for the images has a horrible issue with oversaturation. It's very obvious in the image of the girl, but it's pretty clear in the other ones as well. Given the images below as well, it seems to be the model that's performing very badly.
The oversaturation is because I used CFG 22. I know that's a high CFG and gives wrong results, but with some LoRAs I've downloaded, going over 15 produces no image at all, only garbage colours, as if generation stopped at step 1. So I tested at high CFG to see whether the LoRA I made produces a result or has the same problem I described.

And no, steps (as in image count x number of repeats) and epoch values aren't interchangeable in that way.
50 steps x 10 epochs is in no way the same training as 100 steps x 5 epochs



There's a lot of things that happen at the end of each epoch: things get "reset/restarted", things get "written", etc.
Not sure how best to explain this...
The more you cycle through your images per epoch, the more detail gets picked up; that "learned detail" gets used for further learning, more "fine-tuning", in the next epoch, to either correct wrongs or improve.
While it might not be 100% accurate, "loss" can be considered a representation of the difference between what the AI expected and the actual value.
So lower loss suggests the AI predicted things closer to what happened.

So for most things you want the majority of your total step count to come from images x repeats and a relatively low epoch count. You mostly shouldn't need more than 5-10 epochs, the general exception being style.
I just tested it, and with 5 epochs I reach a loss of 0.05. Seems good, but the LoRA is really overtrained. Maybe finding a good value for steps per epoch is the way; I just need to try.
Btw, with 1 epoch and 100 steps it seems to reach a closer result and the loss stays around 0.12.
Every epoch made the loss go down (from what I saw during training), but I get really oversaturated results, even at low CFG.
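For what it's worth, the step/epoch arithmetic being discussed works out like this (illustrative numbers, not taken from the posts above):

```python
# Illustrative step/epoch arithmetic - the numbers are made up for the example.
images = 20        # training images
repeats = 10       # repeats per image per epoch
epochs = 5
batch_size = 1

steps_per_epoch = images * repeats // batch_size
total_steps = steps_per_epoch * epochs
print(steps_per_epoch, total_steps)   # 200 per epoch, 1000 total

# 10 repeats x 5 epochs and 5 repeats x 10 epochs give the same total step
# count, but the end-of-epoch saving/bookkeeping happens at different points,
# so the two runs are not equivalent - which is the point made in the quote.
```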

About style training, what settings do you suggest to start experimenting with?
 

Fuchsschweif

Well-Known Member
Sep 24, 2019
1,143
1,954
For your first project with SD, start way simpler. Don't do a complicated pose with fingering and/or other interactions; these are very complicated and difficult even for an experienced creator.
Start with the most basic: "A beautiful woman standing". That's it. If you can't get the simplest thing right, how are you going to do anything more complicated?
Jimwalrus

As proof of what I wrote above, here's a generation. These are the prompts:


Positive: 1 girl in a black matrix coat, standing on a rooftop, upper body shot closeup:1.5, cinematic shot, golden sidelight on her face:1.4, foggy atmosphere, neon glowing in the back, rainy day, cloudy, cyberpunk cityscape in the back, blade runner style, cyberpunk style, serious look, rough and moody atmosphere, gritty style, photoshooting

Negative: digital painting, crooked hands, off proportions, multiple persons


This is the result: 00056-2549670335.png


So that's a pretty neat outcome! She's wearing the black Matrix-like coat as intended, there's the golden side lighting on her face because I gave it weight, the cyberpunk cityscape in the back, rain, a foggy and rough atmosphere, and also the close-up upper-body shot instead of a panorama upper-body shot. I had to refine the brackets and weights to get this.

As you see, prompt engineering isn't my issue.

But what I can't wrap my head around is how to upscale this without getting some wildly different results. Because if I do this:


When you have an image that looks good, you activate hiresfix. Press the green recycle button to re-use the seed. Set the upscaler to 4x UltraSharp and the denoising to 0.2-0.4; I would recommend 0.3. Then generate. Now hopefully you have something decent.
I get this result:

1697205546122.png

You don't have permission to view the spoiler content. Log in or register now.

That was with denoising strength set to 0.02 and the exact seed and prompts from the picture above. But SD composes something completely different. What did I do wrong?
 

Jimwalrus

Well-Known Member
Sep 15, 2021
1,045
3,994
Jimwalrus

As proof of what I wrote above, here's a generation. These are the prompts:


Positive: 1 girl in a black matrix coat, standing on a rooftop, upper body shot closeup:1.5, cinematic shot, golden sidelight on her face:1.4, foggy atmosphere, neon glowing in the back, rainy day, cloudy, cyberpunk cityscape in the back, blade runner style, cyberpunk style, serious look, rough and moody atmosphere, gritty style, photoshooting

Negative: digital painting, crooked hands, off proportions, multiple persons


This is the result: View attachment 3002656


So that's a pretty neat outcome! She's wearing the black Matrix-like coat as intended, there's the golden side lighting on her face because I gave it weight, the cyberpunk cityscape in the back, rain, a foggy and rough atmosphere, and also the close-up upper-body shot instead of a panorama upper-body shot. I had to refine the brackets and weights to get this.

As you see, prompt engineering isn't my issue.

But what I can't wrap my head around is how to upscale this without getting some wildly different results. Because if I do this:




I get this result:

View attachment 3002662

You don't have permission to view the spoiler content. Log in or register now.

That was with denoising strength set to 0.02 and the exact seed and prompts from the picture above. But SD composes something completely different. What did I do wrong?
I'm away from my PC atm, so can't test anything, but please note "0.02" is NOT the same as 0.2!
It seems to be an error you make a lot. I don't think it's causing this, but it does affect things.
 
  • Like
Reactions: Dagg0th