[Stable Diffusion] Prompt Sharing and Learning Thread

Mr-Fox

Well-Known Member
Jan 24, 2020
1,401
3,793
It was the same scenario IIRC, I just wanted to upscale a low res picture that I liked as close to the original as possible. So in that case image2image is then the way to go..
Ok. Yes, if you have already generated an image that you wish to upscale for more detail, and hires fix with the same seed and prompt doesn't give the same image, then the safe bet is to simply upscale the original image with the SD Upscale script in the img2img tab. This is, however, not like the normal upscale in img2img or in the Extras tab; those only enlarge the image, they lack the hires steps. You need to install "SD Upscale" in the Extensions tab: under "Available", click "Load from", find SD Upscale in the list, and press "Install". When it's done, go to "Installed" and press "Apply and restart". Now you can find SD Upscale in the script menu in the img2img tab. It works just like hires fix.
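(For the curious: the reason SD Upscale adds detail is that it enlarges the image first and then runs an img2img pass over overlapping tiles. A rough sketch of the tiling math, illustrative only and not the extension's actual code:)

```python
def tile_starts(length, tile=512, overlap=64):
    # Start offsets of overlapping tiles covering `length` pixels;
    # each tile later gets its own img2img pass and the seams are
    # blended in the overlap region.
    if length <= tile:
        return [0]
    stride = tile - overlap
    starts = list(range(0, length - tile, stride))
    starts.append(length - tile)  # last tile sits flush with the edge
    return starts

# A 1024px-wide upscaled image needs three 512px columns:
print(tile_starts(1024))  # -> [0, 448, 512]
```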
 

me3

Member
Dec 31, 2016
316
708
Yeah, but maybe in this case following the prompt more strictly equals deviating further from the seed, because it tries to add more new stuff based on the prompt instead of gathering from the seed. At least that would explain why CFG 16 doesn't work but 8 does.
While likely not a correct way to explain it, it might be enough to illustrate:
the seed determines the random noise you start from, and the prompt is applied to clean that noise up. No matter how hard you apply the prompt, the initial noise is still the same.

With the number of seemingly small things that can have large effects on the end result, there's no need to look for an answer in one of the things that's fairly specific in how it works.
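(That mental model, the seed fixing the starting noise while the prompt only steers the cleanup, can be made concrete with a toy NumPy sketch; this is an illustration, not actual SD code:)

```python
import numpy as np

def initial_latent(seed, shape=(4, 64, 64)):
    # The seed deterministically fixes the starting noise tensor;
    # nothing about the prompt or CFG enters here.
    rng = np.random.default_rng(seed)
    return rng.standard_normal(shape)

a = initial_latent(1234)
b = initial_latent(1234)  # same seed, regardless of prompt used later
assert np.array_equal(a, b)                          # identical starting noise
assert not np.array_equal(a, initial_latent(5678))   # new seed, new noise
```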
 

Jimwalrus

Active Member
Sep 15, 2021
895
3,303
For ref, the seed is just a number, nothing else. It is used to create a consistent pattern in the starting static; it simply ensures consistency is possible at all (random static would give random results). It does NOT affect the prompt, and the CFG is simply how hard SD tries to adhere to the prompt (often to the detriment of image quality at high levels).
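(Under the hood the CFG scale enters as classifier-free guidance; a simplified sketch in plain NumPy, with illustrative values rather than real SD internals:)

```python
import numpy as np

def apply_cfg(eps_uncond, eps_cond, cfg_scale):
    # Classifier-free guidance: extrapolate from the unconditional
    # noise prediction towards the prompt-conditioned one. Bigger
    # cfg_scale = harder push towards the prompt.
    return eps_uncond + cfg_scale * (eps_cond - eps_uncond)

uncond = np.array([0.0, 0.0])
cond = np.array([1.0, -1.0])
print(apply_cfg(uncond, cond, 1.0))  # scale 1 just follows the prompt term
print(apply_cfg(uncond, cond, 7.5)) # higher scales extrapolate further,
                                    # which is where quality loss creeps in
```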

Remember, you can use the same seed number to generate a picture of a corgi taking a dump on a skateboard or one of Selena Gomez with her tits out, entirely depending on prompts, LoRAs, Embeddings, Checkpoints etc.
There will be no 'corgi-ness' to Selena or 'Selena-ness' to the corgi.

Personally, I would usually recommend CFG between 4 and 11, with the sweet spot usually being 5-8.
There is also the option to dynamically adjust the CFG during generation, usually starting high to 'force what you really want' and then turning it down to avoid some of the worst effects of high CFG. It gives mixed results and I've pretty much stopped using it.
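(That dynamic adjustment amounts to a CFG schedule over the sampling steps; a minimal linear version, purely illustrative:)

```python
def cfg_schedule(step, total_steps, start=12.0, end=5.0):
    # Linearly ramp CFG from `start` down to `end` over the run:
    # high early to force the composition, low late to avoid the
    # fried, over-saturated look of high CFG.
    t = step / max(total_steps - 1, 1)
    return start + t * (end - start)

scales = [cfg_schedule(s, 5) for s in range(5)]
print(scales)  # -> [12.0, 10.25, 8.5, 6.75, 5.0]
```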
 

Sharinel

Active Member
Dec 23, 2018
508
2,102
What's the difference between my default one and the one you picked? Aren't the steps just determined by the step slider? Or do you mean that it works better with higher step numbers?
Woops, I said Adaptive, but the correct terminology is Ancestral (I am old and senile)

This is a good explanation -

Basically it comes down to this: non-ancestral samplers reach their 'endgame' image after X steps, and any steps beyond X don't add anything to the image. Ancestral samplers never reach that stage; they keep slightly amending the picture with every step.
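(A toy 1-D picture of that convergence difference, assuming nothing about the real samplers beyond "ancestral = fresh noise every step":)

```python
import numpy as np

def sampler_step(x, target, ancestral, rng):
    # Move part of the way towards the "finished" value...
    x = x + 0.3 * (target - x)
    if ancestral:
        # ...but ancestral samplers also inject fresh noise each step,
        # so the result keeps drifting instead of settling.
        x = x + rng.normal(scale=0.1)
    return x

rng = np.random.default_rng(0)
x_plain, x_anc = 10.0, 10.0
for _ in range(200):
    x_plain = sampler_step(x_plain, 0.0, False, rng)
    x_anc = sampler_step(x_anc, 0.0, True, rng)
# x_plain has converged (extra steps change nothing measurable);
# x_anc is still wandering around the target.
```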
 

Mr-Fox

Well-Known Member
Jan 24, 2020
1,401
3,793
Woops, I said Adaptive, but the correct terminology is Ancestral (I am old and senile)

This is a good explanation -

Basically it comes down to this: non-ancestral samplers reach their 'endgame' image after X steps, and any steps beyond X don't add anything to the image. Ancestral samplers never reach that stage; they keep slightly amending the picture with every step.
Did you get a good sense from the article of what it means when the image or sampler converges? Or perhaps you know from other sources..
 

Fuchsschweif

Active Member
Sep 24, 2019
954
1,514
When it's done go to "installed" press "apply and restart". Now you can find SD Upscale in the script menu in img2img tab. It works just like hiresfix.
Yes I already have that installed, that's how I use it.

It does NOT affect the prompt and the CFG is simply how hard SD tries to adhere to the prompts (often to the detriment of image quality at high levels)
It's just been confusing to me because in MJ, using the same seed with the same prompt it was generated with would lead to the exact same picture (IIRC).

Anyway, it's odd now because the CFG no longer changes my picture into something completely different, and I don't know why SD finally "locked in". At first I got something wildly different at CFG 16, as you all saw earlier; now, working my way back up from 5 (running some tests), that doesn't happen anymore.

Here's how the CFG affects the output:


CFG 5 (with 50 high res steps instead of 40 like the other tests)

00066-2549670335.png

CFG 8

00065-2549670335.png

CFG 11

00067-2549670335.png

CFG 14

00068-2549670335.png

CFG 17

00069-2549670335.png

CFG 20
00070-2549670335.png


Except for her turning more and more demonic (for whatever reason lol), nothing changes much.
 

Jimwalrus

Active Member
Sep 15, 2021
895
3,303
I think the issue was having a variation seed in there somewhere - possibly even after deselecting the option (SD sometimes has these brainfarts).

Also, good work on the CFG differential images - although the effect is usually much more marked at higher levels.
 

Mr-Fox

Well-Known Member
Jan 24, 2020
1,401
3,793
Yes I already have that installed, that's how I use it.



It's just been confusing to me because in MJ, using the same seed with the same prompt it was generated with would lead to the exact same picture (IIRC).

Anyway, it's odd now because the CFG no longer changes my picture into something completely different, and I don't know why SD finally "locked in". At first I got something wildly different at CFG 16, as you all saw earlier; now, working my way back up from 5 (running some tests), that doesn't happen anymore.

Here's how the CFG affects the output:


CFG 5 (with 50 high res steps instead of 40 like the other tests)

View attachment 3003209

CFG 8

View attachment 3003210

CFG 11

View attachment 3003211

CFG 14

View attachment 3003200

CFG 17

View attachment 3003201

CFG 20
View attachment 3003204


Except for her turning more and more demonic (for whatever reason lol), nothing changes much.
You can see a variation in the cityscape background. Perhaps it would change slightly with each generated image anyway.
 

Fuchsschweif

Active Member
Sep 24, 2019
954
1,514
You can see a variation in the cityscape background. Perhaps it would be changing anyways slightly each generated image.
Yes, but according to the tooltip, higher CFG should lead to less creativity from SD and stricter prompt-following. So it's interesting to see SD stick to the same background generation when more creativity is allowed, and then start doing more on its own as the CFG goes up, especially since I didn't specify the building placement in the background.

Also, the rain is generally missing, something that was present in the OG
(probably something I could fix with prompt weights):
00056-2549670335.png
 

Mr-Fox

Well-Known Member
Jan 24, 2020
1,401
3,793
It was all created in the same session. So the first one with CFG 16 was this:

View attachment 3003232

Then I turned it down to 8, which produced the CFG 8 picture in my comparison above. Then I did 5, then 11, and then worked my way up to 20.

So I didn't restart SD, reload the GUI, or change any other settings except moving the CFG slider.

Maybe it was a weird bug..




Yes, but according to the tooltip, higher CFG should lead to less creativity from SD and stricter prompt-following. So it's interesting to see SD stick to the same background generation when more creativity is allowed, and then start doing more on its own as the CFG goes up, especially since I didn't specify the building placement in the background.

Also, the rain is generally missing, something that was present in the OG
(probably something I could fix with prompt weights):
View attachment 3003234
If you want the rain but it's missing, try adding weight to that tag: (rain:1.2). A plain bracket pair adds roughly 0.1 in weight to the tag; the brackets have to do with attention, I think.

Tag: a single word. Token: several words describing the same object.

You can also use the negative prompt in combination with the positive to give it even stronger emphasis:

pos: (rain:1.2)
neg: missing rain, clear weather, etc.

If you add weight to the negative tags, it's of course even stronger.
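(The (tag:weight) syntax is mechanical enough to mimic in a few lines; a sketch of the A1111-style rule where a bare (tag) pair means weight 1.1, not the WebUI's actual parser:)

```python
import re

def parse_weight(tag):
    # "(rain:1.2)" -> explicit weight 1.2
    # "(rain)"     -> shorthand for weight 1.1
    # "rain"       -> default weight 1.0
    m = re.fullmatch(r"\((.+):([0-9.]+)\)", tag)
    if m:
        return m.group(1), float(m.group(2))
    if tag.startswith("(") and tag.endswith(")"):
        return tag[1:-1], 1.1
    return tag, 1.0

print(parse_weight("(rain:1.2)"))  # -> ('rain', 1.2)
print(parse_weight("(rain)"))      # -> ('rain', 1.1)
```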
 

Fuchsschweif

Active Member
Sep 24, 2019
954
1,514
If you want the rain but it's missing, try adding weight to that tag: (rain:1.2). A plain bracket pair adds roughly 0.1 in weight to the tag; the brackets have to do with attention, I think.

Tag: a single word. Token: several words describing the same object.

You can also use the negative prompt in combination with the positive to give it even stronger emphasis:

pos: (rain:1.2)
neg: missing rain, clear weather, etc.

If you add weight to the negative tags, it's of course even stronger.
Yes I know that but thanks! :)

But another question: if I upscale with the SD script in img2img, I can't specify hires steps, and I get blurrier results. How can I improve the sharpness of the output?

This is img2img from the original with 40 sampling steps; 20 looked the same:

1697221786153.png

Settings:

 

Mr-Fox

Well-Known Member
Jan 24, 2020
1,401
3,793
Yes I know that but thanks! :)

But another question: if I upscale with the SD script in img2img, I can't specify hires steps, and I get blurrier results. How can I improve the sharpness of the output?

This is img2img from the original with 40 sampling steps; 20 looked the same:

View attachment 3003256

Settings:

Try different upscalers; I recommend NMKD. You can play with the denoising strength to tease out more detail, but the upscaler makes the biggest difference. Start with a low denoising strength and add 0.1 at a time until you get a worse result, then go back 0.02 at a time until you find the sweet spot. Post-processing can sometimes also add detail; I would recommend GFPGAN over CodeFormer, but try either and see what you prefer. You can also try a different checkpoint model, which could then function almost like a refiner. After Detailer can add more detail to the face using its face model, and you could try an add-detail LoRA. There are many ways to refine and tease out more detail.
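(That search procedure is essentially a coarse-then-fine sweep; written out as code, with `score` standing in for your own visual judgement of each result:)

```python
def find_denoise_sweet_spot(score, start=0.2, coarse=0.1, fine=0.02):
    # Coarse pass: raise denoising strength 0.1 at a time
    # until the next step would make the result worse.
    d = start
    while d + coarse <= 1.0 and score(d + coarse) >= score(d):
        d += coarse
    # Fine pass: back off 0.02 at a time while that improves things.
    while d - fine >= 0.0 and score(d - fine) > score(d):
        d -= fine
    return round(d, 2)

# Toy quality curve peaking at strength 0.43:
best = find_denoise_sweet_spot(lambda d: -(d - 0.43) ** 2)
print(best)  # -> 0.4 (the sweep stops at the best grid point)
```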
 

me3

Member
Dec 31, 2016
316
708
You also have "issues" with your prompt.
When applying weight, you should wrap the term in ( ) as the first step; if you need more weighting, you then add the numerical values.
This not only helps with readability but also helps the prompt parser. As a small tip, if you select the text and hit Ctrl+Up, it'll do this for you.
Second, you're repeatedly applying "instructions" to the same things, potentially conflicting ones as well.
Look at how many times you mention style, or "in the back". This might seem to work just fine, and it might, in this case, but next time you might be fighting some issue caused by it.
Thirdly, with weighting: if you have to go beyond 1.3 it generally means you're doing something horribly wrong or something is fighting off the weighting, and it can create artifacts and/or distortions as well.

You keep saying you know prompting. To be brutal, you might know MJ prompting, where the software fixes a whole bunch of stuff for you, but where there's far less hand-holding you're going to need to learn to write cleaner prompts. It'll make things far, far easier for you down the line and cause you far fewer issues. Even in this case you're more than likely fighting your own prompt.
Strip out everything you don't need in prompts; don't add stuff because it "might" be needed. Negative prompts you can very often leave empty until the end, especially when dealing with styles or certain looks, because the AI has a habit of not really agreeing with us or following our logic in viewing things.

As an example, a very common negative is some variation of "ugly", but for the AI that term is very broad, so if your style in some way suggests things that are run-down, worn out, dirty, dark, dreary, etc., including it in your negative will fight your choice of style. The same applies to "distorted" if you're doing some kind of cyberpunk/futuristic mix of human and machine.
 

Mr-Fox

Well-Known Member
Jan 24, 2020
1,401
3,793
You also have "issues" with your prompt.
When applying weight, you should wrap the term in ( ) as the first step; if you need more weighting, you then add the numerical values.
This not only helps with readability but also helps the prompt parser. As a small tip, if you select the text and hit Ctrl+Up, it'll do this for you.
Second, you're repeatedly applying "instructions" to the same things, potentially conflicting ones as well.
Look at how many times you mention style, or "in the back". This might seem to work just fine, and it might, in this case, but next time you might be fighting some issue caused by it.
Thirdly, with weighting: if you have to go beyond 1.3 it generally means you're doing something horribly wrong or something is fighting off the weighting, and it can create artifacts and/or distortions as well.

You keep saying you know prompting. To be brutal, you might know MJ prompting, where the software fixes a whole bunch of stuff for you, but where there's far less hand-holding you're going to need to learn to write cleaner prompts. It'll make things far, far easier for you down the line and cause you far fewer issues. Even in this case you're more than likely fighting your own prompt.
Strip out everything you don't need in prompts; don't add stuff because it "might" be needed. Negative prompts you can very often leave empty until the end, especially when dealing with styles or certain looks, because the AI has a habit of not really agreeing with us or following our logic in viewing things.

As an example, a very common negative is some variation of "ugly", but for the AI that term is very broad, so if your style in some way suggests things that are run-down, worn out, dirty, dark, dreary, etc., including it in your negative will fight your choice of style. The same applies to "distorted" if you're doing some kind of cyberpunk/futuristic mix of human and machine.
It would be very interesting to see the end result of what you're suggesting; a generated image to go along with your tips would be helpful.
In general what you're saying is correct, I just don't agree with everything. Using "ugly", for instance, can sometimes reduce a deformed face and make it sharper, etc. It depends very much on how the checkpoint was trained and how it responds to the prompt and settings. I often use weights above 1.3, though perhaps I could get a better result doing something else; knowing what that something else is, is a different thing altogether. Sometimes we have to do our best with what we know. Talk about prompting and how to structure it doesn't happen enough and is very welcome imo. Very good tip about using Ctrl+Up btw, that was new to me. :) (y)
"In the back" is shorthand for "in the background" in his prompt, I suspect. It could probably have been reduced to one line, but I don't see any contradiction there. The use of style is the same: it could probably have been reduced to one line, but there's no major contradiction as far as I can see. Again, it would be interesting to see an example, meaning that you edit the prompt and post it so we can see how you would do it, with a generated image to show the end result.
 

Fuchsschweif

Active Member
Sep 24, 2019
954
1,514
Try different upscalers; I recommend NMKD. You can play with denoising strength to tease out more detail. The upscaler makes the biggest difference though.
But the same upscaler used in txt2img should work equally well in img2img, shouldn't it? Right now it comes out way more blurred than in txt2img; is that due to the missing control over hires steps?


When applying weight you should be wrapping that in ( ) as the first step; if you need more weighting you then add the numerical values.
Afaik the ( ) are just multipliers, so it's a quick way to emphasise what's more important and what's not. If I write:

(apples), oranges, ((carrots))

then the importance ratio between those three would be 2:1:3.

If you wrap everything in brackets, everything is equally important, which means you could just as well not have used them at all. No?
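(For what it's worth, the A1111 WebUI's parser treats each enclosing ( ) pair as a ×1.1 attention multiplier, so the actual ratio in that example would be 1.1 : 1.0 : 1.21 rather than 2 : 1 : 3. A sketch of that rule, not the WebUI's real code:)

```python
def attention(tag):
    # Each enclosing () pair multiplies attention by 1.1 (A1111 rule).
    depth = 0
    while tag.startswith("(") and tag.endswith(")"):
        tag, depth = tag[1:-1], depth + 1
    return tag, round(1.1 ** depth, 4)

weights = dict(attention(t) for t in ["(apples)", "oranges", "((carrots))"])
print(weights)  # -> {'apples': 1.1, 'oranges': 1.0, 'carrots': 1.21}
```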


Second, you're repeatedly applying "instructions" to the same things, potentially conflicting ones as well.
Look at how many times you're mentioning style, or "in the back".
The back instruction is important, otherwise one of these things might appear somewhere at the character's level, which is not desired. If background elements were to conflict (it hasn't happened yet), you can just specify their position, e.g. (tall building in the back on the left), (big moon in the back in the right top corner), [...]

And the different styles are intended, so that it doesn't just copy one specific style but rather creates something new drawing on a wide range of styles. It's just to nail down a certain vibe.

Thirdly, with weighting, if you have to go beyond 1.3 it generally means you are doing something horribly wrong or something is fighting off the weighting. It potentially creates artifacts and/or distortions as well.
Interesting. I had situations where 1.3 wasn't enough and higher values got the job done. I might run some X/Y tests with different weightings to see about the artifacts and distortions, thanks.


You keep saying you know prompting, to be brutal, you might know MJ prompting where the software fixes a whole bunch of stuff for you, but where there's far less hand holding you're gonna need to learn to write cleaner prompts.
In terms of prompting I don't see any difference between MJ and SD yet. MJ doesn't do hand-holding either: you also have to be very precise and specific, need negative prompts, and add weighting to the prompts.


Even in this case you're more than likely fighting your own prompt.
Why do you believe that? The OG picture turned out perfectly fine and had every detail I wanted SD to include. It's actually a good example of prompting, as it absolutely nailed the desired vibe, atmosphere, scenery, side lighting, clothing and everything else.
 

me3

Member
Dec 31, 2016
316
708
It would be very interesting to see the result; a generated image to go along with your tips would be helpful.
In general what you're saying is correct, I just don't agree with everything. Using "ugly", for instance, can sometimes reduce a deformed face and make it sharper, etc. It depends very much on how the checkpoint was trained and how it responds to the prompt and settings. I often use weights above 1.3, though perhaps I could get a better result doing something else; knowing what that something else is, is a different thing altogether. Sometimes we have to do our best with what we know. Talk about prompting and how to structure it doesn't happen enough and is very welcome imo. Very good tip about using Ctrl+Up btw, that was new to me. :) (y)
The "problem" with giving specific examples is that they're likely to be case-dependent. The model used will have its biases, and the order of things in the prompt itself has an obvious impact (which is why I'm suggesting keeping prompts clean/clear/organized). If I come across an example at some point I'll try to remember to post it.
 

Mr-Fox

Well-Known Member
Jan 24, 2020
1,401
3,793
But the same upscaler used in txt2img should work equally well in img2img, shouldn't it? Right now it comes out way more blurred than in txt2img; is that due to the missing control over hires steps?




Afaik the ( ) are just multipliers, so it's a quick way to emphasise what's more important and what's not. If I write:

(apples), oranges, ((carrots))

then the ratio between those 3 would be 2:1:3.

If you wrap everything in brackets, everything is equally important, which means you could just as well not have used them at all. No?




The back instruction is important, otherwise one of these things might appear somewhere at the character's level, which is not desired. If background elements were to conflict (it hasn't happened yet), you can just specify their position, e.g. (tall building in the back on the left), (big moon in the back in the right top corner), [...]

And the different styles are intended, so that it doesn't just copy 1 specific style, but rather creates something new considering a wide range of styles. It's just to get down a certain vibe.



Interesting. I had situations where 1.3 wasn't enough and higher values got the job done. I might run some X/Y tests with different weightings to see about the artifacts and distortions, thanks.




In terms of prompting I don't see any difference between MJ and SD yet. MJ doesn't do hand-holding either: you also have to be very precise and specific, need negative prompts, and add weighting to the prompts.




Why do you believe that? The OG picture turned out perfectly fine and had every detail I wanted SD to include.
The special symbols work differently in SD compared to MJ; you need to try to forget some of the things you learned using MJ.
Brackets in SD are the equivalent of weight, not a multiplier: ( ) means 1.0 + 0.1 = 1.1 in weight and attention. Square brackets only work in some cases, and they do the opposite of normal brackets in that they reduce the weight by 0.1.
I don't think me3 meant to use brackets on everything; if you did it too much, yes, it would become pointless. What he was getting at, though, was using words in the prompt that contradict each other and cancel each other out. I didn't see much of that, just some more structure and simplification needed.
 