[Stable Diffusion] Prompt Sharing and Learning Thread

Mr-Fox

Well-Known Member
Jan 24, 2020
1,401
3,802
It was all created in the same session. So the first one with CFG 16 was this:

View attachment 3003232

Then I turned it down to 8, leading to the picture with CFG 8 in my comparison above. Then I did 5, then 11, and then worked my way up to 20.

So I didn't restart SD, reload the GUI, or change any other settings except moving the CFG slider.

Maybe it was a weird bug..




Yes but higher CFG, according to the tooltip, should lead to less creativity from SD and more strictly following the prompt, so it's interesting to see that SD sticks to the same bg-generation with more creativity allowed and then starts to do more on its own while the CFG goes up. Especially because I didn't specify the background in terms of building placement.

Also, rain is generally missing. Something that was present in the OG:
(probably something I could fix with prompt weights)
View attachment 3003234
If you want the rain but it's missing, try adding weight to that tag: (rain:1.2). The bracket gives an additional 0.1 in weight to the tag; I think the bracket has to do with attention.

Tag: single word, Token: several words describing the same object.

You can also use the negative prompt in combination with the positive to give it even stronger emphasis:

pos: (rain:1.2)
neg: missing rain, clear weather etc.

If you add weight to the negative tags it's ofc even stronger.
 

Fuchsschweif

Well-Known Member
Sep 24, 2019
1,144
1,954
Yes I know that but thanks! :)

But another question, if I upscale with the SD script on image2image I can't specify high res steps, and I get more blurry results. How can I improve the sharpness of the output?

This is image2image from the original with 40 sample steps, 20 looked the same:

1697221786153.png

 

Mr-Fox

Well-Known Member
Jan 24, 2020
1,401
3,802
Try different upscalers; I recommend NMKD. You can play with denoising strength to tease out more detail, though the upscaler makes the biggest difference. Start with a low denoising strength and add 0.1 at a time until you get a worse result, then go backwards 0.02 at a time until you find a sweet spot. Post-processing can sometimes also add detail; I would recommend GFPGAN over CodeFormer, but try either and see what you prefer. You can also try using a different checkpoint model, which could potentially function almost like a refiner. After Detailer can add more detail to the face using the face model. You could try an add-detail LoRA. There are many ways to refine and tease out more detail.
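For anyone who likes it spelled out, the coarse-then-fine search for denoising strength could be sketched roughly like this. It is only an illustration: `score` is a hypothetical stand-in for your own judgement of each result (in practice you compare the images by eye), and the function name, the 0.2 starting value and the rounding are assumptions, not WebUI settings.

```python
from typing import Callable


def find_denoising_sweet_spot(
    score: Callable[[float], float],  # hypothetical: higher means "looks better"
    start: float = 0.2,               # assumed low starting strength
    coarse_step: float = 0.1,
    fine_step: float = 0.02,
    upper: float = 1.0,
) -> float:
    """Step up by coarse_step until the result gets worse, then walk back by fine_step."""
    last_good = start
    best_score = score(start)
    worse = None

    # Coarse pass: keep raising the strength while the result does not get worse.
    while round(last_good + coarse_step, 2) <= upper:
        candidate = round(last_good + coarse_step, 2)
        candidate_score = score(candidate)
        if candidate_score < best_score:
            worse = candidate
            break
        last_good, best_score = candidate, candidate_score

    if worse is None:
        # Never got worse before hitting the upper limit.
        return last_good

    # Fine pass: walk backwards from the first worse value towards the last good one.
    best = last_good
    candidate = round(worse - fine_step, 2)
    while candidate > last_good:
        candidate_score = score(candidate)
        if candidate_score > best_score:
            best, best_score = candidate, candidate_score
        candidate = round(candidate - fine_step, 2)
    return best


# Toy score with a made-up peak, purely to exercise the search.
def toy_score(strength: float) -> float:
    return -(strength - 0.64) ** 2


print(find_denoising_sweet_spot(toy_score))
```

In real use each call to `score` means generating an image and looking at it, so the point of the two step sizes is simply to keep the number of generations low.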
 

me3

Member
Dec 31, 2016
316
708
You also have "issues" with your prompt.
When applying weight you should wrap the text in () as the first step; if you need more weighting, you then add the numerical values.
This not only helps with readability but also helps the prompt parser. As a small tip if you select the text and hit ctrl + up it'll do this for you.
Second, you're repeatedly applying "instructions" to the same things, potentially conflicting ones as well.
Look at how many times you're mentioning style, or "in the back". This might seem to be working just fine and it might be, in this case, but next time you might be fighting with some kind of issue due to this.
Thirdly, with weighting, if you have to go beyond 1.3 it generally means you are doing something horribly wrong or something is fighting off the weighting. It potentially creates artifacts and/or distortions as well.

You keep saying you know prompting. To be brutal, you might know MJ prompting, where the software fixes a whole bunch of stuff for you, but where there's far less hand holding you're gonna need to learn to write cleaner prompts. It'll make things far, far easier for yourself down the line and cause you far fewer issues. Even in this case you're more than likely fighting your own prompt.
Strip out everything you don't need in prompts; don't add stuff because it "might" be needed. Negative prompts you can very often leave empty until the end, especially when dealing with styles or certain looks, because the AI has a habit of not really agreeing with us or following our logic in viewing things.

As an example, a very common negative to be included is some variation of "ugly", but for the AI that term is very wide, so if you have styles that in some way suggest things are run down, worn out, dirty, dark, dreary etc, including that in your negative will fight your choice of style. Same applies to "distorted" if you're doing some kind of cyberpunk/futuristic mix of human and machine.
 
  • Like
Reactions: Jimwalrus

Mr-Fox

Well-Known Member
Jan 24, 2020
1,401
3,802
It would be very interesting to see the end result of what you're suggesting, and it would be helpful to have a generated image to go along with your tips.
In general what you're saying is correct, I just don't agree with everything. Using ugly, for instance, can sometimes reduce a deformed face and make it sharper etc. It depends very much on how the checkpoint has been trained and how it responds to the prompt and settings. I often use weights above 1.3, though perhaps I could get a better result doing something else. Knowing what this something else is, that's a different thing altogether. Sometimes we have to do our best with what we know. Prompting and how to structure a prompt isn't discussed enough, so this is very welcome imo. Very good tip about using ctrl + up btw, this was new to me. :) (y)
"In the back" is shorthand for in the background in his prompt, I suspect. It could probably have been reduced to one line, but I don't see any contradiction there. The use of style is the same in that it could probably have been reduced to one line, but there's no major contradiction as far as I can see. Again, it would be interesting to see an example, meaning that you edit the prompt and post it so we can see how you would do it, along with a generated image to see the end result.
 

Fuchsschweif

Well-Known Member
Sep 24, 2019
1,144
1,954
Try different upscalers, I redommend NMKD. You can play with denoising strength to tease out more detail. The upscaler makes the biggest difference though.
But the same upscaler as used in txt2img should work equally well in img2img shouldn't it? Right now it comes out way more blurred than on txt2img, is that due to the missing control over high res steps?


When applying weight you should wrap the text in () as the first step; if you need more weighting, you then add the numerical values.
Afaik the ( ) are just multipliers, so it's a quick way to emphasise what's more important and what's not. If I write:

(apples), oranges, ((carrots))

then the importance-ratio between those 3 would be 2:1:3.

If you wrap everything into braces, everything is equally important, which means you could've just not used them at all. No?


Second, you're repeatedly applying "instructions" to the same things, potentially conflicting ones as well.
Look at how many times you're mentioning style, or "in the back".
The back instruction is important, otherwise one of these things might appear somewhere on the character's level, which is not desired. If background elements did conflict (it hasn't happened yet), you can just specify their position, e.g. (tall building in the back on the left), (big moon in the back on the right top corner),[...]

And the different styles are intended, so that it doesn't just copy 1 specific style, but rather creates something new considering a wide range of styles. It's just to get down a certain vibe.

Thirdly, with weighting, if you have to go beyond 1.3 it generally means you are doing something horribly wrong or something is fighting off the weighting. It potentially creates artifacts and/or distortions as well.
Interesting. I had situations where 1.3 wasn't enough and higher values got the job done. I might run some X/Y tests with different weighting to see about the artifacts and distortions, thanks.


You keep saying you know prompting, to be brutal, you might know MJ prompting where the software fixes a whole bunch of stuff for you, but where there's far less hand holding you're gonna need to learn to write cleaner prompts.
In terms of prompting I don't see any difference between MJ and SD yet. MJ doesn't do hand holding. You also have to be very precise and specific, need negative prompts and also add weighting to the prompts.


Even in this case you're more than likely fighting your own prompt.
Why do you believe that? The OG picture turned out perfectly fine and had every detail I wanted SD to include. It's actually a good example for prompting, as it absolutely nailed the desired vibe, atmosphere, scenery, side lighting, clothing and everything else.
 
  • Like
Reactions: Mr-Fox

me3

Member
Dec 31, 2016
316
708
It would be very interesting to see the result, and it would be helpful to have a generated image to go along with your tips.
In general what you're saying is correct, I just don't agree with everything. Using ugly, for instance, can sometimes reduce a deformed face and make it sharper etc. It depends very much on how the checkpoint has been trained and how it responds to the prompt and settings. I often use weights above 1.3, though perhaps I could get a better result doing something else. Knowing what this something else is, that's a different thing altogether. Sometimes we have to do our best with what we know. Prompting and how to structure a prompt isn't discussed enough, so this is very welcome imo. Very good tip about using ctrl + up btw, this was new to me. :) (y)
The "problem" with giving specific examples is that it's likely to be "case dependent". The model used will have its biases, and the order of things in the prompt itself has an obvious impact (which is why i'm suggesting keeping them clean/clear/organized). If i come across an example at some point i'll try to remember to post it.
 

Mr-Fox

Well-Known Member
Jan 24, 2020
1,401
3,802
The special symbols work differently in SD compared to MJ. You need to try to forget some things that you learned from using MJ.
Brackets in SD are the equivalent of a weight, not a multiplier. ( ) means 1.0 + 0.1 = 1.1 in weight and attention. Square brackets only work in some cases, and then they do the opposite of the normal brackets in that they reduce the weight by 0.1.
I don't think me3 meant to use brackets on everything. If you did it too much, yes, then it could become pointless. What he was getting at, though, was using words in the prompt that would contradict each other and cancel each other out. I didn't see much of that, just some more structure and simplification needed.
 
  • Like
Reactions: devilkkw

Mr-Fox

Well-Known Member
Jan 24, 2020
1,401
3,802
The "problem" with giving specific examples is that it's likely to be "case dependent". The model used will have its biases, and the order of things in the prompt itself has an obvious impact (which is why i'm suggesting keeping them clean/clear/organized). If i come across an example at some point i'll try to remember to post it.
I meant a hands-on example: take his prompt and do with it as you suggested. It would be a better learning aid, along with a generated image so we can see the end result. I understand if you don't have the time or the will. It would just be even better.
 

Jimwalrus

Well-Known Member
Sep 15, 2021
1,045
3,994
Afaik the ( ) are just multipliers, so it's a quick way to emphasise what's more important and what's not. If I write:

(apples), oranges, ((carrots))

then the importance-ratio between those 3 would be 2:1:3.

If you wrap everything into braces, everything is equally important, which means you could've just not used them at all. No?
Not quite. In SD, the braces are 0.1 additions.
So "plump" is plump at 1 strength.
"(plump)" is plump at 1.1 strength.
"((((plump))))" is plump at 1.4 strength.
"(plump:1.2)" is plump at either 1.2, 1.21 or 1.3 (it depends who you ask!)

"big tits" is big and tits SEPARATELY at strength 1. SD sort of realises they are together, especially with commas before & after, but not reliably.
"(big tits)" is big tits at strength 1.1 - the braces help SD lump them together as a single token.

So your example of "(apples), oranges, ((carrots))" is actually apples at 1.1, oranges at 1 and carrots at 1.2.
 

Fuchsschweif

Well-Known Member
Sep 24, 2019
1,144
1,954
The special symbols work differently in SD compared to MJ. You need to try to forget some things that you learned from using MJ.
Brackets in SD are the equivalent of a weight, not a multiplier. ( ) means 1.0 + 0.1 = 1.1 in weight and attention.
This is what I was referring to:

1697226497758.png

 
  • Like
Reactions: Mr-Fox and devilkkw

me3

Member
Dec 31, 2016
316
708
The special symbols work differently in SD compared to MJ. You need to try to forget some things that you learned from using MJ.
Brackets in SD are the equivalent of a weight, not a multiplier. ( ) means 1.0 + 0.1 = 1.1 in weight and attention. Square brackets only work in some cases, and then they do the opposite of the normal brackets in that they reduce the weight by 0.1.
I don't think me3 meant to use brackets on everything. If you did it too much, yes, then it could become pointless. What he was getting at, though, was using words in the prompt that would contradict each other and cancel each other out. I didn't see much of that, just some more structure and simplification needed.
I don't have the prompt for easy c/p atm, but i believe part of it was "<something>, upper body shot closeup:1.5, <something more>".
So not only is this slightly confusing to a reader: is the intention to apply the weight to just "closeup" or to everything between the commas?
The same "confusion" can apply to the parser too. I haven't been digging around in it to see how it works in detail, but i have had a fair bit of dealings with other parsers and they are all "fragile" at best, so as a general rule you don't want to push the limits. Simply wrapping everything that's intended to be weighted is a much simpler option for everyone involved.

It wouldn't surprise me if :1.5 outside of parentheses doesn't actually apply any weight, given that the behaviour of the "editor" is to apply both () and :1.1 in the same action. But the parser in a1111 is different from comfy's, and i'm sure other "frontends" can have their own parser, so despite all of them being SD, some behaviour might not be completely consistent.
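A toy check along those lines, just to show why the wrapped form is easier on both readers and parsers. This is not the a1111 or comfy parser; the two regexes and both function names are made up for illustration, and the example prompt only borrows the "upper body shot closeup:1.5" fragment, with the rest invented.

```python
import re
from typing import List, Tuple

# Explicitly wrapped weight, e.g. "(rain:1.2)" (no nesting handled in this toy).
WRAPPED = re.compile(r"\(([^():]+):\s*([0-9]*\.?[0-9]+)\s*\)")
# A whole comma-separated chunk that ends in ":<number>" without any parentheses.
BARE = re.compile(r"[^():]+:\s*[0-9]*\.?[0-9]+")


def wrapped_weights(prompt: str) -> List[Tuple[str, float]]:
    """Return (text, weight) pairs whose scope is unambiguous because they are wrapped."""
    return [(m.group(1).strip(), float(m.group(2))) for m in WRAPPED.finditer(prompt)]


def ambiguous_chunks(prompt: str) -> List[str]:
    """Return chunks like 'a b c:1.5' where it is unclear what the number applies to."""
    chunks = [chunk.strip() for chunk in prompt.split(",")]
    return [chunk for chunk in chunks if BARE.fullmatch(chunk)]


example = "city street at night, (rain:1.2), upper body shot closeup:1.5"
print(wrapped_weights(example))
print(ambiguous_chunks(example))
```

Running something like `ambiguous_chunks` over a prompt before generating is a cheap way to spot weights that were probably meant to be wrapped.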
 
  • Like
Reactions: Mr-Fox and devilkkw

Fuchsschweif

Well-Known Member
Sep 24, 2019
1,144
1,954
But it's actually a multiplier of 1.1, not an addition of 1, which would result in n:1, n:2, n:3 etc.
That's why I wrote "importance-ratio", to underline that it's not an absolute ratio :p

The point was that putting everything into parentheses is pointless, as everything gets the same multiplier and therefore the same importance.
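To put numbers on the two readings that came up in this discussion (multiply by 1.1 per pair of parentheses versus add 0.1 per pair), here is a small sketch. The function names are hypothetical and this is plain arithmetic, not code from any SD frontend; which reading a given frontend actually uses is for its documentation to settle.

```python
from typing import Dict


def multiplicative_weight(depth: int, factor: float = 1.1) -> float:
    """Weight if every pair of parentheses multiplies attention by `factor`."""
    return factor ** depth


def additive_weight(depth: int, step: float = 0.1) -> float:
    """Weight if every pair of parentheses adds `step` to a base of 1.0."""
    return 1.0 + step * depth


def relative_weights(depths: Dict[str, int]) -> Dict[str, float]:
    """Multiplicative weights rescaled so the least-emphasised tag is 1.0."""
    weights = {tag: multiplicative_weight(depth) for tag, depth in depths.items()}
    smallest = min(weights.values())
    return {tag: round(weight / smallest, 3) for tag, weight in weights.items()}


# (apples), oranges, ((carrots)) expressed as nesting depth per tag.
depths = {"apples": 1, "oranges": 0, "carrots": 2}
for tag, depth in depths.items():
    print(tag, round(multiplicative_weight(depth), 3), round(additive_weight(depth), 3))

# Wrapping every tag once leaves the relative weights unchanged.
print(relative_weights({"apples": 1, "oranges": 1, "carrots": 1}))
```

The two readings only drift apart noticeably at deeper nesting, which is one more reason to write an explicit number instead of stacking parentheses.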
 

Jimwalrus

Well-Known Member
Sep 15, 2021
1,045
3,994
Ah, OK. Wasn't clear - if anything it did seem like you were implying a strength ratio of 1:2:3
But yes, if you weight everything equally it will basically treat everything the same. Unless it feels like fucking you around that day...
Another rabbit hole to go into is prompt order - there's a huge debate as to whether the prompts should be in strict order of importance, or whether it's just the first 75 tokens are equal, then the next 75 and so on.
 
  • Like
Reactions: Mr-Fox

Mr-Fox

Well-Known Member
Jan 24, 2020
1,401
3,802
I don't have the prompt for easy c/p atm, but i believe part of it was "<something>, upper body shot closeup:1.5, <something more>".
So not only is this slightly confusing to a reader: is the intention to apply the weight to just "closeup" or to everything between the commas?
The same "confusion" can apply to the parser too. I haven't been digging around in it to see how it works in detail, but i have had a fair bit of dealings with other parsers and they are all "fragile" at best, so as a general rule you don't want to push the limits. Simply wrapping everything that's intended to be weighted is a much simpler option for everyone involved.

It wouldn't surprise me if :1.5 outside of parentheses doesn't actually apply any weight, given that the behaviour of the "editor" is to apply both () and :1.1 in the same action. But the parser in a1111 is different from comfy's, and i'm sure other "frontends" can have their own parser, so despite all of them being SD, some behaviour might not be completely consistent.
(y)
 

Fuchsschweif

Well-Known Member
Sep 24, 2019
1,144
1,954
Another rabbit hole to go into is prompt order - there's a huge debate as to whether the prompts should be in strict order of importance, or whether it's just the first 75 tokens are equal, then the next 75 and so on.
Shouldn't that be clarified in the official documentation?
 

Mr-Fox

Well-Known Member
Jan 24, 2020
1,401
3,802
Ah, OK. Wasn't clear - if anything it did seem like you were implying a strength ratio of 1:2:3
But yes, if you weight everything equally it will basically treat everything the same. Unless it feels like fucking you around that day...
Another rabbit hole to go into is prompt order - there's a huge debate as to whether the prompts should be in strict order of importance, or whether it's just the first 75 tokens are equal, then the next 75 and so on.
Yes, I have also read that the order has an importance-ranking effect. If anyone other than me feels the need to read up on prompting, I often come back to this guide:


BTW, fudge I was wrong about ( ) [ ] syntax. Impossible.. :LOL:
 
  • Like
Reactions: Jimwalrus