[Stable Diffusion] Prompt Sharing and Learning Thread

Sepheyer

Well-Known Member
Dec 21, 2020
1,531
3,618
Yes, you're spot on. I have also noticed this. To get something lifelike or grounded in reality, use phrases like "photography" and specify what type it is, for example glamour photo, artistic photography, professional photo, etc., along with time of day and light conditions.
Also use camera specs and descriptive terms from photography, including terms describing composition. If you use descriptive terms from rendering or video game engines, you will get visuals that lean towards 3D, CGI or renders. The same goes for animation, cartoons and so on; if that is what you want, use the appropriate terms.
Let's do an empirical test. The left one has "ultrarealistic", a bunch of camera terms and DoF in it; the right one doesn't. Same seed, prompts otherwise identical. I'd say they look like they are from the same batch - it is entirely conceivable. I just haven't run the right image's prompt a statistically significant number of times, but I know very well that it falls within what I would get with the left prompt. So, that's a field test.

a_12867_.png a_13138_.png
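Roughly, this kind of same-seed A/B comparison boils down to the following minimal diffusers sketch; the model ID, prompts and seed here are placeholders, not the exact ones used for the images above.

```python
import torch
from diffusers import StableDiffusionPipeline

# Placeholder checkpoint; any SD 1.5-style model works for the comparison.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

base = "portrait of a woman on a beach at sunset, glamour photography"
camera_terms = ", ultrarealistic, 200mm lens, shallow depth of field, bokeh"

seed = 12345  # fixed seed so only the prompt differs between the two runs
for label, prompt in [("with_camera_terms", base + camera_terms),
                      ("without_camera_terms", base)]:
    generator = torch.Generator("cuda").manual_seed(seed)
    image = pipe(prompt, generator=generator, num_inference_steps=30).images[0]
    image.save(f"ab_test_{label}.png")
```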

 

Mr-Fox

Well-Known Member
Jan 24, 2020
1,401
3,794
I agree with you. How the model has been trained dictates the rest. The prompt is still the most powerful tool we have, but it can't do what the model has not been trained to do. me3's and my speculations and generalizations are still valid, but it depends on the model being used, how it has been trained and thus how it responds to the prompt. Using terms like "ultra realistic" might not give you photo quality; depending on the model it might mean a realistic render instead.
Conclusion: no. 1 the checkpoint model, no. 2 the prompt in relation to the model, no. 3 extensions in relation to the model and prompt. Don't throw everything including the kitchen sink at the prompt; it might be that you need to switch model first. Once you have the appropriate model, see how it responds to simple descriptive phrases and proceed accordingly. If the appropriate prompt with the appropriate model doesn't finish the job, add extensions, LoRAs or TIs. This AI text-to-image thing is still an experimental working prototype and will probably remain experimental for a long time. Since there are so many different people and teams working on it and its tools and extensions, it's a miracle that it works as well as it does. There will be bugs and "teething issues" because of it. We just have to hold on to the railing and try to roll with it.
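As a rough sketch of that "model first, prompt second" idea: probe each candidate checkpoint with the same simple prompt and seed before adding anything else. The checkpoint paths and prompt below are hypothetical.

```python
import torch
from diffusers import StableDiffusionPipeline

# Hypothetical local checkpoints; swap in whatever models you are comparing.
checkpoints = ["./models/photoreal_model.safetensors",
               "./models/render_style_model.safetensors"]

prompt = "professional photo of a woman in a park, natural light"
seed = 42

for ckpt in checkpoints:
    # from_single_file loads a single .safetensors/.ckpt file (recent diffusers).
    pipe = StableDiffusionPipeline.from_single_file(ckpt, torch_dtype=torch.float16).to("cuda")
    generator = torch.Generator("cuda").manual_seed(seed)
    image = pipe(prompt, generator=generator).images[0]
    image.save(f"probe_{ckpt.split('/')[-1]}.png")
```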
 
  • Like
Reactions: Sepheyer

Sepheyer

Well-Known Member
Dec 21, 2020
1,531
3,618
Having said all that about this doesn't matter and that doesn't matter, I keep including "ultrarealistic" and "200mm lens" in my prompts as a lucky charm :)

Probably, even when SD doesn't understand it, it acts as a particular string with a unique look, an ID that, say, only your images will have. Kinda the same as a signature ID of "MrFoxAwesomeAIArtiste" would give you. Anyways, back to the 200mm lens:

a_13179_.png
 

me3

Member
Dec 31, 2016
316
708
Models are meant to understand lens "stuff", as that is the kind of information images often come with in their metadata, so it should be an easy thing to extract and include in training. Whether the model understands it correctly is a totally different question, though.

Has anyone found or figured out any way to create a known "shape/construct", but have it be made up of a different "material" than it normally would be? I.e. like a human, but made of an element, like the Human Torch, Iceman or Emma Frost's diamond form.
Or even more fantasy-based things like elemental atronachs or animals.
 
  • Like
Reactions: devilkkw and Mr-Fox

me3

Member
Dec 31, 2016
316
708
I tried to include a better version of the grid, but after cutting it into sections and compressing it down to a low-quality JPG I still couldn't upload it, so I figured the "low quality generated" grid couldn't be all that much worse.

Based upon this "test", I'd say the "word" seems to have much the same effect as a lot of other random words have in many cases: it changes the generation simply because there's another word in there.
The second thing I noticed is that either my SD 1.5 is really screwed up or it's struggling badly...
(If anyone wants any of the specific images uploaded let me know, but with the prompt it should be an easy thing to generate.)
xyz_grid-0003-2856367958.jpg

You don't have permission to view the spoiler content. Log in or register now.
 

Mr-Fox

Well-Known Member
Jan 24, 2020
1,401
3,794
There are checkpoints, LoRAs, TIs etc. that do this: checkpoints for furries, statues of various materials, fantasy stuff for fire and ice, and so on. The AI doesn't "understand" anything. There is no intelligence here; the term "AI" is overused. This is more akin to machine learning and algorithms. The training consists of images with a prompt or description for each image, and the engine "learns" to associate each image with its corresponding prompt or description. If a keyword or phrase is used in every prompt because there is a consistent theme, this word or phrase becomes a trigger. For example, my LoRA has the unintended trigger word "headband". The reason is that in the Daz3D renders I used for the training, the subject has a headband in almost all images. All prompts also have the subject's name "kendra", so this is also a "trigger word", of course. I think that most fantasy-based checkpoint models can do what you are talking about to an extent. Whether the result is good or not is another question.
If you are after something specific and a particular look, I think you would need to do some training of your own. If that's the case, we will help any way we can. I would start by watching the basic videos by Sebastian Kamph and Aitrepreneur and then go to the LoRA training guide I have linked to many times and read up, even if you are going to create a TI, because it has so much good info on the prep work, not only the training.
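One way to catch an unintended trigger word like that before training is to count how often each tag appears across the caption files. A minimal sketch, assuming the common one-caption-.txt-per-image layout, comma-separated tags, and a hypothetical folder path:

```python
from collections import Counter
from pathlib import Path

# Hypothetical dataset layout: one caption .txt file next to each training image.
caption_dir = Path("./training_data/kendra")

counts = Counter()
n_files = 0
for txt in caption_dir.glob("*.txt"):
    n_files += 1
    # Assumes booru-style comma-separated tags in each caption file.
    tags = {t.strip().lower() for t in txt.read_text().split(",")}
    counts.update(tags)

# Tags present in nearly every caption will behave like trigger words.
for tag, c in counts.most_common(20):
    print(f"{tag!r}: {c}/{n_files} captions")
```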
 

devilkkw

Member
Mar 17, 2021
308
1,053
Did you mean something like this?
You don't have permission to view the spoiler content. Log in or register now.

If yes, this is the prompt I used:
You don't have permission to view the spoiler content. Log in or register now.

No negative prompt.

#mat# = the material you want
(check the image in PNG info)
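As a rough illustration of the #mat# placeholder idea (the template below is made up, not the actual prompt from the spoiler):

```python
# Hypothetical template; replace with the real prompt from the spoiler above.
template = "full body portrait of a woman made of #mat#, fantasy concept art"
materials = ["fire", "ice", "diamond", "smoke", "molten lava"]

prompts = [template.replace("#mat#", mat) for mat in materials]
for p in prompts:
    print(p)
```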

Why no negative prompt?
Sometimes the negative prompt washes out part of the concept, especially on high-fantasy concept images.
A good way (it's also how I test checkpoints) is to start with a simple prompt and check how good the checkpoint you are using is. Just change the CFG and see how the image changes. Once you've found a good CFG that gives you good results without wasting your prompt, start building on it and see which terms push your concept off track.
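A CFG sweep like that is easy to script; a minimal diffusers sketch with a placeholder model and prompt:

```python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

prompt = "a woman made of fire, fantasy concept art"  # placeholder prompt
seed = 7

# Same seed at every CFG value, so only the guidance strength changes.
for cfg in [4.5, 6, 8, 11, 15, 20]:
    generator = torch.Generator("cuda").manual_seed(seed)
    image = pipe(prompt, guidance_scale=cfg, generator=generator).images[0]
    image.save(f"cfg_{cfg}.png")
```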

For example, I read a post about "realism": if you use terms like "photorealistic" or "ultra realistic" you can't reach realism, because those terms are associated with rendering engines. Changing them to "photo" or "photography" gives better results for realism.
This is what I've understood from many tries; feel free to correct me. Also, I'm speaking about using only the prompt, without any negative, TI or LoRA.
In the same way, the negative prompt acts on the image, and some terms push your concept out.
So I think a better way is to start with a really simple prompt and add terms step by step.

It also helps you understand whether the checkpoint you are using is good or not for what you want, through simple tests like this.
I know there are far too many checkpoints (I'm talking about civitai), and downloading and trying them all is a lot of work.
I have a simple way to decide which model to try: check the sample images and their CFG.
What do I mean? Simply: if a sample image has no generation data, I don't download the model.
And if I see good sample images but the CFG is low, I don't try the model either. This is because most models that give good samples at CFG 4.5-6 give bad results at CFG 11-20 (chromatic aberration in most cases).
This is my method, based on my experience; maybe I'm wrong about it.
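Checking whether a sample image has embedded generation data can also be done locally. A1111-style PNGs keep the parameters in a text chunk; a minimal sketch assuming that convention and a hypothetical file path:

```python
from PIL import Image

# Hypothetical path to a downloaded sample image.
img = Image.open("sample_from_civitai.png")

# A1111-style images keep prompt/CFG/seed in the "parameters" text chunk.
params = img.info.get("parameters")
if params is None:
    print("No generation data embedded - skip this model per the rule above.")
else:
    print(params)  # contains the prompt, negative prompt, CFG scale, seed, etc.
```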
 

me3

Member
Dec 31, 2016
316
708
In theory yes, at least in how we interpret that prompt. Unfortunately the AI doesn't really agree.
It's very much hit and miss: only a few "materials" work, and even then there are quite a few misses; for the rest it mainly adds the material to the background/scenery or uses it as some kind of covering/clothing.

Since I'd already tested this but was lacking a large grid for reference etc., I thought I'd make one.
So I picked some "materials" and about 15 models, and added an age to the prompt to avoid the pitfall of not including certain negatives.
Unfortunately I won't post any of them, and it's not just because the +300 MB grid image is too large: despite the age, this seemed to make almost all models generate very young faces, with the exception of wood/branches, where it got much older (probably because it has a wrinkly look). So that was 5 hours and >1000 images to instantly delete...
I'm running it again now with an "older" prompt and negatives, hoping something can be shared. I've already had to restart multiple times as some seeds seem to do weird shit.
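For reference, a grid run like that is basically two nested loops over models and materials with a fixed seed; a rough sketch with placeholder checkpoints, materials and prompt:

```python
import torch
from diffusers import StableDiffusionPipeline

models = ["./models/model_a.safetensors", "./models/model_b.safetensors"]  # ~15 in the real test
materials = ["fire", "ice", "diamond", "wood", "smoke"]
seed = 1234  # same seed for every cell of the grid

for model_path in models:
    pipe = StableDiffusionPipeline.from_single_file(model_path, torch_dtype=torch.float16).to("cuda")
    for mat in materials:
        prompt = f"portrait of an older woman made of {mat}, fantasy"
        generator = torch.Generator("cuda").manual_seed(seed)
        image = pipe(prompt, generator=generator).images[0]
        image.save(f"grid_{model_path.split('/')[-1]}_{mat}.png")
```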
 

me3

Member
Dec 31, 2016
316
708
Stealing ppl's sht:

View attachment 2785561

Credit:

Oh, so there is this model I never heard of: . Interesting.
The model looks interesting, and Juggernaut sounds worth trying too if it really is a base model. There are more than enough merges ripping off others, so it can be nice to have something that might behave differently.
 

me3

Member
Dec 31, 2016
316
708
Grid done... finally. I can't really find anywhere to host the 350 MB full version, and I've had to split the JPG one.
I think it should be possible to tell what clearly hasn't worked and what has at least partially done so. There does seem to be a pattern of which "materials" have better success than others, and which models clearly don't.

The radiant vibes one has me rather concerned. I guess it illustrates what I mentioned in the previous post. All other models seem pretty consistent in following the age in the prompt, so wtf happened there I have no idea; I guess the only upside is that it's faces and fully clothed. Still, I don't like that it happened.

I've included one image "in full" for the prompt and because "it didn't turn out all that bad". Let me know if there are any others I should upload.
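Splitting an oversized grid for upload only takes a few lines with PIL; a minimal sketch assuming a horizontal split into three bands and a placeholder filename:

```python
from PIL import Image

Image.MAX_IMAGE_PIXELS = None  # allow very large grid images
grid = Image.open("xyz_grid.jpg")  # hypothetical path to the full grid

parts = 3
w, h = grid.size
step = h // parts
for i in range(parts):
    # Crop horizontal bands; the last band takes any leftover rows.
    bottom = h if i == parts - 1 else (i + 1) * step
    band = grid.crop((0, i * step, w, bottom))
    band.save(f"split ({i + 1}).jpg", quality=85)
```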

split (1).jpg
split (2).jpg
split (3).jpg


00276-99743560.png
 

Sepheyer

Well-Known Member
Dec 21, 2020
1,531
3,618
Indeed. What caught my attention and made me look at Juggernaut is the girl's angle in the image above. Most of the models I love struggle with that very angle. So, even if the model is so-so in every other aspect, at least it works as a dedicated tool for rear shots.

Also, I went through Elgance -> Deliberate -> Clarity -> Zovya's Photoreal because each had an incremental improvement in some aspects that resonated with me. Say I like chonky milfs, so I saw incremental improvements. But your mileage might vary if you are into something else or if your ideas of CMs are different.

I will be trying out Juggernaut for the next few weeks, keeping fingers crossed it comes through alright. Here I'm testing Juggernaut with the OpenPose ControlNet; the results are pretty pleasing, except the woman ain't blonde, her hair ain't short, the pants are not jean shorts, and the top doesn't have underboob.

a_13356_.png
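For anyone wiring up OpenPose ControlNet outside the webui, the same idea in diffusers looks roughly like this; the base model, reference image and prompt are placeholders, and in practice you'd swap in the Juggernaut checkpoint.

```python
import torch
from controlnet_aux import OpenposeDetector
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
from diffusers.utils import load_image

# Extract a pose skeleton from a reference photo (hypothetical file).
openpose = OpenposeDetector.from_pretrained("lllyasviel/ControlNet")
pose_image = openpose(load_image("pose_reference.png"))

controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-openpose", torch_dtype=torch.float16
)
# Placeholder base model; swap in your Juggernaut checkpoint here.
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
).to("cuda")

prompt = "photo of a blonde woman with short hair, jean shorts, crop top, walking away"
image = pipe(prompt, image=pose_image, num_inference_steps=30).images[0]
image.save("juggernaut_openpose_test.png")
```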
 

Mr-Fox

Well-Known Member
Jan 24, 2020
1,401
3,794
Yeah, what's up with that vibrant checkpoint?.. Creepy to think about why it might give that result. You mentioned seeds. I have been curious whether there is any rhyme or reason to which seeds we use - if, let's say, a higher number has any relevance to the outcome, or if it's all just random. I have read people say this or that about it: lower is better for cartoon or anime, higher is better for photorealism, etc. I have no idea if it's just people imagining things or if there is something to it.
Awesome work on this huge test.:)(y)
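As far as I know, the seed only initializes the random number generator that produces the initial latent noise, so the number itself shouldn't carry any meaning. A quick sketch to see that a "low" and a "high" seed give statistically indistinguishable starting noise:

```python
import torch

shape = (1, 4, 64, 64)  # latent shape for a 512x512 SD 1.x image

for seed in (7, 2_147_483_646):  # a "low" and a "high" seed
    noise = torch.randn(shape, generator=torch.Generator().manual_seed(seed))
    print(f"seed {seed}: mean={noise.mean().item():.4f}, std={noise.std().item():.4f}")

# Both print roughly mean 0 and std 1; only the specific noise pattern differs,
# so any link between seed magnitude and style would be coincidence.
```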
 

me3

Member
Dec 31, 2016
316
708
Is this the sort of thing you're after? I tried to keep the grid at a fairly decent size for viewing pleasure...
xyz_grid.jpg

Images for prompt and ...

Title: Stalking successful
01049-3309730544.png

Title: Stalking failed, abort abort abort.....RUN!
00997-3345252563.png
 

Mr-Fox

Well-Known Member
Jan 24, 2020
1,401
3,794
I have been getting a private 1-on-1 lesson with instructor Kendra..
00023-2715484978.png

I finally got into ControlNet and OpenPose after procrastinating for a long time. I just thought it looked busy or a bit involved, so I was sticking to what I knew and focused on other aspects of SD. In the pursuit of generating widescreen images, I learned that ControlNet and its complementary extensions were probably the answer. I first learned the method of "outpainting": first generating a normal upright portrait-ratio image and then, with SD upscale and the "resize and fill" option selected, "outpainting" the rest with ControlNet inpaint. This did the trick but was hit and miss. It was difficult to get it to blend well with the original; you always get a seam between the two. I learned from Sebastian Kamph to then do a normal img2img generation. This blends the two together, and then you can upscale it. During my research I came across a different method, however, that removes the need for any "outpainting". Instead you use the Latent Couple extension in txt2img. With it you can assign a part of the prompt to a specific region of the image.
If you want a normal 16:9 ratio widescreen image, this division and these settings (see example) have been working the best for me.

Latent couple.png
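Since the screenshot may not show up for everyone, a three-region setup in the Latent Couple extension looks roughly like this; these exact values are an illustrative guess, not necessarily the settings pictured above.

```text
Divisions: 1:1,1:2,1:2      # region 1 = whole canvas, regions 2 and 3 = two halves
Positions: 0:0,0:0,0:1      # where each region sits within the division grid
Weights:   0.2,0.8,0.8      # how strongly each sub-prompt applies to its region
end at this step: 20        # match this to your sampling steps
```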

You separate the prompt with "AND" for each region. I write all the light and image quality tags for the first region, the subject tags for the second, and the background and/or scenery for the third.
Here's what a prompt can look like:

Widescreen prompt.png
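For anyone who can't see the screenshot, an AND-separated prompt following that recipe would look roughly like this (the tags are made up for illustration):

```text
(masterpiece, best quality), golden hour, soft lighting, sharp focus
AND 1girl, <lora:kendra:0.8> kendra, smiling, sundress, standing
AND beach, ocean waves, palm trees, sunset sky, wide shot
```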

If you are going to use a LoRA like me, you also need the "Composable LoRA" extension.
You can also assign the negative prompt to each region in the same way by separating with "AND", though it's not always necessary. Use the same value for "end at this step" as your sampling steps.
You can move the subject within the image by changing the position value for the 2nd region, 0:0.7 for example.
This shifts it off center in the image. Then press "visualize" to apply the new setting.

Latent couple2.png

Set the resolution of the entire image in txt2img, for example 960x540, write your prompt and separate the regions with "AND", and do the same with the negative prompt if needed.
Select your sampler, steps, CFG etc. like normal, set up the Latent Couple settings and Composable LoRA, then generate.
To take it even further, you can also use OpenPose to control the pose of the subject, and to bump up the quality you can either use hires fix with the primary generation or the SD upscale script in img2img.

Source tutorial:
 

FallingDown90

Member
Aug 24, 2018
115
38
Hi, sorry, I'm not a native English speaker and it's hard to find the answer if you've already written it.
What is the way to generate specific anime characters? (I use Stable Diffusion with LoRA)