[Stable Diffusion] Prompt Sharing and Learning Thread

Nano999

Member
Jun 4, 2022
154
69
Hey!
I haven't touched SD for 3+ months or so, did something happen to the hires fix?
It feels like it renders images 10 times slower than before xD

Using 8x_NMKD-Superscale_150000_G

Upscale by 2 from 512 px

Maybe something on my system is eating up resources?
I checked with Security Task Manager but found nothing.

Time taken: 17 min. 34.7 sec. for 2 images oO
50 + 20 steps
 
Last edited:

daddyCzapo

Member
Mar 26, 2019
241
1,492
Hey!
I haven't touched SD for 3+ months or so, did something happen to the hires fix?
It feels like it renders images 10 times slower than before xD

Using 8x_NMKD-Superscale_150000_G

Upscale by 2 from 512 px

Maybe something on my system is eating up resources?
I checked with Security Task Manager but found nothing.

Time taken: 17 min. 34.7 sec. for 2 images oO
50 + 20 steps
What are your specs? For me, on an RTX 2060 in A1111, it takes about 6 minutes to generate at 512x768, 30 steps, with a 2.5x hires pass at 60 steps using 8x_NMKD-Superscale_150000_G. It was said on A1111's GitHub that some NVIDIA drivers were causing longer generation times.
Edit: Oh, and if you updated the webui, you need to select the cross attention optimization in the Settings tab. IIRC --xformers alone no longer enables it.
 
Last edited:

Nano999

Member
Jun 4, 2022
154
69
What are your specs? For me, on an RTX 2060 in A1111, it takes about 6 minutes to generate at 512x768, 30 steps, with a 2.5x hires pass at 60 steps using 8x_NMKD-Superscale_150000_G. It was said on A1111's GitHub that some NVIDIA drivers were causing longer generation times.
Edit: Oh, and if you updated the webui, you need to select the cross attention optimization in the Settings tab. IIRC --xformers alone no longer enables it.
Yes, if an update via the console counts?

1694946954116.png

1694946868836.png
Well, 3 months ago the identical render would take 4-5 minutes max per image (50+20 steps, 2x upscale).
The speed for the first 50 steps is fine, but when it comes to the 20 hires steps at 2x with 8x_NMKD-Superscale_150000_G, it's super slow, turtle pace...

CPU - AMD Ryzen 7 3700X AM4 BOX 8-Core Processor (16 CPUs), ~3.6 GHz
GPU - PALIT NVIDIA GeForce RTX 2060 SUPER GP (8 Gb)
RAM - Kingston DDR4 32Gb (2x16Gb) 3200 MHz pc-25600 (HX432C18FWK2/32) HyperX FURY White
RAM - SAMSUNG DDR4 32Gb (2x16Gb) 3200 MHz pc-25600 (M378A2G43MX3-CWE)
MOBO - MSI X470 Gaming Plus (MS-7B79) (AM4, ATX)


1694946831949.png

What should I choose here?
 

daddyCzapo

Member
Mar 26, 2019
241
1,492
Yes, if an update via the console counts?

View attachment 2936572

View attachment 2936570
Well, 3 months ago the identical render would take 4-5 minutes max per image (50+20 steps, 2x upscale).
The speed for the first 50 steps is fine, but when it comes to the 20 hires steps at 2x with 8x_NMKD-Superscale_150000_G, it's super slow, turtle pace...

CPU - AMD Ryzen 7 3700X AM4 BOX 8-Core Processor (16 CPUs), ~3.6 GHz
GPU - PALIT NVIDIA GeForce RTX 2060 SUPER GP (8 Gb)
RAM - Kingston DDR4 32Gb (2x16Gb) 3200 MHz pc-25600 (HX432C18FWK2/32) HyperX FURY White
RAM - SAMSUNG DDR4 32Gb (2x16Gb) 3200 MHz pc-25600 (M378A2G43MX3-CWE)
MOBO - MSI X470 Gaming Plus (MS-7B79) (AM4, ATX)


View attachment 2936567

What should I choose here?
You can either choose Doggettx or install xformers; I personally prefer xformers. The A1111 wiki isn't consistent on this, so I'm not 100% sure, but if you want to install xformers you need to edit your webui-user.bat and add --xformers, as shown in this screenshot: 1694950546529.png
It should install xformers automatically, and then you can select xformers in the cross attention optimization menu :)
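For reference, a minimal webui-user.bat with the flag added would look roughly like this. This is just a sketch of the stock file with --xformers appended; keep whatever other arguments and paths you already use:

@echo off
set PYTHON=
set GIT=
set VENV_DIR=
rem --xformers makes the webui install and enable the xformers package on the next launch
set COMMANDLINE_ARGS=--xformers
call webui.bat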
And as far as I can see, you have an earlier version of torch. I have torch 2.0.1 ( 1694950891902.png ). But that topic I'll leave to someone more capable than me, I'm just a noob.
 
Last edited:

me3

Member
Dec 31, 2016
316
708
"Something" relating to the last update killed xformers for me as well and so far i've not been able to work out why. I had it working just fine when first updating, but after i relaunched the UI xformers suddenly wasn't found any more. I've upgrade xformers, reinstalled it, downgraded it, completely "reinstalled" venv, and still it says "no module xformers" on launch, xformers isn't an option in the optimizers, regardless of --xformers or not. An option that technically no longer is need/works.
I have xformers working just fine with kohya_ss so it's something limited to just a1111
 
  • Like
Reactions: Mr-Fox

devilkkw

Member
Mar 17, 2021
305
1,040
"Something" relating to the last update killed xformers for me as well and so far i've not been able to work out why. I had it working just fine when first updating, but after i relaunched the UI xformers suddenly wasn't found any more. I've upgrade xformers, reinstalled it, downgraded it, completely "reinstalled" venv, and still it says "no module xformers" on launch, xformers isn't an option in the optimizers, regardless of --xformers or not. An option that technically no longer is need/works.
I have xformers working just fine with kohya_ss so it's something limited to just a1111
Have you try to choosing it in "SETTINGS" tab, under "Optimization" ?
 

me3

Member
Dec 31, 2016
316
708
Just to potentially up the "learning" part of this thread again.
As I've been doing a lot of repeated training lately, I thought I'd share some minor things that might help others do things a bit faster.
Since I have a slow, old card with just 6GB of VRAM, reducing memory needs has always been one of the priorities, which means any sort of "speed" goes out the window.

When I started this endlessly long repeated training... sigh... it was running at 6.8 to 8.4 s/it. Yes, that's the right way around: seconds per iteration,
which means a 10k-step run would easily take >20 hours. That was without spilling into shared memory, purely running at 95-98% VRAM. The training I'm currently running, still on the same dataset, is at 1.6 to 1.8 s/it. It's still "the slow way", but the difference is very noticeable. Now, if all those with 30xx and 40xx cards are done laughing, this might help you as well; obviously it's a bit hard for me to test, unless I'm somehow gifted a massive new computer or win some kind of lottery.

While these things mainly reduce memory needs, they can help those with much more VRAM too, since they might let you fit more batches, which would speed things up greatly.

First off, buckets. Many guides complain about them being horrible without giving any reason why, or with some poor excuse suggesting they don't even know how bucketing works. The point here is that buckets let you reduce the pixel count of most/all images, because you can crop a lot of pointless background out of tall/wide images instead of padding everything into squares. 100 images at 512x512 or 768x768 squares are a lot more to work through than the same images with half cut off because it's "empty space". Just remember to set the bucket options accordingly so you don't get extra cropping or weird bucketing.

Second, the optimizer. If you're on lower specs you're probably already familiar with AdamW8bit, but there's a more "optimized" version called PagedAdamW8bit.
For me, with AdamW8bit I had to run training with both "gradient checkpointing" and "memory efficient attention", and training ran at a constant 5.8GB. When generating sample images it would then offload to make room, which caused a noticeable lag; not major, but it was there.
With the Paged version I can drop "memory efficient attention", training runs at 4.4-4.6GB, and it only spikes up for sample images without offloading or "reorganizing" any memory to do it. So everything stays in memory, which speeds up sampling too (not that sampling is a feature you need). But because memory efficient attention is no longer needed and there's still VRAM to spare, everything gets faster.
(I haven't tested whether keeping memory efficient attention would allow +1 batch and whether that would be faster overall; I doubt it in my case though.)

Third, a small thing: cache latents. It's generally checked by default in the kohya_ss GUI, but for some reason I've seen guides/training configs turn it off. It might be because it keeps all the latents in memory and people haven't really thought about what that means. As a simple explanation: you keep an encoded "version" of each image in memory instead of reading and re-encoding it every time, where "every time" means every repeat and every epoch. Unlike most of the other "keep data in memory" options, this is fairly small, mostly less than 100KB per image, so unless you're going way overboard with the number of images, this should not be the reason you OOM, and it does speed things up. Your mileage will obviously vary with your system. Don't "cache to disk" though.
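To put the three tweaks in one place, here is roughly how they map onto kohya_ss / sd-scripts command line options. This is only a sketch from memory, so treat the flag names as something to verify against your own sd-scripts version, and the paths, resolutions and dim/alpha values as placeholders rather than my actual settings:

rem bucketing, PagedAdamW8bit + gradient checkpointing, and cached latents in one launch line
rem (a 512x512 image encodes to a 4x64x64 latent, roughly 32KB at fp16, hence the tiny memory cost)
accelerate launch train_network.py ^
 --pretrained_model_name_or_path="D:\models\base_model.safetensors" ^
 --train_data_dir="D:\training\img" --output_dir="D:\training\out" ^
 --resolution=512,512 --enable_bucket --min_bucket_reso=256 --max_bucket_reso=1024 --bucket_no_upscale ^
 --optimizer_type="PagedAdamW8bit" --gradient_checkpointing ^
 --cache_latents ^
 --network_module=networks.lora --network_dim=32 --network_alpha=16 ^
 --train_batch_size=1 --max_train_steps=10000 --mixed_precision="fp16" --save_precision="fp16"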

As a final note, I know these things worked very well for me on my low-spec, old machine. In theory they should work for others as well, but the impact will obviously depend on what's running them.
If you're already running at full speed they're unlikely to do much, but as I mentioned earlier, if they mean you can increase your batch count by 1 or more, that would make things faster even on those systems.

And if anyone bothered reading all this, Hi...
 

ririmudev

Member
Dec 15, 2018
304
303
Obligatory "Noob encounters eldritch horrors" story:
(btw, I poked in a while ago, and didn't expect to come back, but something about a FreeCities x Stable Diffusion integration piqued my interest, so I installed SD, first on my laptop, and then might install later on my main rig if I don't have too many nightmares).

I installed it, seeing someone say that setup was fairly easy (well, I had issues with the g++-9 dependency, due to... blah blah, anyway, I got past it).
Somewhere (maybe here), I saw someone had a simple prompt, something like "a female mage casting a spell"
I tried it out a couple times, results weren't too bad. And I have no additional models installed yet.

So, I jumped to 20 sampling steps of "a female goblin eating drumstick" (I think I was recently looking at the Goblin Layer thread).
...
:cry::cry::cry:
...
Eh... I should have known better. Back to good-ol' coding for a little while.

(I can't quite bring myself to post pics, but I'm sure most of the experienced folks here have seen worse)
 
  • Sad
Reactions: Mr-Fox

me3

Member
Dec 31, 2016
316
708
Obligatory "Noob encounters eldritch horrors" story:
(btw, I poked in a while ago, and didn't expect to come back, but something about a FreeCities x Stable Diffusion integration piqued my interest, so I installed SD, first on my laptop, and then might install later on my main rig if I don't have too many nightmares).

I installed it, seeing someone say that setup was fairly easy (well, I had issues with the g++-9 dependency, due to... blah blah, anyway, I got past it).
Somewhere (maybe here), I saw someone had a simple prompt, something like "a female mage casting a spell"
I tried it out a couple times, results weren't too bad. And I have no additional models installed yet.

So, I jumped to 20 sampling steps of "a female goblin eating drumstick" (I think I was recently looking at the Goblin Layer thread).
...
:cry::cry::cry:
...
Eh... I should have known better. Back to good-ol' coding for a little while.

(I can't quite bring myself to post pics, but I'm sure most of the experienced folks here have seen worse)
Sounds like the prompt I used when testing different UIs and SDXL, so here are a couple of pretty clean runs of your prompt. I've had far worse horrors from things you'd expect to be pretty safe. Missing/skipped images are due more to being "boring" than to any horror.
00024-1818585955.png
00020-2183929224.png 00028-3761512306.png 00029-328585663.png 00044-406385261.png

Models:
#1-2:
#3-4:
#5:
 
Last edited:
  • Like
Reactions: Mr-Fox

ririmudev

Member
Dec 15, 2018
304
303
Sounds like the prompt I used when testing different UIs and SDXL, so here are a couple of pretty clean runs of your prompt. I've had far worse horrors from things you'd expect to be pretty safe. Missing/skipped images are due more to being "boring" than to any horror.
View attachment 2945034
View attachment 2945033 View attachment 2945035 View attachment 2945037 View attachment 2945048

Models:
#1-2:
#3-4:
#5:
Those are a pretty fair representation; in my mind I was going for something a little more cutesy and humanoid, but to be fair, I didn't specify that.
Ok fine... here's a couple of images that I got (hope I don't get banned, though the pics are just unpleasant, nothing rule-breaking):
[spoiler: image]
A few others were just as bad, maybe slightly worse, but I'll leave it at this.
One was bad, but pretty abstract, and almost kind of cool (but I'll still put it in a spoiler):
[spoiler: image]
<End transmission>
 
  • Wow
Reactions: Mr-Fox and Dagg0th

me3

Member
Dec 31, 2016
316
708
Just to potentially up the "learning" part of this thread again.
As I've been doing a lot of repeated training lately, I thought I'd share some minor things that might help others do things a bit faster.
Since I have a slow, old card with just 6GB of VRAM, reducing memory needs has always been one of the priorities, which means any sort of "speed" goes out the window.

When I started this endlessly long repeated training... sigh... it was running at 6.8 to 8.4 s/it. Yes, that's the right way around: seconds per iteration,
which means a 10k-step run would easily take >20 hours. That was without spilling into shared memory, purely running at 95-98% VRAM. The training I'm currently running, still on the same dataset, is at 1.6 to 1.8 s/it. It's still "the slow way", but the difference is very noticeable. Now, if all those with 30xx and 40xx cards are done laughing, this might help you as well; obviously it's a bit hard for me to test, unless I'm somehow gifted a massive new computer or win some kind of lottery.

While these things mainly reduce memory needs, they can help those with much more VRAM too, since they might let you fit more batches, which would speed things up greatly.

First off, buckets. Many guides complain about them being horrible without giving any reason why, or with some poor excuse suggesting they don't even know how bucketing works. The point here is that buckets let you reduce the pixel count of most/all images, because you can crop a lot of pointless background out of tall/wide images instead of padding everything into squares. 100 images at 512x512 or 768x768 squares are a lot more to work through than the same images with half cut off because it's "empty space". Just remember to set the bucket options accordingly so you don't get extra cropping or weird bucketing.

Second, the optimizer. If you're on lower specs you're probably already familiar with AdamW8bit, but there's a more "optimized" version called PagedAdamW8bit.
For me, with AdamW8bit I had to run training with both "gradient checkpointing" and "memory efficient attention", and training ran at a constant 5.8GB. When generating sample images it would then offload to make room, which caused a noticeable lag; not major, but it was there.
With the Paged version I can drop "memory efficient attention", training runs at 4.4-4.6GB, and it only spikes up for sample images without offloading or "reorganizing" any memory to do it. So everything stays in memory, which speeds up sampling too (not that sampling is a feature you need). But because memory efficient attention is no longer needed and there's still VRAM to spare, everything gets faster.
(I haven't tested whether keeping memory efficient attention would allow +1 batch and whether that would be faster overall; I doubt it in my case though.)

Third, a small thing: cache latents. It's generally checked by default in the kohya_ss GUI, but for some reason I've seen guides/training configs turn it off. It might be because it keeps all the latents in memory and people haven't really thought about what that means. As a simple explanation: you keep an encoded "version" of each image in memory instead of reading and re-encoding it every time, where "every time" means every repeat and every epoch. Unlike most of the other "keep data in memory" options, this is fairly small, mostly less than 100KB per image, so unless you're going way overboard with the number of images, this should not be the reason you OOM, and it does speed things up. Your mileage will obviously vary with your system. Don't "cache to disk" though.

As a final note, I know these things worked very well for me on my low-spec, old machine. In theory they should work for others as well, but the impact will obviously depend on what's running them.
If you're already running at full speed they're unlikely to do much, but as I mentioned earlier, if they mean you can increase your batch count by 1 or more, that would make things faster even on those systems.

And if anyone bothered reading all this, Hi...
As a follow-up to this: it turns out it is just barely possible for me to disable gradient checkpointing as well, which increases speed further.
Now I've gotten down to 1.14 to 1.16 s/it; the VRAM spike from sampling pushes it up to 1.2 to 1.24 s/it, but sampling can be disabled if needed/wanted.
This will probably depend on model size etc., as it's very borderline, but it cuts off about 1/3 of the time and I'm closing in on seeing it/s.
Any further improvement will probably depend on code or driver changes, and tbh I don't think Nvidia's focus is on improving cards as old as mine :p
 
  • Like
Reactions: Mr-Fox and Sepheyer

me3

Member
Dec 31, 2016
316
708
Follow up #2
Part of testing/science/learning is making mistakes, being wrong, etc. and learning from it.
So it seems I was wrong in my previous post: you don't need a code or driver update to speed things up further :p
I'm currently running at a very stable 1.08 s/it; for a short while it even ran at a speed where the readout kept flipping back and forth between s/it and it/s.
The only thing I changed was the network rank and alpha. I don't know if it's down to both or just one of them, but testing is ongoing.
I know this is probably uninteresting for most people, but I'm basically running training at 1-second iterations on an almost 7.5-year-old 6GB card, and it's currently using just 5.4GB, which includes whatever the OS etc. is still using.
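For anyone driving sd-scripts directly rather than the GUI, those two knobs are (as far as I know) the --network_dim and --network_alpha options; the values below are purely illustrative, not the ones used here:

rem lower rank/alpha means smaller LoRA weights to update, so less VRAM and faster steps
--network_dim=16 --network_alpha=8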
 
  • Like
Reactions: VanMortis

sharlotte

Member
Jan 10, 2019
268
1,436
Been away for a bit, and a couple of days ago started using ComfyUI as much as possible. Still testing a lot of flows out there or creating my own, making sure I understand what the various steps and settings actually do. I find it great so far at creating objects, nature... but really awful at creating faces. I haven't read the thread for a while, so I will be going (slowly) over the past few (dozens of) pages.
Meanwhile, here is some of the stuff I generated (for anyone wondering, I've been playing BG3 lately); as usual, the flow is inside.
ComfyUI_00003_.png ComfyUI_00004_.png ComfyUI_00005_.png ComfyUI_00010_.png ComfyUI_00012_.png ComfyUI_00013_.png ComfyUI_00019_.png ComfyUI_00018_.png
 

me3

Member
Dec 31, 2016
316
708
Been away for a bit, and a couple of days ago started using ComfyUI as much as possible. Still testing a lot of flows out there or creating my own, making sure I understand what the various steps and settings actually do. I find it great so far at creating objects, nature... but really awful at creating faces. I haven't read the thread for a while, so I will be going (slowly) over the past few (dozens of) pages.
Meanwhile, here is some of the stuff I generated (for anyone wondering, I've been playing BG3 lately); as usual, the flow is inside.
View attachment 2947610 View attachment 2947611 View attachment 2947612 View attachment 2947613 View attachment 2947614 View attachment 2947615 View attachment 2947617 View attachment 2947616
I think I've figured out why you're having problems with faces, you've forgotten something: the skin ;)
 

Mr-Fox

Well-Known Member
Jan 24, 2020
1,401
3,793
Follow up #2
Part of testing/science/learning is making mistakes, being wrong, etc. and learning from it.
So it seems I was wrong in my previous post: you don't need a code or driver update to speed things up further :p
I'm currently running at a very stable 1.08 s/it; for a short while it even ran at a speed where the readout kept flipping back and forth between s/it and it/s.
The only thing I changed was the network rank and alpha. I don't know if it's down to both or just one of them, but testing is ongoing.
I know this is probably uninteresting for most people, but I'm basically running training at 1-second iterations on an almost 7.5-year-old 6GB card, and it's currently using just 5.4GB, which includes whatever the OS etc. is still using.
Try using a little token merging in the Optimizations settings. 0.2 is fine for "Token merging ratio", and 0.08-ish is fine for "Negative Guidance minimum sigma" in my experience. You can of course experiment and try higher settings.
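If you'd rather set these outside the UI, they appear to map to entries like the ones below in the webui's config.json. I'm going from memory on the key names, so double-check them against your own file before editing:

"token_merging_ratio": 0.2,
"token_merging_ratio_hr": 0.2,
"s_min_uncond": 0.08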
 
  • Like
Reactions: Sepheyer

me3

Member
Dec 31, 2016
316
708
Try using a little token merging in the Optimizations settings. 0.2 is fine for "Token merging ratio", and 0.08-ish is fine for "Negative Guidance minimum sigma" in my experience. You can of course experiment and try higher settings.
It didn't really have an effect on my generation speed unfortunately; maybe it's more apparent at high resolution, with upscaling, or with ControlNet involved. Also, the effect it had on the images I was generating at the time was a bit "unfortunate".

As a side note, since I can't use xformers at the moment because something is wrong in the code/setup, I'm forced to use SDP, so it might work better with xformers.
 
  • Like
Reactions: Mr-Fox

Mr-Fox

Well-Known Member
Jan 24, 2020
1,401
3,793
It didn't really have an effect on my generation speed unfortunately; maybe it's more apparent at high resolution, with upscaling, or with ControlNet involved. Also, the effect it had on the images I was generating at the time was a bit "unfortunate".

As a side note, since I can't use xformers at the moment because something is wrong in the code/setup, I'm forced to use SDP, so it might work better with xformers.
It has made a big difference for me, not only when using hires fix but for "normal" generations as well. I don't have any data to show right now, but I know it has cut down my generation times significantly.
 

Sharinel

Active Member
Dec 23, 2018
508
2,103
This might be of interest to some people.
TestMerge.jpg

The above pic shows the same prompt/seed combination using 2 different checkpoints.
The left-hand pic uses Dreamshaper 8, while the right-hand one uses EpicRealism.
The one in the middle uses both: it starts off with Dreamshaper, then uses the Refiner option in Automatic1111 to morph into EpicRealism partway through. You can get some interesting outcomes depending on how you do the merging.
1695491277070.png

Prompt is "beautiful female standing next to desk wearing __CC_female_clothing_set_business__****, deep cleavage, photorealistic, wide hips, closeup, textured skin, skin pores, looking down at camera, thicc thighs, gigapixel, 8k, cinematic, fov 60 photo of perfecteyes eyes, perfecteyes eyes, <lora:more_details:1> <lora:GoodHands-beta2:1>

**** This is a wildcard; it came out as "Trousers and boat neck top".
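For anyone who hasn't used wildcards: assuming the usual Dynamic Prompts-style setup, __CC_female_clothing_set_business__ just points at a plain text file named CC_female_clothing_set_business.txt in the extension's wildcards folder, and one line from it is picked at random per generation. The lines below are made-up examples, not the actual file:

trousers and boat neck top
pencil skirt and silk blouse
tailored blazer and slacks
pinstripe suit dress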