[Stable Diffusion] Prompt Sharing and Learning Thread

Synalon

Member
Jan 31, 2022
225
663
Two images I made using Forge. It's pretty much the same as Automatic1111 but works a bit faster, and I guess it's better optimized, as it's far more stable, at least for me.

It's fairly easy to copy your models over from Automatic1111 as well, but if you want to keep both installs you can tell Forge to use the models from Automatic1111 in the settings tab. The same goes for ControlNet models, and this also works for After Detailer.

It doesn't work this way for upscalers as far as I'm aware though, and I'm still learning the other integrated extensions.
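If you'd rather not copy files at all, a symlink does much the same job as the settings-tab option. A rough sketch; the paths are just examples, adjust them to your own installs:

```python
# Share an Automatic1111 checkpoint folder with Forge via a symlink instead of
# copying files. Paths are illustrative; on Windows, creating symlinks needs
# admin rights or Developer Mode enabled.
import os

a1111_models = r"D:\stable-diffusion-webui\models\Stable-diffusion"
forge_models = r"D:\stable-diffusion-webui-forge\models\Stable-diffusion"

if not os.path.islink(forge_models):
    os.rmdir(forge_models)  # succeeds only if Forge's own folder is empty
    os.symlink(a1111_models, forge_models, target_is_directory=True)
```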

wide1.jpg

wide2.jpg
 

me3

Member
Dec 31, 2016
316
708
I've been messing around with Cascade, so I'll share some observations etc. (badly formatted wall-of-text warning)

I've tried both the "comfyui checkpoints" and the unet models, and from what I've seen so far something seems "wrong" with the comfyui ones. This could be an issue in their creation or just in how they're loaded, I don't know. Results are generally "bad": faces are very low quality, proportions are off, and I had one image where the head looked about half the size you'd expect compared to the upper body. There's quite a bit of blurring/out of focus and cropping/out of frame. It kind of reminds me of early SD at very small dimensions.
Memory is also a big issue. While vram usage during generation is low, initial loading is really bad. The two model files are about 13-14gb combined, but it REALLY struggled fitting that into 32gb of RAM.
I normally have no problem caching multiple sd and sdxl models in RAM, but this really did not do well. I'm sure they'll work on some optimizing, but at the moment it's not doing all that well.

The "unet version" seem to work far better, both in memory and generation. At least for me, They have multiple versions and i didn't notice all that much difference in the results they gave, so for testing i'd say you're fairly safe at going with the smaller ones.
Generated images are fairly clean with little/no negative prompts, using :x.y style weighting seems to completely break the image though so keep that in mind if getting bad results.
Another problem is that RNG seems to be tied to the gpu; at least there's a problem recreating other people's images, which makes me think it's gpu-linked. (This might be talked about in the various papers, I haven't read them.) I've tried with quite a few images, using the included workflow, models etc., but the results aren't even close to the same.
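For anyone wanting to poke at this themselves, a small sketch of the usual workaround (the latent shape here is the SD-style one, purely illustrative):

```python
import torch

seed = 123456

# Drawing the initial latent noise on the GPU ties the result to that card's
# RNG and kernels, which is one common reason a shared seed won't reproduce
# someone else's image.
gen_gpu = torch.Generator(device="cuda").manual_seed(seed)
noise_gpu = torch.randn((1, 4, 64, 64), generator=gen_gpu, device="cuda")

# Drawing it on the CPU and moving it over keeps the starting latent identical
# across machines (ComfyUI samples noise on the CPU for this reason); any
# remaining differences come from the sampling math itself.
gen_cpu = torch.Generator(device="cpu").manual_seed(seed)
noise_cpu = torch.randn((1, 4, 64, 64), generator=gen_cpu).to("cuda")
```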
This "version"/method does use even less vram though, how much depends on compression, but as an example, currently running a 1840x3072 "image" and the sampling is using around 2-3gb. The latent -> image decoding obviously has a higher usage, but tiled version can fix that if needed. Prompt exec time was 237 sec, so at that size, 40 steps and decoding, that is fairly good. I doubt i'd be able to run an image at that starting size in sd or sdxl to compare.

As it stands, I'd probably let them finish things and work out the bugs before committing too much to it. It's fairly new and in "early access", and it shows. I'm sure there are issues to be worked out both in the core and in the various implementations before things work as intended.

Seems like some are using it to create "base" images and then refining/fixing them with existing sd/sdxl models and extensions.
That's already being done between sd and xl (in both directions) to take advantage of a model or extension in one to improve/control the result of the other.
 

hkennereth

Member
Mar 3, 2019
237
775
so the new version coming has even more restrictions. sounds like spyware is coming. how sway :rolleyes:

I will wait before taking my pitchfork out until I know what those safeguards are. People are quick to start freaking out, but not a single one of those already raging knows what "safeguard" even means in this context, specifically. The very short announcement post didn't say a single thing about "potentially limiting people's ability to churn out NSFW content, including porn", that's just an assumption.
 

modine2021

Member
May 20, 2021
417
1,389
I will wait before taking my pitchfork out until I know what those safeguards are. People are quick to start freaking out, but not a single one of those already raging knows what "safeguard" even means in this context, specifically. The very short announcement post didn't say a single thing about "potentially limiting people's ability to churn out NSFW content, including porn", that's just an assumption.
what pitchfork? freak out? never. unimportant. i posted an article about the new one. and made a quick reply. and they ARE going to make it difficult to create nsfw. either way, i will continue to use the current should they make good on their word. or use the numerous other models *shrugs*
 

hkennereth

Member
Mar 3, 2019
237
775
what pitchfork? freak out? never. unimportant. i posted an article about the new one. and made a quick reply. and they ARE going to make it difficult to create nsfw. either way, i will continue to use the current should they make good on their word. or use the numerous other models *shrugs*
I didn't say you were freaking out, I was pointing out that people ARE freaking out, for example in the article you posted. And you don't KNOW they will make it difficult to create NSFW, you're just ASSUMING that. That may very well be the case, but at the moment you have ZERO evidence to support this argument. All I'm saying is that, personally, I will wait until I have the facts before jumping to conclusions.

It just so happens that the same things were said when they first announced SD2.0... and when they announced SDXL. The same "doomsday, OMG they are coming for our noods!!" BS was spilled over those models, and it was completely wrong, because it turned out people could just fine-tune those models like always. Will that continue to be the case? I don't know... but neither do you, neither does the author of that article, and neither do all the people yelling on the Reddit thread the article reposted. Everyone read the same announcement post that just says "we'll have safeguards"... not a single one of those people knows what safeguards are there, what they cover, how they work, or whether they can be circumvented by fine-tuning new models from the base SD3.0 models... like it was done for all models before. So... how about we all just wait, like the smart, rational people we all are?
 

JValkonian

Member
Nov 29, 2022
285
256
Just want to say thanks to everyone who wrote up instructions and to the person who pinned the links up top. Last night I installed SD just to fool around and have some fun. This is really cool stuff. Great job putting everything together, thanks again.
 
Last edited:

modine2021

Member
May 20, 2021
417
1,389
My PNGInfo "send to txt2img / img2img" etc. quit working. It clicks but nothing shows up in the prompts etc. No errors show in the console either, so no idea what's causing it. I'm tired of copying and pasting, so I need this working.
 

Pr0GamerJohnny

Conversation Conqueror
Sep 7, 2022
6,582
9,821
I skimmed over the training post linked in the lede at https://f95zone.to/threads/stable-diffusion-prompt-sharing-and-learning-thread.146036/post-10306324.

One thing that's not clear to me is whether it's viable to feed the results of a batch of generated images back into a training program, and have those be the sole source of information. My main thinking is that while random pictures, if done well, are nice, for any application in games there needs to be a way to generate consistent characters - e.g. all generated images in a given set resemble the same girl you started with.

Does that make sense, or has anyone done something like this? Are there tips, like once you have a base face, you train it with 1000 images of that very same face (which have in turn been generated from a prior session)? And if this works, does that mean one needs to repeat the training process every time they have a new character, or is there a way to associate faces you've created with prompts?
 
  • Like
Reactions: Delambo

Sharinel

Active Member
Dec 23, 2018
598
2,511
Does that make sense, or has anyone done something like this? Are there tips, like once you have a base face, you train it with 1000 images of that very same face (which have in turn been generated from a prior session)? And if this works, does that mean one needs to repeat the training process every time they have a new character, or is there a way to associate faces you've created with prompts?
There is a cooperative drive to make a Lora of Wildeer's Lara Croft here, which does something similar - https://f95zone.to/threads/loras-for-wildeers-lara-croft-development-thread.173873/

I also did one which worked well enough for my own needs. I used a pic of a render I quite liked as a base and ran it through Reactor, an extension for Automatic1111 that overlays the base face onto any subsequent pics. So I made 20 pics using that base, and then used those 20 as a training set to make a Lora. Worked quite well.
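For anyone curious what Reactor is doing under the hood, here's a minimal sketch using InsightFace, the library it builds on. The model names and file paths are assumptions, so check your own setup:

```python
# Swap a base face onto a generated image with InsightFace's inswapper model.
# Model names and file paths are assumptions -- adjust to your install.
import cv2
import insightface
from insightface.app import FaceAnalysis

app = FaceAnalysis(name="buffalo_l")          # face detector + embedder
app.prepare(ctx_id=0, det_size=(640, 640))
swapper = insightface.model_zoo.get_model("inswapper_128.onnx")

base = cv2.imread("base_face.png")            # the face you want everywhere
target = cv2.imread("generated.png")          # a fresh SD render
src_face = app.get(base)[0]
dst_face = app.get(target)[0]

result = swapper.get(target, dst_face, src_face, paste_back=True)
cv2.imwrite("swapped.png", result)
```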
 

Mr-Fox

Well-Known Member
Jan 24, 2020
1,401
3,802
My PNGInfo "send to txt2img / img2img" etc. quit working. It clicks but nothing shows up in the prompts etc. No errors show in the console either, so no idea what's causing it. I'm tired of copying and pasting, so I need this working.
I have had this happen too. It usually works again if you reload the image, meaning exit the image and load it again. It can happen when you try to load an image in PNG Info while generating at the same time. If it still doesn't work, reload the UI; next is restarting the bat file, then updating etc., and last is rebooting your pc. I doubt you'll need to go that far though.
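As a stopgap while the tab misbehaves, you can read the generation parameters straight out of the PNG yourself; a tiny sketch with Pillow (the filename is just an example):

```python
# Automatic1111 writes the prompt and settings into a "parameters" text chunk
# of each PNG; Pillow exposes it via the image's info dict.
from PIL import Image

img = Image.open("00001-1234567890.png")  # any A1111-generated PNG
print(img.info.get("parameters", "no parameters chunk found"))
```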
 

Mr-Fox

Well-Known Member
Jan 24, 2020
1,401
3,802
I skimmed over the training post linked in the lede at https://f95zone.to/threads/stable-diffusion-prompt-sharing-and-learning-thread.146036/post-10306324.

One thing that's not clear to me is whether it's viable to feed the results of a batch of generated images back into a training program, and have those be the sole source of information. My main thinking is that while random pictures, if done well, are nice, for any application in games there needs to be a way to generate consistent characters - e.g. all generated images in a given set resemble the same girl you started with.

Does that make sense, or has anyone done something like this? Are there tips, like once you have a base face, you train it with 1000 images of that very same face (which have in turn been generated from a prior session)? And if this works, does that mean one needs to repeat the training process every time they have a new character, or is there a way to associate faces you've created with prompts?
The best approach imo is what Sharinel described. Loras are nice when you need them, but they are becoming more and more obsolete. With Reactor you can also make a face model: it simply compiles the batch of images you feed it and saves it as a safetensors file. Next time you only need to select this model from your list rather than navigate to an image. There is also an alternative to Reactor, FaceSwapLab. With that extension you can also make a face model, and in addition you get a preview of the face model you have created and can exclude or add an image to get a better result or improved likeness.
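Conceptually, "compiling" a face model boils down to blending identity embeddings from several photos of the same person. This is only a sketch of the idea; Reactor's and FaceSwapLab's actual file formats differ, and the image paths are placeholders:

```python
# Average InsightFace identity embeddings from several photos into one
# "face model" vector. Illustrative only -- not the extensions' real format.
import cv2
import numpy as np
from insightface.app import FaceAnalysis

app = FaceAnalysis(name="buffalo_l")
app.prepare(ctx_id=0)

embeddings = []
for path in ["face1.png", "face2.png", "face3.png"]:  # placeholder paths
    faces = app.get(cv2.imread(path))
    if faces:
        embeddings.append(faces[0].normed_embedding)

face_model = np.mean(embeddings, axis=0)  # one blended identity vector
np.save("face_model.npy", face_model)
```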
 
Last edited:

hkennereth

Member
Mar 3, 2019
237
775
The best approach imo is what Sharinel described. Loras are nice when you need them, but they are becoming more and more obsolete.
I would have to disagree with that statement. As someone who basically focuses on creating images of real people, all the available face replacement techniques, like Reactor, are very limited in achieving a good likeness, because they don't affect head shape and not everyone has a similarly shaped head; all they do is change eyes, noses, and mouths. Furthermore, they won't do anything about body shape. You can get images that look a bit like someone with these techniques, but if you want accurate representations of specific people, there are no replacements for LoRAs or Dreambooth models yet.
 
  • Like
Reactions: dildo88 and Mr-Fox

Sepheyer

Well-Known Member
Dec 21, 2020
1,571
3,768
I would have to disagree with that statement. As someone who basically focuses on creating images of real people, all the available face replacement techniques, like Reactor, are very limited in achieving a good likeness, because they don't affect head shape and not everyone has a similarly shaped head; all they do is change eyes, noses, and mouths. Furthermore, they won't do anything about body shape. You can get images that look a bit like someone with these techniques, but if you want accurate representations of specific people, there are no replacements for LoRAs or Dreambooth models yet.
Without getting into the discussion, but rather only focusing on this snippet: "they won't do anything about body shape" - I'd say if I had to nail the body shape, I'd go with ControlNet Depth/Canny or even Tile. Then snap on Reactor or whatnot and there's your [famous person] doing [conspiracy theory].
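For the curious, a minimal diffusers sketch of that idea: lock the body/pose with a depth map, then let the prompt handle the rest. The model IDs are the standard public ones; the prompt and image paths are illustrative:

```python
# Condition SD1.5 generation on a depth map via ControlNet, so the body
# shape/pose follows the map while the prompt controls everything else.
import torch
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
from diffusers.utils import load_image

controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-depth", torch_dtype=torch.float16)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    controlnet=controlnet, torch_dtype=torch.float16).to("cuda")

depth_map = load_image("depth.png")  # precomputed depth map of the target pose
image = pipe("a woman at the beach, photorealistic",
             image=depth_map, num_inference_steps=30).images[0]
image.save("controlled.png")
```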

I developed a distinct dislike of LORAs, for they are fucking witch brews that completely fuck with one's mind because of how uncertain the training outcome is. I mean this in a supportive way, this is not a criticism of anyone, thank God we have these. Rather a statement that, due to me being fucking inept at things, LORAs are a miss for me.

Now, for real, I meant to ask if you or anyone else has tried InstaID?

I am procrastinating as usual cause FML. So, wanted to ask if folks go "yey!" or "fuck no" over it.
 
  • Like
Reactions: Mr-Fox

Mr-Fox

Well-Known Member
Jan 24, 2020
1,401
3,802
Without getting into the discussion, but rather only focusing on this snippet: "they won't do anything about body shape" - I'd say if I had to nail the body shape, I'd go with ControlNet Depth/Canny or even Tile. Then snap on Reactor or whatnot and there's your [famous person] doing [conspiracy theory].

I developed a distinct dislike of LORAs, for they are fucking witch brews that completely fuck with one's mind because of how uncertain the training outcome is. I mean this in a supportive way, this is not a criticism of anyone, thank God we have these. Rather a statement that, due to me being fucking inept at things, LORAs are a miss for me.

Now, for real, I meant to ask if you or anyone else has tried InstaID?

I am procrastinating as usual cause FML. So, wanted to ask if folks go "yey!" or "fuck no" over it.
I messed with it a bit, but since at the time it was SDXL-only it didn't do much for me. My old card can't really handle SDXL. InstaID is just a different flavor of IP-Adapter. The base model has support for sd1.5, but when I tried it the extension didn't. This might already be corrected; I've been living under a rock for a bit, so I've got no clue.
 
  • Like
Reactions: Sepheyer

Mr-Fox

Well-Known Member
Jan 24, 2020
1,401
3,802
I would have to disagree with that statement. As someone who basically focuses on creating images of real people, all the available face replacement techniques, like Reactor, are very limited in achieving a good likeness, because they don't affect head shape and not everyone has a similarly shaped head; all they do is change eyes, noses, and mouths. Furthermore, they won't do anything about body shape. You can get images that look a bit like someone with these techniques, but if you want accurate representations of specific people, there are no replacements for LoRAs or Dreambooth models yet.
Yes, it's true that faceswapping leaves out the body of course, but with different techniques you can get that pretty close. The issue with trained models such as loras is that they are notoriously inconsistent and not at all straightforward to create good ones. It takes a lot of time and dedication to achieve a decent one, and most people don't have the fortitude it takes, so places such as civitai are filled with utter garbage loras. Unless you are completely anal about it, you can get close enough with faceswap and then work on the body separately with controlnet or other methods. This is imo much more consistent and easier to get right than loras. Every person is of course free to choose their own path. This is only my own experience and opinion; though many would agree with me, clearly not everyone. :p
 
  • Like
Reactions: Sepheyer

hkennereth

Member
Mar 3, 2019
237
775
Without getting into the discussion, but rather only focusing on this snippet: "they won't do anything about body shape" - I'd say if I had to nail the body shape, I'd go with ControlNet Depth/Canny or even Tile.
The problem with that technique is that you're only able to generate basically a copy of another picture in a different style. I want to make new images of people in different poses and situations, not just recreate existing images of that person. That's why loras work better for me.

It takes a lot of time and dedication to achieve a decent one, and most people don't have the fortitude it takes, so places such as civitai are filled with utter garbage loras.
I absolutely agree, training good loras is hard and it's very rare to find good ones on Civitai. But that's why I train my own, or have them trained for me by a friend with a more powerful GPU who happens to find the process fun. I currently have about 80 loras of different cosplayers and instagram models whom I make images of as my pastime. Many (if not most) of these girls have very distinct faces and head proportions that would be near impossible to recreate a good likeness of simply by replacing their face in a different picture, and I often make full-body pictures of them; I don't want their face on a generic woman's body, I want their body to be recognizable as well. Loras are still the best way to achieve that, without question; if there was a better way, I would be using it :p
 
  • Like
Reactions: dildo88 and Mr-Fox

Mr-Fox

Well-Known Member
Jan 24, 2020
1,401
3,802
The problem with that technique is that you're only able to generate basically a copy of another picture in a different style. I want to make new images of people in different poses and situations, not just recreate existing images of that person. That's why loras work better for me.



I absolutely agree, training good loras is hard and it's very rare to find good ones on Civitai. But that's why I train my own, or have them trained for me by a friend with a more powerful GPU who happens to find the process fun. I currently have about 80 loras of different cosplayers and instagram models whom I make images of as my pastime. Many (if not most) of these girls have very distinct faces and head proportions that would be near impossible to recreate a good likeness of simply by replacing their face in a different picture, and I often make full-body pictures of them; I don't want their face on a generic woman's body, I want their body to be recognizable as well. Loras are still the best way to achieve that, without question; if there was a better way, I would be using it :p
Don't get me wrong, I love using loras too. They are just very inconsistent. I think it's all about the use case, though. If you are happy with the result you get, that's all that matters. One thing doesn't exclude the other; you can use both, or neither.
It's very difficult to get a perfect likeness with a lora in my experience, no matter how good it is; with a faceswap you get closer. There is another option that hasn't been mentioned: essentially keeping the face from an image and regenerating the rest. You can just mask the face and select "Inpaint not masked". Have you tried using a good lora and then running a faceswap over it? That would use the bone structure from the lora and get the better likeness from the faceswap, essentially taking the best from both. You can use real images with openpose etc. for the different scenarios.
There are always new things to try. I think the fact that SD is not perfect is one of the things that keeps it interesting. If it were easy, there wouldn't be any "sport".