[Stable Diffusion] Prompt Sharing and Learning Thread

Sepheyer

Well-Known Member
Dec 21, 2020
1,571
3,768
So, after the first few trials of the IP Adapter I am realizing the thing ain't quite what it was described as in the videos. Namely, the videos handled abstract scenarios, while applying the same tool to the real-life problem of, say, generating a full-height character is not there yet. So, imma be trying a bit more, aight. The goal is really to transfer Lara Croft's face onto a full-body person. In more scientific terms, I am hoping to inject the LC face latent into the model stream, just like the OpenPose ControlNet injects its pose into the model stream. Naturally, the next few days are gonna be: tests, tests, tests. Meanwhile, if anyone is into this, here is the source image and the resulting images (they are CUI workflows as well):
a_01400_.png
Like, this is not horrible, this is pretty good. For imaginative uses, like character aging over time, this is fantastic: the facial feature shapes are identical from picture to picture. I would completely believe that this is one and the same person at different phases of her life.

It is just - likely due to me just starting the tests and being a moron on top of it - completely useless for my purposes.

Thinking aloud: I probably need to try the other workflows along the img2img pipeline. Yea, I'll head there once I snatch my motivation from the jaws of procrastination.
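For anyone who wants to poke at the same idea outside of CUI, here is a rough diffusers sketch of the concept - not my actual workflow, just the "face reference + pose injected into the same model stream" idea. The model repos, weight names and file names are placeholders and may need adjusting:

[CODE=python]
# Rough sketch (not the CUI workflow itself): pair an IP-Adapter face reference
# with an OpenPose ControlNet so the face identity and the full-body pose are
# both injected into the same denoising run. Repos / weight names are the usual
# public ones and may differ from what you actually have installed.
import torch
from diffusers import StableDiffusionControlNetPipeline, ControlNetModel
from diffusers.utils import load_image

controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/control_v11p_sd15_openpose", torch_dtype=torch.float16)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet,
    torch_dtype=torch.float16).to("cuda")

# IP-Adapter supplies the face/identity conditioning, analogous to feeding
# the face latent into the model stream alongside the pose.
pipe.load_ip_adapter("h94/IP-Adapter", subfolder="models",
                     weight_name="ip-adapter_sd15.bin")
pipe.set_ip_adapter_scale(0.6)  # how strongly the reference face is imposed

face_ref = load_image("lara_face.png")          # cropped face reference (placeholder file)
pose_ref = load_image("fullbody_openpose.png")  # pre-extracted OpenPose skeleton (placeholder file)

image = pipe(
    "full body photo of a woman on an asphalt road, trees, summer",
    image=pose_ref,              # ControlNet pose conditioning
    ip_adapter_image=face_ref,   # face reference for the IP-Adapter
    num_inference_steps=30,
).images[0]
image.save("full_height_test.png")
[/CODE]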
 

Sepheyer

Well-Known Member
Dec 21, 2020
1,571
3,768
Tho... Great tool for bringing your DAZ/HS2/VAM waifus to life:

Source: Honey Select 2 render
Sampler: Euler Ancestral
Woman, Teenager
Sampler: Euler Ancestral
Woman, Milf
HS2_2023-09-06-05-48-24-333.png a_01417_.png a_01419_.png
Sampler: Heun
Woman, Supermodel
Sampler: Heun
Woman
Sampler: Heun
Woman, MILF
a_01453_.png a_01433_.png a_01441_.png
Sampler: Heun
Various embellishments
Sampler: Heun
Different mixing proportions
Sampler: Heun
Different mixing proportions
a_01458_.png a_01509_.png a_01505_.png
Sampler: Heun
Different mixing proportions
Sampler: Heun
Different mixing proportions
Sampler: Heun
Different mixing proportions
a_01504_.png a_01503_.png a_01500_.png
 

me3

Member
Dec 31, 2016
316
708
So, after the first few trials of the IP Adapter I am realizing the thing ain't quite what it was described as in the videos. Namely, the videos handled abstract scenarios, while applying the same tool to the real-life problem of, say, generating a full-height character is not there yet. So, imma be trying a bit more, aight. The goal is really to transfer Lara Croft's face onto a full-body person. In more scientific terms, I am hoping to inject the LC face latent into the model stream, just like the OpenPose ControlNet injects its pose into the model stream. Naturally, the next few days are gonna be: tests, tests, tests. Meanwhile, if anyone is into this, here is the source image and the resulting images (they are CUI workflows as well):
View attachment 3012988
This reminded me of something which happened during all my attempts at training that.
When captioning the images, in this case with blip2, all of the images got captioned the way you'd expect - woman, clothing/description, setting... except one of them. All it gave as a caption was "lara croft sexy png" (no, it wasn't an actual png file). When I included an instruction to describe the "look" of the person in the picture, all I got was "sexy"...
 

Mr-Fox

Well-Known Member
Jan 24, 2020
1,401
3,802
Is it quality-wise the same output just done more efficiently, or is the SD script a bit inferior compared to hires fix?
I think the difference is somewhat small, but hires fix gives clearly superior results. It is, however, very time inefficient. It is still very much worth it, though, since the higher the resolution you get out of txt2img, the more detail and sharpness you will have in the end. The advantage it has is that it is part of the first generative process and thus has more impact on the image. Later upscaling doesn't have nearly as large an effect, and you quickly run into diminishing returns. Unless you have a specific resolution target, I think hires fix is enough most of the time. You can tease out the rest of the details with Photoshop. Since you are looking for clarity, I highly recommend that you get very familiar with the Camera Raw filter in Photoshop; it even has a specific slider named Clarity. I have done a small guide, or rather an overview, of Photoshop filters before, but I might do a more detailed guide on the Camera Raw filter and Smart Sharpen.
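To make the distinction concrete, here is a rough two-pass sketch in diffusers of what hires fix is doing conceptually - generate small, upscale, then refine with a low-denoise generative pass. This is not A1111's actual implementation, and the checkpoint name and sizes are just placeholders:

[CODE=python]
# Minimal sketch of the idea behind hires fix: the extra resolution is produced
# by a second generative pass, not by a plain resize.
import torch
from diffusers import StableDiffusionPipeline, StableDiffusionImg2ImgPipeline

model = "runwayml/stable-diffusion-v1-5"  # placeholder checkpoint
txt2img = StableDiffusionPipeline.from_pretrained(model, torch_dtype=torch.float16).to("cuda")
img2img = StableDiffusionImg2ImgPipeline.from_pretrained(model, torch_dtype=torch.float16).to("cuda")

prompt = "photo of a woman, detailed skin, natural light"
base = txt2img(prompt, width=512, height=768, num_inference_steps=45).images[0]

# A plain resize adds pixels but no new detail...
upscaled = base.resize((1024, 1536))

# ...the "hires steps" are what actually refine it. Low strength keeps the composition.
final = img2img(prompt, image=upscaled, strength=0.33, num_inference_steps=60).images[0]
final.save("hires_fixed.png")
[/CODE]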

My small overview of a few filters and whatnot in Photoshop
https://f95zone.to/threads/stable-diffusion-prompt-sharing-and-learning-thread.146036/post-10174688

Small overview of neural filters in Photoshop.
https://f95zone.to/threads/stable-diffusion-prompt-sharing-and-learning-thread.146036/post-10175271
 

Mr-Fox

Well-Known Member
Jan 24, 2020
1,401
3,802
Apropos the discussion around Upscalers, I've run some comparisons, using 3 Upscalers I've used before and one that has always just given terrible results*.
There are 20 images in total - 4 just using each Upscaler in HiRes fix (45 sampling steps, 60 Hires steps, denoising 0.33), then I ran each of those four images through 3x Upscaling in Extras for each of the four Upscalers.

N.B. These images are intended to be as 'photo-realistic' as possible; other "hyper-realism" or cartoon styles may give very different results.
I had ADetailer set for both face and hands, there's also kkw-skindet in the embeddings too.

*ESRGAN_4x, 4x-UltraSharp, 4xNMKDSuperscale & R-ESRGAN 4x+. The latter is terrible for photo-realism.

Looking at them, my personal preference seems to be either ESRGAN_4x for HiRes fix then 4xNMKDSuperscale for upscaling in Extras or 4xNMKDSuperscale for both.

The finished images are too large to upload directly, so they're in a RAR.

The initial Hires fixed images are below:
View attachment 3012589 View attachment 3012590 View attachment 3012591 View attachment 3012592

N.B. For some reason, 4xNMKDSuperscale shows in my A1111 as "4xNMKDSuperscale_4xNMKDSuperscale". I've no idea why, but I've kept it in the filenames for completeness.
You should do a comparison with only the denoising strength setting as the variable. I have found that it has a very big effect on the result. If you set it wrong you could get a very poor result and think that it's the upscaler's fault when in actuality you had the wrong denoising strength. The better comparison would have been either with hires fix or with the SD Upscale script. The normal upscaling in either the img2img or Extras tab doesn't add more detail or sharpness since it's not part of the generative process; it only enlarges the image and lacks the hires steps that both hires fix and the SD Upscale script have.
 

Jimwalrus

Well-Known Member
Sep 15, 2021
1,045
3,994
You should do a comparison with only the denoising strength setting as the variable. I have found that it has a very big effect on the result. If you set it wrong you could get a very poor result and think that it's the upscaler's fault when in actuality you had the wrong denoising strength. The better comparison would have been either with hires fix or with the SD Upscale script. The normal upscaling in either the img2img or Extras tab doesn't add more detail or sharpness since it's not part of the generative process; it only enlarges the image and lacks the hires steps that both hires fix and the SD Upscale script have.
I did want to add the denoising strength as an option but it would've been excessively time-consuming. I think I might do it tomorrow, but without the additional step of Extras upscaling.
It's very likely true that different denoising strengths suit different upscalers better. 0.33 is possibly very high for some.
As I'll only be doing the three decent ones from the last test (not going to waste my time and leccy on R-ESRGAN-4x+ any more), does anyone have a request for a fourth upscaler to take its place?
The plan will be each upscaler at 0.15, 0.2, 0.25, 0.3 & 0.4, other conditions as before to allow comparison and to allow the previous 0.33 to stand in the lineup.

The reason I included the Extras upscaling was to address exactly your point - I think I've proved us correct!
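If anyone wants to automate a sweep like this instead of clicking through the UI, here is a rough sketch against a local A1111 instance started with --api. The prompt, seed and upscaler names are placeholders and need to match whatever your install actually lists:

[CODE=python]
# Hedged sketch of the planned sweep (upscaler x denoising strength) via the
# /sdapi/v1/txt2img endpoint of a local A1111 launched with --api.
import base64, io, requests
from PIL import Image

URL = "http://127.0.0.1:7860/sdapi/v1/txt2img"
UPSCALERS = ["ESRGAN_4x", "4x-UltraSharp", "4xNMKDSuperscale"]  # must match the UI names
DENOISE = [0.15, 0.20, 0.25, 0.30, 0.40]

def hires_fix(upscaler, denoise):
    payload = {
        "prompt": "photo of a topless woman, detailed skin",  # placeholder prompt
        "seed": 12345,                     # fixed seed so only the variables change
        "steps": 45,
        "enable_hr": True,
        "hr_upscaler": upscaler,
        "hr_second_pass_steps": 60,
        "hr_scale": 2,
        "denoising_strength": denoise,
    }
    r = requests.post(URL, json=payload, timeout=600)
    r.raise_for_status()
    png = base64.b64decode(r.json()["images"][0])
    return Image.open(io.BytesIO(png))

# Assemble a simple X/Y grid: rows = upscalers, columns = denoising strengths.
cells = [[hires_fix(u, d) for d in DENOISE] for u in UPSCALERS]
w, h = cells[0][0].size
grid = Image.new("RGB", (w * len(DENOISE), h * len(UPSCALERS)))
for y, row in enumerate(cells):
    for x, img in enumerate(row):
        grid.paste(img, (x * w, y * h))
grid.save("upscaler_denoise_grid.png")
[/CODE]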
 
  • Red Heart
  • Like
Reactions: Mr-Fox and Sepheyer

Mr-Fox

Well-Known Member
Jan 24, 2020
1,401
3,802
I did want to add the denoising strength as an option but it would've been excessively time-consuming. I think I might do it tomorrow, but without the additional step of Extras upscaling.
It's very likely true that different denoising strengths suit different upscalers better. 0.33 is possibly very high for some.
As I'll only be doing the three decent ones from the last test (not going to waste my time and leccy on R-ESRGAN-4x+ any more), does anyone have a request for a fourth upscaler to take its place?
The plan will be each upscaler at 0.15, 0.2, 0.25, 0.3 & 0.4, other conditions as before to allow comparison and to allow the previous 0.33 to stand in the lineup.

The reason I included the Extras upscaling was to address exactly your point - I think I've proved us correct!
I use NMKD-Face very often. Even though I have talked about it many times here, I don't see many others use it. Since it's common that people want detailed faces, I think this is a very good candidate. The problem with NMKD, though, is that there are many versions of the same upscaler.

NMKD Upscalers.png
 

Jimwalrus

Well-Known Member
Sep 15, 2021
1,045
3,994
I use NMKD-Face very often. Even though I have talked about it many times here, I don't see many others use it. Since it's common that people want detailed faces, I think this is a very good candidate. The problem with NMKD, though, is that there are many versions of the same upscaler.

View attachment 3013616
So it would be SP_178000_G?
Also, one of the reasons I like to test on topless photos (apart from the obvious eye candy) is that the nipples and areolae are a distinct area of skin that's different to elsewhere but without a dedicated function to improve it.

Unless anyone knows of an ADetailer equivalent for nipples? :)
 
  • Like
Reactions: Mr-Fox and Sepheyer

Mr-Fox

Well-Known Member
Jan 24, 2020
1,401
3,802
So it would be SP_178000_G?
Also, one of the reasons I like to test on topless photos (apart from the obvious eye candy) is that the nipples and areolae are a distinct area of skin that's different to elsewhere but without a dedicated function to improve it.

Unless anyone knows of an ADetailer equivalent for nipples? :)
No, 8x_NMKDFaceExtended_100000_G - the other one just happened to be selected.

Yes, there are other models for After Detailer.
After Detailer Models.png

You can download the extras from my gofile. Unzip in Stable-Diffusion\stable-diffusion-webui\models.


Besides ADetailer there are plenty of LoRAs and embeddings.


Here's a selection:

1697564414496.png


1697564467555.png


1697564527234.png


1697564582059.png


1697564633199.png


1697564691032.png
 

Mr-Fox

Well-Known Member
Jan 24, 2020
1,401
3,802
Tho... Great tool for bringing your DAZ/HS2/VAM waifus to life:

Source: Honey Select 2 render
Sampler: Euler Ancestral
Woman, Teenager
Sampler: Euler Ancestral
Woman, Milf
View attachment 3013023 View attachment 3013021 View attachment 3013024
Sampler: Heun
Woman, Supermodel
Sampler: Heun
Woman
Sampler: Heun
Woman, MILF
View attachment 3013343 View attachment 3013122 View attachment 3013195
Sampler: Heun
Various embellishments
Sampler: Heun
Different mixing proportions
Sampler: Heun
Different mixing proportions
View attachment 3013381 View attachment 3013939 View attachment 3013904
Sampler: Heun
Different mixing proportions
Sampler: Heun
Different mixing proportions
Sampler: Heun
Different mixing proportions
View attachment 3013878 View attachment 3013866 View attachment 3013849
Holy crap!

lotion-kermit.gif
 

Sepheyer

Well-Known Member
Dec 21, 2020
1,571
3,768
A few more tests of converting a rendered character into a photo (everything is a CUI prompt).

A couple of notes:
- Some images have very little in terms of textual prompt, others are more detailed. The best results are those where the prompts are more detailed at describing the scene - even for the strict mixes. So you definitely want to describe the girl, her clothes, and the location. Not too long, rather: "(nature, asphalt road, trees, summer:1.2)".
- I just realized the strict mix workflow in some instances was missing the pokies - hence sometimes there are no nipple indentations on clothes.
- 70% of each set has the same seed, and the faces are notably different although the features are retained. The reason the seed is not the same for all of them is that some images were discarded because they were outright crap and replaced with renders based on another seed.

 

Jimwalrus

Well-Known Member
Sep 15, 2021
1,045
3,994
OK, test completed.
It's looking to me like, for photorealistic images of young ladies, the best is either 4xNMKDSuperScale at about 0.25 Denoising or, as the fantastic Mr-Fox recommended, 8x_NMKDFacesExtended_100000_G* at 0.25 or 0.3

You may feel differently; the gains are subtle.

Here's the X/Y plot in full resolution:



*Just trips off the tongue, don't it? :rolleyes:
 

me3

Member
Dec 31, 2016
316
708
This seems to have the same "issues" as other "convert to real" methods. It converts the pose, the background is fairly alike, and the character is wearing relatively the same clothing, hair etc. However, the face isn't really all that close.
The face your rendered character has isn't in any way an "unrealistic anime" shape or features, yet when you look at the "real" version it hasn't even kept the basic shape of the face. The face is more rounded and "shorter", the chin is different, eyes, lips; some of this could be prompting related, sure, but AI is meant to be good at reading faces (scanners/cameras etc), yet for things like this it doesn't seem to keep even the proportions "correct", which is exactly what is used for comparing faces.
 
  • Like
Reactions: rogue_69

hkennereth

Member
Mar 3, 2019
237
775
This seems to have the same "issues" as other "convert to real" methods. It converts the pose, the background is fairly alike, and the character is wearing relatively the same clothing, hair etc. However, the face isn't really all that close.
The face your rendered character has isn't in any way an "unrealistic anime" shape or features, yet when you look at the "real" version it hasn't even kept the basic shape of the face. The face is more rounded and "shorter", the chin is different, eyes, lips; some of this could be prompting related, sure, but AI is meant to be good at reading faces (scanners/cameras etc), yet for things like this it doesn't seem to keep even the proportions "correct", which is exactly what is used for comparing faces.
That's a matter of prompting for realism vs. anime, choosing a model that is better at realism vs. one focused on anime, and the specifics of how one sets the render parameters, meaning those were all choices made to get results that looked more like photographs instead of 3D anime characters with better render techniques. You can't convert to "real" using an anime face and get the exact same proportions, because anime characters don't have realistic proportions by design. If you want, you may make different choices, but there isn't a limitation in the technology stopping you from getting results more like the source images.
 

Sepheyer

Well-Known Member
Dec 21, 2020
1,571
3,768
This seems to have the same "issues" as other "convert to real" methods. It converts the pose, the background is fairly alike, and the character is wearing relatively the same clothing, hair etc. However, the face isn't really all that close.
The face your rendered character has isn't in any way an "unrealistic anime" shape or features, yet when you look at the "real" version it hasn't even kept the basic shape of the face. The face is more rounded and "shorter", the chin is different, eyes, lips; some of this could be prompting related, sure, but AI is meant to be good at reading faces (scanners/cameras etc), yet for things like this it doesn't seem to keep even the proportions "correct", which is exactly what is used for comparing faces.
I am not disagreeing with what you wrote, I am maybe somewhat protesting against the idea that this tech didn't capture likeness. In short, I find it did a really good job with some originals, not so much with others. And this one comparison is the very best mapping I have seen so far. If only I could keep the output consistent, that would be great:

comparison.png

In my mind this is an A+, nailed down: the tech successfully mapped a cartoon character to what I thought she would look like IRL.

The original girl is Nina Williams from Tekken(?):

Untitled.png
A mere haircut adjustment and a softer expression, multiplied by a different renderer (Honey Select 2 vs the original), were enough to make her a completely different person. Now, in Honey Select 2 the model does look like NW, but the likeness comes and goes depending on the light, angle and a bunch of pre/post settings. Overall it does a really good job of capturing the likeness, but one has to keep in mind that even in real life a mere haircut change can completely change a person. So, asking a 3D renderer to do better than real life can is beyond my scope.

And finally, we have a ton of models and a ton of settings, which gives us a quadrillion permutations. By settings I mean not only the weight/noise permutation on each sampler and upscaler but also where in the entire pipeline we inject the latent. As I like to say, this is 50! (fifty-factorial) freedom of choice.

And to illustrate that the render can look more like the original cartoon, here is one such permutation involving a "late" latent injection:

a_01462_.png

And here, merely a different model:

a_01481_.png

Now, these two images are actually closer to the original HS2 render, and they would move us towards a more direct mapping between the original's look and the post-IPAdapter look. But that wasn't my goal, and instead we have the bunch of images that we have. So, I would not at all say that this tech doesn't let us capture the likeness of the very original. It does - it is a matter of the settings / model.
 
  • I just jizzed my pants
  • Like
Reactions: Mr-Fox and mams3425

me3

Member
Dec 31, 2016
316
708
You both miss my point, quite a bit too for some things it seems, but I don't think it's worth wasting people's time with it or derailing this thread with something most people probably (and seemingly) don't notice or care about :)
 
  • Like
Reactions: Mr-Fox

rogue_69

Newbie
Nov 9, 2021
87
298
I am not disagreeing with what you wrote, I am maybe somewhat protesting against the idea that this tech didn't capture likeness. In short, I find it did a really good job with some originals, not so much with others. And this one comparison is the very best mapping I have seen so far. If only I could keep the output consistent, that would be great:
Only you can decide what matters and doesn't matter. If you want more dynamic looking images, you have to sacrifice a little consistency. If you're going this route, I'd suggest switching to Daz. Using Canvas Rendering, you can isolate the clothing, background, and even hair from the actual character. Then you can focus on getting your characters looking the way you want, then overlaying the clothing. You can do this with hair too, but once you start prompting facial expressions, the face changes shape, and the hair won't fit as well without post-work.
 
  • Like
Reactions: Mr-Fox

Sepheyer

Well-Known Member
Dec 21, 2020
1,571
3,768
You both miss my point, quite a bit too for some things it seems, but I don't think it's worth wasting people's time with it or derailing this thread with something most people probably (and seemingly) don't notice or care about :)
I re-read your post #2506 and I finally got your point! It is genius. I even printed out your post, showed it to my parents, they were in awe; we decided to put your post up on our fridge.
:cool:
 

Sepheyer

Well-Known Member
Dec 21, 2020
1,571
3,768
Only you can decide what matters and doesn't matter. If you want more dynamic looking images, you have to sacrifice a little consistency. If you're going this route, I'd suggest switching to Daz. Using Canvas Rendering, you can isolate the clothing, background, and even hair from the actual character. Then you can focus on getting your characters looking the way you want, then overlaying the clothing. You can do this with hair too, but once you start prompting facial expressions, the face changes shape, and the hair won't fit as well without post-work.
I do have a workflow in mind; it involves the ComfyUI node "Segment Anything" and can create dynamic masks. I do have high hopes for it, as the LoRAs were a bit of a disappointment. But I am struggling to get the S/A node to work. So it will be some time before I test the entire thing - so far these IPAdapter tests were to familiarize myself with the tool. These are merely component tests, and the full aggregate workflow tests are yet ahead.
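For reference, here is roughly what that node is doing under the hood, using Meta's segment-anything package directly - a sketch only, with a placeholder checkpoint file, input image and click coordinates:

[CODE=python]
# Hedged sketch of building a dynamic mask with SAM; the ComfyUI node wraps
# essentially this same predict step.
import numpy as np
from PIL import Image
from segment_anything import sam_model_registry, SamPredictor

sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")  # placeholder checkpoint path
predictor = SamPredictor(sam)

image = np.array(Image.open("hs2_render.png").convert("RGB"))  # placeholder input image
predictor.set_image(image)

# One positive click roughly on the face; SAM returns candidate masks for it.
masks, scores, _ = predictor.predict(
    point_coords=np.array([[256, 180]]),  # placeholder click position
    point_labels=np.array([1]),           # 1 = foreground click
    multimask_output=True,
)
best = masks[int(np.argmax(scores))]

# Save as a black/white mask that can be fed into an inpaint or latent-composite step.
Image.fromarray((best * 255).astype(np.uint8)).save("face_mask.png")
[/CODE]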
 
  • Like
Reactions: Mr-Fox

hkennereth

Member
Mar 3, 2019
237
775
You both miss my point, quite a bit too for some things it seems, but I don't think it's worth wasting people's time with it or derailing this thread with something most people probably (and seemingly) don't notice or care about :)
We did get your point. What I pointed out was that you are treating as a technological limitation or problem with the process what in reality are just creative choices made by Sepheyer, and they were very clear on what those choices were. Stable Diffusion doesn't have a problem making those images look more similar to the Honey Select source images; it's actually perfectly capable of achieving that, but they didn't aim to make those images as similar as you would like.

You can't just point out what you don't like about their images and complain that it doesn't fit what you would like to see happen. If there is something you'd like to accomplish you should give it a try yourself, and if you're having issues reaching specific goals we can help you get there. But pointing out "problems" on other people's images because they don't fit your goals is pretty pointless, because they are not trying to achieve what you want to achieve.
 
  • Like
Reactions: Mr-Fox and Sepheyer