[Stable Diffusion] Prompt Sharing and Learning Thread

Jimwalrus

Well-Known Member
Sep 15, 2021
1,042
3,984
This probably isn't the right thread, but I wanted to get a conversation going with people who actually use A.I.

If you google questions like "Do gamers care if the art is A.I. generated?", you get a majority of responses from artists (or so it seems). I understand them hating the concept of A.I. art. To them, the art is the part of making a game that should get the most effort, so if the creator used A.I. art they are lazy, even if they spent a ton of time on story, programming, etc. Using A.I. art is theft (there is a fair point here, but that will change as things advance). You can't get consistent characters (that is rapidly becoming easier).

What I'm wondering is whether the average adult game consumer really cares how the art was created, or if that is a talking point from artists who don't want to lose gigs in the future. Players differ on what they deem the most important part of a game: for some it is art, for some it is story. I personally feel that, starting next year, most people downloading these types of games won't give a rat's *ss if the art was A.I. generated (or, more likely, if A.I. was used somewhere in the process).
For me, the quality improvements of AI-generated imagery over DAZ/drawn images are the real clincher.
For legit AAA titles it may be a somewhat different matter, but for the cottage industry of porn games it'll be an absolute game changer as soon as someone gets it right.
It's really just another tool like DAZ, Photoshop, etc., and therefore quality is still not guaranteed - there's no shortage of current porn games with poor drawings or lazy 3D renders.
Give it a few years and there may be even more cookie-cutter AI-generated VNs than DAZ ones!
 

Sepheyer

Well-Known Member
Dec 21, 2020
1,570
3,766
For me, the quality improvements of AI-generated imagery over DAZ/drawn images are the real clincher.
For legit AAA titles it may be a somewhat different matter, but for the cottage industry of porn games it'll be an absolute game changer as soon as someone gets it right.
It's really just another tool like DAZ, Photoshop, etc., and therefore quality is still not guaranteed - there's no shortage of current porn games with poor drawings or lazy 3D renders.
Give it a few years and there may be even more cookie-cutter AI-generated VNs than DAZ ones!
Yea, consistent subjects are where it's at. The moment that problem is solved, everything will turn AI overnight.
 

rogue_69

Newbie
Nov 9, 2021
87
298
Yea, consistent subjects are where it's at. The moment that problem is solved, everything will turn AI overnight.
For me, the Daz to Stable Diffusion workflow gives the best consistency. No matter what you do, Daz renders always have that "plastic" look, but with a denoising strength of about 0.35 you can take a really good render and turn it into something great while keeping things consistent. Consistency in clothing is the biggest problem, but if you render the characters nude and then do a separate Canvas render of the clothing, you can overlay the clothes and keep them consistent.
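For reference, here is a minimal sketch of that kind of low-denoise img2img pass using the diffusers library rather than a ComfyUI/A1111 graph; the checkpoint, file names and prompts are placeholders (not anyone's exact setup), and 0.35 mirrors the denoise value mentioned above:
Python:
# Hypothetical img2img sketch: refine a "plastic" Daz render with a low denoising strength.
# Checkpoint, file names and prompts are placeholders, not a specific workflow.
import torch
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",   # any SD 1.5 checkpoint
    torch_dtype=torch.float16,
).to("cuda")

# Resize the Daz render to an SD-friendly resolution (multiples of 8).
init_image = Image.open("daz_render.png").convert("RGB").resize((512, 768))

result = pipe(
    prompt="photo of a young woman, natural skin texture, soft indoor lighting",
    negative_prompt="3d render, cgi, plastic skin",
    image=init_image,
    strength=0.35,              # ~0.35 denoise: keep composition and identity, repaint surfaces
    guidance_scale=7.0,
    num_inference_steps=30,
).images[0]

result.save("daz_refined.png")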
 
  • Like
Reactions: Jimwalrus

hkennereth

Member
Mar 3, 2019
237
775
Yea, consistent subjects are where it's at. The moment that problem is solved, everything will turn AI overnight.
I mean, it's not that hard. You can get reasonably consistent characters with the technique I posted about before (describing a mix of multiple existing people), and if you need them more consistent, you can train a LoRA for that person.
 

Sepheyer

Well-Known Member
Dec 21, 2020
1,570
3,766
For me, the Daz to Stable Diffusion workflow gives the best consistency. No matter what you do, Daz renders always have that "plastic" look, but with a denoising strength of about 0.35 you can take a really good render and turn it into something great while keeping things consistent. Consistency in clothing is the biggest problem, but if you render the characters nude and then do a separate Canvas render of the clothing, you can overlay the clothes and keep them consistent.
I love my i2i workflow too, but the shortcomings are deep and prohibitive to fix. Between side profiles, fingers, nipples and hair, the issues are just too numerous to produce consistent characters. Naturally, I can fix any of those flaws, but the time dropped into it makes the whole thing useless for mass production.
 

Sepheyer

Well-Known Member
Dec 21, 2020
1,570
3,766
I mean, it's not that hard. You can get reasonably consistent characters with the technique I posted about before (describing a mix of multiple existing people), and if you need them more consistent, you can train a LoRA for that person.
We probably understand the word "consistent" somewhat differently. I haven't seen a single consistent solution, although a lot of people describe the very same solutions as providing consistent results. My bar for "consistent" is prolly a tad higher, which renders AI useless for ~90% of asset design.
 

theMickey_

Engaged Member
Mar 19, 2020
2,193
2,824
Quick question (another one, I know): I'm struggling to get some decent "photo realistic" pictures with ComfyUI, and I think I'm probably still missing something in my workflow. When I look at some "models" and "LoRAs" posted on CivitAI, I can see all the details they (apparently) used to create a given picture: positive prompt, negative prompt, cfg and seed values, as well as the sampler/scheduler used.

But when I try to reproduce some of them in ComfyUI, my results are way off from what is posted.

Example: I've been trying to reproduce one of those posted images, but no matter what workflow I try to create (using the same checkpoints and LoRAs as posted in the link), the generated images are blurry (especially when upscaled) and don't look realistic at all.

Are these posted images post-processed through something like Photoshop?
What additional (essential) nodes do I need to add to my workflow to make the results more "crisp" and realistic?

This is what my current (basic) workflow looks like:
workflow.png

There's a couple of bypassed nodes which I was experimenting with, but without success.

This is what I want to achieve (all credit for this image goes to its original creator on CivitAI!):
ba808960-cf03-40d0-8069-c4f30449adcf.jpeg

but this is what I get (using the exact same checkpoint, LoRA, prompts, cfg and seed values as well as the same sampler/scheduler and a square image):
ComfyUI_temp_sjhzv_00165_.png

Any help would be much appreciated!
 
  • Like
Reactions: Sepheyer

hkennereth

Member
Mar 3, 2019
237
775
We probably understand the word "consistent" somewhat differently. I haven't seen a single consistent solution, although a lot of people describe the very same solutions as providing consistent results. My bar for "consistent" is prolly a tad higher, which renders AI useless for ~90% of asset design.
Well, I did say "reasonably" :)

But you're not wrong; the thing is that AI tends to add a lot of variation even when it knows exactly what you are asking for and you are very specific with your prompting. While image quality has been improving quickly in the last year or so, absolute consistency is still a long way off because that's simply not what the tech was designed to accomplish. Even the best LoRA and Dreambooth models I have ever seen will still change face shape a bit, suddenly modify eye colors, etc.

And as rogue_69 said, truly consistent clothing is nearly impossible to achieve with Stable Diffusion; again, that's not what the tech was designed to achieve. We might still be at least a couple of generations away from being able to create a character dressed in specific clothing and have that character rendered in different poses, environments, and sizes. But one can get close "enough" for the average use case, I think.
 

me3

Member
Dec 31, 2016
316
708
This probably isn't the right thread, but I wanted to get a conversation going with people who actually use A.I.

If you google questions like "Do gamers care if the art is A.I. generated?", you get a majority of responses from artists (or so it seems). I understand them hating the concept of A.I. art. To them, the art is the part of making a game that should get the most effort, so if the creator used A.I. art they are lazy, even if they spent a ton of time on story, programming, etc. Using A.I. art is theft (there is a fair point here, but that will change as things advance). You can't get consistent characters (that is rapidly becoming easier).

What I'm wondering is whether the average adult game consumer really cares how the art was created, or if that is a talking point from artists who don't want to lose gigs in the future. Players differ on what they deem the most important part of a game: for some it is art, for some it is story. I personally feel that, starting next year, most people downloading these types of games won't give a rat's *ss if the art was A.I. generated (or, more likely, if A.I. was used somewhere in the process).
It's a tool like many others, and as with those tools there are going to be people making low-quality shit and others making "works of art". So using it for original content shouldn't be any different from using any other "tool".
As for "artists" worried about their paychecks, just look at all the concerns that came with mass production, or "automation". Yes, there's less need for people in massive numbers, but those who are actually good and like doing the job in question have found ways to make quite a lot of money. Consider how much money is paid for "handcrafted" or "custom" jobs in things like woodworking or cars/bikes. If you make something people are actually interested in, someone will be willing to pay for it, regardless of there being a "cheap and mass-produced" option, and you'll probably be able to charge more for it (eventually).
Quality is what matters, far more than the tool, but considering the "quality" of a lot of games, movies, etc. over the past few years, I don't think AI is the biggest concern in that regard... that's a whole other issue though.
 

hkennereth

Member
Mar 3, 2019
237
775
Quick question (another one, I know): I'm struggling to get some decent "photo realistic" pictures with ComfyUI, and I think I'm probably still missing something in my workflow. When I look at some "models" and "LoRAs" posted on CivitAI, I can see all the details they (apparently) used to create a given picture: positive prompt, negative prompt, cfg and seed values, as well as the sampler/scheduler used.

But when I try to reproduce some of them in ComfyUI, my results are way off from what is posted.

Example: I've been trying to reproduce one of those posted images, but no matter what workflow I try to create (using the same checkpoints and LoRAs as posted in the link), the generated images are blurry (especially when upscaled) and don't look realistic at all.

Are these posted images post-processed through something like Photoshop?
What additional (essential) nodes do I need to add to my workflow to make the results more "crisp" and realistic?

This is what my current (basic) workflow looks like:

There's a couple of bypassed nodes which I was experimenting with, but without success.

This is what I want to achieve (all credit for this image goes to its original creator on CivitAI!):

but this is what I get (using the exact same checkpoint, LoRA, prompts, cfg and seed values as well as the same sampler/scheduler and a square image):

Any help would be much appreciated!
Try to remove the negative prompt to begin with. There are a lot of terms there that are absolutely unnecessary and often just cause issues. Negative prompting stuff like "acne" or "overexposure" doesn't work. Especially when using SDXL, do not add anything to the negative unless it's an actual element (like an object or person) that you don't want to see in the image. The idea that you can ask it to "not make it look bad" is a fallacy; training images are not tagged with their defects, so SD has no idea what you're talking about.
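To illustrate the "less is more" idea, here is a rough, generic SDXL sketch using the diffusers library (not the ComfyUI graph discussed here); the model ID, prompts and settings are just placeholders, with the negative prompt limited to concrete elements:
Python:
# Hypothetical example of a lean SDXL prompt: short positive prompt, and a negative
# prompt that only lists concrete things you don't want to see in the image.
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",   # swap in the checkpoint you actually use
    torch_dtype=torch.float16,
).to("cuda")

image = pipe(
    prompt="photo of a woman in a red swimsuit on a beach at golden hour, 35mm film look",
    negative_prompt="watermark, text",   # no "bad quality" / "acne"-style tags
    width=1024,
    height=1024,
    guidance_scale=6.0,
    num_inference_steps=30,
).images[0]

image.save("lean_prompt_test.png")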
 
  • Red Heart
Reactions: theMickey_

theMickey_

Engaged Member
Mar 19, 2020
2,193
2,824
Try to remove the negative prompt to begin with.
Thank you for the reply!

I did read a lot about negative prompts, and although a lot of people tend to add a whole bunch of them, most of them are unnecessary, I agree! With a negative prompt the AI will try to create quite the opposite of what you've prompted, and this might lead to unwanted results in the first place. So my guess is "less is more" when it comes to negative prompting.

So this was the first thing I've tried, but I still wasn't able to achieve what I was looking for unfortunately :cautious:. The pictures I get still look kinda "fake" and blurry... (but to be fair, they do look better than the example I've posted above!)
 

me3

Member
Dec 31, 2016
316
708
Quick question (another one, I know): I'm struggling to get some decent "photo realistic" pictures with ComfyUI, and I think I'm probably still missing something in my workflow. When I look at some "models" and "LoRAs" posted on CivitAI, I can see all the details they (apparently) used to create a given picture: positive prompt, negative prompt, cfg and seed values, as well as the sampler/scheduler used.

But when I try to reproduce some of them in ComfyUI, my results are way off from what is posted.

Example: I've been trying to reproduce one of those posted images, but no matter what workflow I try to create (using the same checkpoints and LoRAs as posted in the link), the generated images are blurry (especially when upscaled) and don't look realistic at all.

Are these posted images post-processed through something like Photoshop?
What additional (essential) nodes do I need to add to my workflow to make the results more "crisp" and realistic?

This is what my current (basic) workflow looks like:

There's a couple of bypassed nodes which I was experimenting with, but without success.

This is what I want to achieve (all credit for this image goes to its original creator on CivitAI!):

but this is what I get (using the exact same checkpoint, LoRA, prompts, cfg and seed values as well as the same sampler/scheduler and a square image):

Any help would be much appreciated!
SDXL has two text encoders, CLIP-L and OpenCLIP-G. While it can work just fine passing the same prompt to both, there are times you can get some "interesting" differences. Just a general note; it might not make any difference in this case.
Not sure how that specific node works, but if "clip scale" is a rewording of (or linked to) "clip skip": in ComfyUI the clip skip values are negative, so if an image comes from something like A1111 with a clip skip of 2, in ComfyUI that would be -2.

Considering there seems to be no prompt data in the image, it's rather hard to recreate it exactly.
Going by your posted workflow image, though: remove the <lora....> bit from the prompt and set the "strength" in your Load LoRA node to 0.75. Your step count is also off (the site says 30); it probably shouldn't matter, but better safe than sorry. Your sampler is wrong as well, and that can make a huge difference.
The posted image is also at a different width/height, probably upscaled/hires fix; if it was generated at that size from the start, it will look different from yours at 1024x1024.
Prompts are also handled slightly differently in ComfyUI compared to A1111, so you can try changing the prompt parser to A1111 style; there's a node for it.
There may be more differences, but that gives you somewhere to start. I doubt you'll get it 100%, considering important details might be missing since the full prompt/generation data isn't included in the image.
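As an aside on the two-encoder point: outside ComfyUI, e.g. with the diffusers library, the two SDXL text encoders can be given different prompts explicitly. A minimal, hypothetical sketch (model ID and prompts are placeholders):
Python:
# Hypothetical sketch: feeding SDXL's two text encoders different prompts in diffusers.
# "prompt" goes to the CLIP-L encoder and "prompt_2" to the OpenCLIP-G encoder;
# if prompt_2 is omitted, the same prompt is used for both.
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
).to("cuda")

image = pipe(
    prompt="photo of a woman on a beach",                               # CLIP-L
    prompt_2="shot on 35mm film, golden hour, shallow depth of field",  # OpenCLIP-G
    negative_prompt="watermark",
    negative_prompt_2="watermark",
    num_inference_steps=30,
).images[0]

image.save("dual_prompt_test.png")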
 
  • Like
Reactions: sharlotte

hkennereth

Member
Mar 3, 2019
237
775
Thank you for the reply!

I did read a lot about negative prompts, and although a lot of people tend to add a whole bunch of them, most of them are unnecessary, I agree! With a negative prompt the AI will try to create quite the opposite of what you've prompted, and this might lead to unwanted results in the first place. So my guess is "less is more" when it comes to negative prompting.

So this was the first thing I've tried, but I still wasn't able to achieve what I was looking for unfortunately :cautious:. The pictures I get still look kinda "fake" and blurry... (but to be fair, they do look better than the example I've posted above!)
The concept of "less is more" is also valid for the positive prompt. Looking at it, I also see a whole bunch of things that don't help at all (for example, lora:TWBabe... is how you load LoRAs in A1111; it doesn't work in Comfy and can therefore be interpreted as anything), are repeated (best quality, ultra highres, which are also pretty useless), are contradictory (Mandalorian armor covers? On a swimsuit picture?), or are not meant to be used with SDXL (1girl is a tag used on SD1.5 models trained for anime pictures).

I also don't know what many of the nodes are meant to do, and a simpler Comfy workflow would probably help make images better by not adding stuff to the process that isn't being used. May I assume this is a workflow you got from someone else? I think it would be beneficial to start with something simpler and add more nodes yourself as you better understand your needs and exactly what each node adds to the process. I can recommend the video by one of the team members at Stability AI responsible for creating SDXL, where he talks about how to create a node setup for SDXL from scratch.
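On the lora:... syntax point: in ComfyUI the LoRA and its strength live in a Load LoRA node rather than in the prompt text. For comparison, here is a hypothetical diffusers sketch where the LoRA weight is likewise set outside the prompt (file names and scale are placeholders):
Python:
# Hypothetical sketch: the LoRA and its strength are applied via API calls,
# not via <lora:name:weight> tags inside the prompt (that syntax is A1111-only).
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
).to("cuda")

# Placeholder LoRA file; in ComfyUI this corresponds to the Load LoRA node.
pipe.load_lora_weights("path/to/some_style_lora.safetensors")

image = pipe(
    prompt="photo of a woman in a swimsuit on a beach",   # no <lora:...> tag here
    cross_attention_kwargs={"scale": 0.75},               # LoRA strength, like the node's 0.75
    num_inference_steps=30,
).images[0]

image.save("lora_test.png")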
 

Sepheyer

Well-Known Member
Dec 21, 2020
1,570
3,766
So, Matteo is prolly the IPAdapter expert, given he converted/implemented IPAdapter for ComfyUI.

He got a new video out an hour ago about character repeatability:

I'll be testing his ideas in due course. But I gotta say, while I was on the IPA kick for a while, I eventually moved away from it towards the tile ControlNet. I think it was the IPA's 512x512 (?) requirement that eventually made me say fuck this, imma tile my image-to-image workflows from now on rather than IPA them.
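For anyone curious what a tile-ControlNet i2i pass looks like outside ComfyUI, here is a rough, hypothetical diffusers sketch (the tile ControlNet ID is the public SD 1.5 one; paths, prompt and strengths are placeholders, not my actual workflow):
Python:
# Hypothetical sketch of a tile-ControlNet img2img pass with diffusers.
# The source image is used both as the init image and as the tile control image.
import torch
from PIL import Image
from diffusers import ControlNetModel, StableDiffusionControlNetImg2ImgPipeline

controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/control_v11f1e_sd15_tile", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to("cuda")

source = Image.open("source_frame.png").convert("RGB")

result = pipe(
    prompt="photo of a woman, detailed skin, natural light",
    image=source,                        # init image for img2img
    control_image=source,                # tile ControlNet keeps structure and detail
    strength=0.5,                        # how far the repaint is allowed to drift
    controlnet_conditioning_scale=0.8,
    num_inference_steps=30,
).images[0]

result.save("tiled_i2i.png")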

Yeaa. Oh, yea, and the mandatory post pic:

a_03125_.png
 
Last edited:

me3

Member
Dec 31, 2016
316
708
So, sticking to form, this time the shirt decides to completely freak out and she has some hairstyle issues... and some smaller stuff, but getting closer. No character LoRA used.
A webp would be far too big to post, so hopefully mp4 works; at least it let me upload and attach it. Hopefully the forum hasn't screwed with the file too much. For reference, it should be 488x800 at 60 fps.

View attachment 1.mp4
 

Sepheyer

Well-Known Member
Dec 21, 2020
1,570
3,766
So, sticking to form, this time the shirt decides to completely freak out and she has some hairstyle issues... and some smaller stuff, but getting closer. No character LoRA used.
A webp would be far too big to post, so hopefully mp4 works; at least it let me upload and attach it. Hopefully the forum hasn't screwed with the file too much. For reference, it should be 488x800 at 60 fps.
Is this an image-to-image rendering under the hood?
 

me3

Member
Dec 31, 2016
316
708
Is this an image-to-image rendering under the hood?
It's much closer to that than I'd like, at least. The background is a single image added as the "back" layer to all frames. To try and keep the movement of everything else, there's a combination of a low-weight lineart ControlNet and a pose skeleton.
Affecting colors works, and in some ways shape/size too, but you can clearly see there's a fight going on, especially with the shirt's neckline.
It restricts things far too much for my intentions and what I'd like. If you look at many of the clips/videos people are posting, you start to notice that a huge amount of them are just "reskinned" videos processed in an img2img way, which locks you into not just the movement but also the general "shape" of whatever is in the clip originally.
I mainly wanted to see how it worked with a single background image, and it wasn't really meant to run on the full 660 frames, but by the time I'd gotten back and noticed the folder path was wrong, there wasn't much point in not finishing the whole run.
I'm hoping to find a way to have a simple "skeleton" that you can wrap any character around without being locked into the look of whatever it came from. This isn't it, and I suspected it wouldn't be, but I've tried quite a few other ways that don't work either. Any ControlNet I've found (besides "pose") locks you in too much, including temporal, which for me at least seems to fuck up colors too.
I'm considering some method of sampling > masking > unsampling > resampling, etc. It works with small things like expressions, but I fear it'll be a lot to keep track of with whole-body movements. Not sure if there's a simple way to track/detect differences. Anyway, long road, but you learn something along the way...
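To make the low-weight lineart + pose combination above concrete, here is a rough, hypothetical multi-ControlNet sketch in diffusers for a single frame (model IDs are the public SD 1.5 ControlNets; file names, prompt and weights are placeholders, and this is not the actual video pipeline used here):
Python:
# Hypothetical single-frame sketch: combine a low-weight lineart ControlNet with a
# full-weight pose ControlNet so the pose is followed but the look isn't locked in.
import torch
from PIL import Image
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

lineart = ControlNetModel.from_pretrained(
    "lllyasviel/control_v11p_sd15_lineart", torch_dtype=torch.float16
)
openpose = ControlNetModel.from_pretrained(
    "lllyasviel/control_v11p_sd15_openpose", torch_dtype=torch.float16
)

pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    controlnet=[lineart, openpose],
    torch_dtype=torch.float16,
).to("cuda")

# Pre-extracted control images for this frame (placeholders).
lineart_img = Image.open("frame_0001_lineart.png").convert("RGB")
pose_img = Image.open("frame_0001_pose.png").convert("RGB")

frame = pipe(
    prompt="woman in a white shirt dancing in a park",
    image=[lineart_img, pose_img],
    controlnet_conditioning_scale=[0.3, 1.0],   # low lineart weight, full pose weight
    num_inference_steps=25,
).images[0]

frame.save("frame_0001_out.png")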

Edit:
Adding two group images so you can see some of the images involved and the "stages". They're not from the generation of the video, but from the same setup with a slightly altered prompt and a different model.
comb_0001.png comb_0002.png
 
Last edited:
  • Like
Reactions: Mr-Fox

Sepheyer

Well-Known Member
Dec 21, 2020
1,570
3,766
It's much closer to that than I'd like, at least. The background is a single image added as the "back" layer to all frames. To try and keep the movement of everything else, there's a combination of a low-weight lineart ControlNet and a pose skeleton.
Affecting colors works, and in some ways shape/size too, but you can clearly see there's a fight going on, especially with the shirt's neckline.
It restricts things far too much for my intentions and what I'd like. If you look at many of the clips/videos people are posting, you start to notice that a huge amount of them are just "reskinned" videos processed in an img2img way, which locks you into not just the movement but also the general "shape" of whatever is in the clip originally.
I mainly wanted to see how it worked with a single background image, and it wasn't really meant to run on the full 660 frames, but by the time I'd gotten back and noticed the folder path was wrong, there wasn't much point in not finishing the whole run.
I'm hoping to find a way to have a simple "skeleton" that you can wrap any character around without being locked into the look of whatever it came from. This isn't it, and I suspected it wouldn't be, but I've tried quite a few other ways that don't work either. Any ControlNet I've found (besides "pose") locks you in too much, including temporal, which for me at least seems to fuck up colors too.
I'm considering some method of sampling > masking > unsampling > resampling, etc. It works with small things like expressions, but I fear it'll be a lot to keep track of with whole-body movements. Not sure if there's a simple way to track/detect differences. Anyway, long road, but you learn something along the way...
Indeed, I have the same experience when attempting i2i frames using ControlNets: the end results are nothing but acid trips.