[Stable Diffusion] Prompt Sharing and Learning Thread

Jimwalrus

Active Member
Sep 15, 2021
888
3,273
Hey, thanks for the support. I don't think my video card has enough juice to train a LoRA.. or I'm not doing it properly..
I've watched a few walkthrough videos, but I can't seem to figure it out..
Civitai has an on-site LoRA trainer. It's not perfect, as you don't have total control, but I've used it a few times. It costs 500 'Buzz' each time (even for failures), so either earn some by liking other people's images, or let me know your ID there and I'll tip you 1,000 so you can have a play with it.
 

Mr-Fox

Well-Known Member
Jan 24, 2020
1,401
3,793
There are not that many LCM checkpoints available yet compared to "normal" SD1.5 and SDXL models, though there are a few.


The point of LCM (Latent Consistency Model) is to run far fewer steps at a lower CFG scale, cutting down generation time while still getting high quality.
The rule of thumb is 6-12 steps and a CFG scale of 1-4; 10 steps and a CFG scale of 1-2 seem to work well with most models.
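If you'd rather test LCM outside a web UI, here is a minimal sketch using the diffusers library. The model name is just one publicly available SD1.5 LCM checkpoint and the prompt is a stand-in; swap in whatever you actually use.

```python
# Minimal LCM test with diffusers (a sketch, not the exact setup used for
# the grids below). "SimianLuo/LCM_Dreamshaper_v7" is one public SD1.5 LCM
# checkpoint; any LCM model should behave similarly.
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "SimianLuo/LCM_Dreamshaper_v7", torch_dtype=torch.float16
).to("cuda")

image = pipe(
    prompt="cryptid babe in a misty forest, highly detailed",  # stand-in prompt
    num_inference_steps=8,   # rule of thumb: 6-12
    guidance_scale=1.5,      # rule of thumb: 1-4
    width=640,
    height=960,
).images[0]
image.save("lcm_test.png")
```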
I ran a few checkpoint comparison tests with the X/Y/Z plot script, borrowing the great Devilkkw's prompt for his delicious Cryptid Babe.

SD1.5 LCM 1024x1280 (notice the tendency for conjoined twins):

xyz_grid-0000-619202276.png
xyz_grid-0001-1496218127.png
xyz_grid-0002-429866939.png

SD1.5 LCM 640x960 (notice the absence of conjoined twins):

xyz_grid-0003-3442146865.png xyz_grid-0004-2667609524.png xyz_grid-0005-1363269724.png

There are also XL LCM models. As most know, you can use a higher resolution with XL models.
The rule of thumb is that the width and height should add up to 2048 (1024x1024, for instance). You can try different ratios; one that I have found to work well for me is 896x1152.
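If you want to enumerate candidate sizes under this rule of thumb yourself, a couple of lines will do it. Stepping by 64 is my assumption here (it keeps latent dimensions tidy), not part of the rule:

```python
# List SDXL-friendly sizes where width + height == 2048.
for w in range(704, 1345, 64):
    print(f"{w}x{2048 - w} (aspect {w / (2048 - w):.2f})")
```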

SDXL Image resolutions.png
(Thanks to the eminent Synalon for providing this list).

SDXL LCM 896x1152:

xyz_grid-0007-4130536175.png

A tip: never use the standard VAE that was released with the first SDXL model; it's slow as hell.
I recommend the FenrisXL VAE instead; it's faster. SDXL LCM is still much slower in general than normal SD1.5 or SD1.5 LCM though, at least with an older GPU like my 1070 card.


SDXL LCM with FenrisXL VAE 896x1152:

xyz_grid-0008-3268892835.png
 
Last edited:

Mr-Fox

Well-Known Member
Jan 24, 2020
1,401
3,793
Bonus.

(twigs and pine cones included).. :LOL:

I hope the excellent Devilkkw doesn't mind that I keep posting with his prompt; I had too much fun to stop.:giggle:
The other day the eminent Synalon and I experimented with variation seed, variation strength and, more importantly, variation resolution.

This is what the tip text says about it:
variation with n height.png

This means that you can generate an image in landscape without getting eldritch monsters, by setting the main resolution to a landscape ratio and the variation resolution to a portrait ratio. See the examples below.
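Conceptually, the feature amounts to sampling the seed's noise at the variation resolution and then fitting it to the latent size you actually render, so the composition follows one aspect ratio while the output uses another. The sketch below only illustrates that idea; it is an assumption about the mechanism, not necessarily how A1111 implements it internally.

```python
# Illustrative sketch of "variation resolution" (seed resize): sample base
# noise at one resolution, then resize it to the latent shape being rendered.
# NOT necessarily A1111's exact implementation.
import torch
import torch.nn.functional as F

def seed_resized_noise(seed, from_w, from_h, to_w, to_h):
    g = torch.Generator().manual_seed(seed)
    # SD latents are 4 channels at 1/8 of the pixel resolution.
    base = torch.randn((1, 4, from_h // 8, from_w // 8), generator=g)
    # Composition follows the "from" ratio; output renders at the "to" ratio.
    return F.interpolate(base, size=(to_h // 8, to_w // 8), mode="bilinear")

# e.g. portrait-composed noise rendered at a landscape resolution:
noise = seed_resized_noise(42, from_w=640, from_h=960, to_w=960, to_h=640)
```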

00063-1845472706.png 00068-1335331501.png
00069-3616288449.png 00071-2324126865.png
 
Last edited:

me3

Member
Dec 31, 2016
316
708
Bonus.

(twigs and pine cones included).. :LOL:

I hope the excellent Devilkkw doesn't mind that I keep posting with his prompt; I had too much fun to stop.:giggle:
The other day the eminent Synalon and I experimented with variation seed, variation strength and, more importantly, variation resolution.

This is what the tip text says about it:
View attachment 3268917

This means that you can generate an image in landscape without getting eldritch monsters, by setting the main resolution to a landscape ratio and the variation resolution to a portrait ratio. See the examples below.

View attachment 3268862 View attachment 3268863
View attachment 3268882 View attachment 3268928
How you describe the variation resolution sounds similar to how you can use width/height and target width/target height in the SDXL text encoder in ComfyUI. If that's the case it's very useful, since among other things you can "zoom" in/out of a generation and decide what gets "cropped out" and how it fits on your "canvas".

Regarding LCM, you don't have to have a model trained for it; there are LCM weight LoRAs for both SD1.5 and XL, which let you use any model and create images at fewer steps and lower CFG. You use them like any other LoRA, and they work pretty well.
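For anyone applying this outside a web UI, here is roughly what it looks like with diffusers. The LoRA repo names are the commonly used ones on Hugging Face; the base model is just an example.

```python
# Sketch of the LCM-LoRA approach: an LCM weights LoRA on top of an ordinary
# SD1.5 checkpoint enables few steps / low CFG without an LCM-trained model.
# ("latent-consistency/lcm-lora-sdxl" is the XL counterpart.)
import torch
from diffusers import StableDiffusionPipeline, LCMScheduler

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # example base; any SD1.5 model works
    torch_dtype=torch.float16,
).to("cuda")
pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)
pipe.load_lora_weights("latent-consistency/lcm-lora-sdv1-5")

image = pipe(
    "portrait photo, natural light",
    num_inference_steps=6,
    guidance_scale=1.5,
).images[0]
```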




There's a LoRA for SDXL Turbo too.


And one that combines both LCM and Turbo.


Edit:
Since I had to go look for the VAE Mr-Fox mentioned (compulsive need to try new things) and it wasn't that easy to find right away, with too many model versions listed, here's a direct link:
 
Last edited:

Mr-Fox

Well-Known Member
Jan 24, 2020
1,401
3,793
How you describe the variation resolution sounds similar to how you can use width/height and target width/target height in the SDXL text encoder in ComfyUI. If that's the case it's very useful, since among other things you can "zoom" in/out of a generation and decide what gets "cropped out" and how it fits on your "canvas".

Regarding LCM, you don't have to have a model trained for it; there are LCM weight LoRAs for both SD1.5 and XL, which let you use any model and create images at fewer steps and lower CFG. You use them like any other LoRA, and they work pretty well.




There's a LoRA for SDXL Turbo too.


And one that combines both LCM and Turbo.
Whenever I have tried the LoRAs I have had issues. Maybe I did it wrong.. :LOL:
Thank you for the links, I will check them out. :) (y)
 
  • Like
Reactions: devilkkw

Mr-Fox

Well-Known Member
Jan 24, 2020
1,401
3,793
Edit:
Since I had to go look for the VAE Mr-Fox mentioned (compulsive need to try new things) and it wasn't that easy to find right away, with too many model versions listed, here's a direct link:
My bad, I have updated the post with the link. It should not be an issue now that we both have linked to it. :D
Btw, the FenrisXL model itself is also excellent. Now this can't possibly need its own link..

But here it is anyway:


Image Example:
https://f95zone.to/threads/ai-art-show-us-your-ai-skill-no-teens.138575/post-11967954
 
Last edited:

devilkkw

Member
Mar 17, 2021
301
1,029
That's some mighty fine Wendigo Erotica. :D (y) A little tip, or just an observation: the LCM sampler will not give a good result if you are not using it with an LCM checkpoint. Also, if you use a resolution of 1024 or above with SD1.5, you are more likely to get conjoined twins. I would recommend using 960x640 and then either hires fix or upscaling in img2img with the SD Upscale script. I know for a fact that you are already aware; this is only a reminder, and for anyone else that might not be aware.
Oh, I'd never seen LCM models, and never tried one. Is it possible to port a standard .safetensors model to LCM? What's the benefit?
A good reminder, as you say, Mr-Fox: SD1.5 models work great at low res, and pushing them high is a real pain in the ass; many models give doubled and weird results at 768, so generating low and upscaling seems like a good solution.
But I have to ask a question: I have merged my model many times with merge block weight in A1111 to push the resolution up, but in A1111 the max resolution I can reach is 896x1152, while in ComfyUI I reach 1024x1280. Why so much difference?
I also checked the sampling method code, and it seems to work differently in ComfyUI and A1111; but if the sampler is the same, why?

There are not that many LCM checkpoints available yet compared to "normal" SD1.5 and SDXL models, though there are a few.


The point of LCM (Latent Consistency Model) is to run far fewer steps at a lower CFG scale, cutting down generation time while still getting high quality.
The rule of thumb is 6-12 steps and a CFG scale of 1-4; 10 steps and a CFG scale of 1-2 seem to work well with most models.
I ran a few checkpoint comparison tests with the X/Y/Z plot script, borrowing the great Devilkkw's prompt for his delicious Cryptid Babe.

SD1.5 LCM 1024x1280 (notice the tendency for conjoined twins):

View attachment 3267793
View attachment 3267794
View attachment 3267795

SD1.5 LCM 640x960 (notice the absence of conjoined twins):

View attachment 3267796 View attachment 3267797 View attachment 3267799

There are also XL LCM models. As most know, you can use a higher resolution with XL models.
The rule of thumb is that the width and height should add up to 2048 (1024x1024, for instance). You can try different ratios; one that I have found to work well for me is 896x1152.

View attachment 3267808
(Thanks to the eminent Synalon for providing this list).

SDXL LCM 896x1152:

View attachment 3268390

A tip: never use the standard VAE that was released with the first SDXL model; it's slow as hell.
I recommend the FenrisXL VAE instead; it's faster. SDXL LCM is still much slower in general than normal SD1.5 or SD1.5 LCM though, at least with an older GPU like my 1070 card.


SDXL LCM with FenrisXL VAE 896x1152:

View attachment 3268594
Wow, I love these types of posts; they're really useful for giving me a good idea of how it works. Thank you.
An OT question: how many checkpoints do you have? o_O :eek:
Bonus.

(twigs and pine cones included).. :LOL:

I hope the excellent Devilkkw doesn't mind that I keep posting with his prompt; I had too much fun to stop.:giggle:
The other day the eminent Synalon and I experimented with variation seed, variation strength and, more importantly, variation resolution.

This is what the tip text says about it:
View attachment 3268917

This means that you can generate an image in landscape without getting eldritch monsters, by setting the main resolution to a landscape ratio and the variation resolution to a portrait ratio. See the examples below.

View attachment 3268862 View attachment 3268863
View attachment 3268882 View attachment 3268928
Such beautiful results! I'm glad you used my prompt for the samples. And a good test; variation seed is a bit underestimated. Keep testing and sharing, I'm really interested in it.

Hey, thanks for the support. I don't think my video card has enough juice to train a LoRA.. or I'm not doing it properly..
I've watched a few walkthrough videos, but I can't seem to figure it out..
Too general; what are your PC specs? What are you using to train? Which video driver version?
Also, read a few posts ahead: another user gave you a suggestion for a possible alternative way to train.
 
  • Red Heart
Reactions: Sepheyer

lobotomist

Active Member
Sep 4, 2017
823
725
Sorry for the noob question: I have an Intel card, and I know you can use Auto1111 with OpenVINO, but what about ComfyUI?

Also, any other recommendations for beginners with Intel cards?
 

Mr-Fox

Well-Known Member
Jan 24, 2020
1,401
3,793
Oh, I'd never seen LCM models, and never tried one. Is it possible to port a standard .safetensors model to LCM? What's the benefit?
I have no idea.
The benefit, like I said in my post, is being able to use far fewer steps and a lower CFG scale to cut down on generation time and still get high quality.

A good reminder, as you say, Mr-Fox: SD1.5 models work great at low res, and pushing them high is a real pain in the ass; many models give doubled and weird results at 768, so generating low and upscaling seems like a good solution.
But I have to ask a question: I have merged my model many times with merge block weight in A1111 to push the resolution up, but in A1111 the max resolution I can reach is 896x1152, while in ComfyUI I reach 1024x1280. Why so much difference?
I also checked the sampling method code, and it seems to work differently in ComfyUI and A1111; but if the sampler is the same, why?
As I don't use the spaghetti UI, I can't help you with ComfyUI.
With SD1.5 it's best to stay under 1024 in either direction, so I use 640x960 for portrait ratio and simply flip it when I do landscape. I use this resolution while searching for a good seed; once I've found one,
I re-use that seed and enable hires fix with 2x upscale and fairly low denoising to get a sharp image. Then I might upscale it further with the SD Upscale script in img2img.
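Outside the UI, that two-pass workflow looks roughly like this in diffusers. The model name, prompt and 0.3 denoise strength are illustrative choices, not the exact settings described above:

```python
# Sketch of "find a seed at low res, then hires-fix it": txt2img at 640x960,
# then a 2x upscale refined by img2img at low denoising strength.
import torch
from diffusers import StableDiffusionPipeline, StableDiffusionImg2ImgPipeline

base = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

prompt = "portrait of a woman in a forest, detailed, natural light"
seed = 12345  # the seed you liked during the search phase

low = base(
    prompt, width=640, height=960,
    generator=torch.Generator("cuda").manual_seed(seed),
).images[0]

# Reuse the already-loaded weights for the second pass.
img2img = StableDiffusionImg2ImgPipeline(**base.components).to("cuda")
hires = img2img(
    prompt,
    image=low.resize((1280, 1920)),  # 2x upscale
    strength=0.3,                    # fairly low denoising keeps composition
    generator=torch.Generator("cuda").manual_seed(seed),
).images[0]
hires.save("hires.png")
```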

Wow, I love these types of posts; they're really useful for giving me a good idea of how it works. Thank you.
An OT question: how many checkpoints do you have? o_O :eek:
Way too many probably.. :LOL:

Such beautiful results! I'm glad you used my prompt for the samples. And a good test; variation seed is a bit underestimated. Keep testing and sharing, I'm really interested in it.
I'm glad you liked it.:)
 
Last edited:
  • Like
Reactions: devilkkw

Microtom

Well-Known Member
Sep 5, 2017
1,072
3,680
I'm going to test training SDXL on a pornographic concept by using color association to ease the formation of the neural network. I don't know anything about that, but I assume it creates associations, so it should work.

Essentially, I'll duplicate each training image into two identical copies, then color specific regions in one of them. Then I'll prompt what the colors are associated with. The AI knows what the colors are, so it will associate them with the concept to learn.

Here are some example images.

101030.png

101010.png


101032.png
 
Last edited:

Mr-Fox

Well-Known Member
Jan 24, 2020
1,401
3,793
I'm going to test training SDXL on a pornographic concept by using color association to ease the formation of the neural network. I don't know anything about that, but I assume it creates associations, so it should work.

Essentially, I'll duplicate each training image into two identical copies, then color specific regions in one of them. Then I'll prompt what the colors are associated with. The AI knows what the colors are, so it will associate them with the concept to learn.

Here are some example images.

View attachment 3270379

View attachment 3270385


View attachment 3270388
That's very interesting. Let us know how it turns out. :) (y)
 

Mr-Fox

Well-Known Member
Jan 24, 2020
1,401
3,793
Sorry for the noob question: I have an Intel card, and I know you can use Auto1111 with OpenVINO, but what about ComfyUI?

Also, any other recommendations for beginners with Intel cards?
The most important aspect is the amount of VRAM. The card needs an absolute minimum of 4GB for generating images, and the chip itself can't be slow as snails either. There are many settings and small things you can do if you suffer from low VRAM.
You can add the argument "--lowvram" to the COMMANDLINE_ARGS line in the "webui-user.bat" file. In the UI itself you can set things to the "lightest" mode. Then keep the resolution and step count low to begin with, until you know what you can achieve with your card. Start with 512x512, or in portrait ratio (2:3) you can go below 512, such as 344x512. Then use the SD Upscale script in img2img to make the image larger; since it uses tiling, you can upscale by 4x, which gets you to 1376x2048. Keep to an easier style or genre, such as anime or manga; by easier I only mean in terms of hardware requirements.
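For reference, this is the sort of edit meant here; a minimal webui-user.bat excerpt (--medvram is the milder alternative if --lowvram proves too slow):

```
rem webui-user.bat (excerpt) -- low-VRAM launch options for A1111
set COMMANDLINE_ARGS=--lowvram
```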

Example:
(prompt and generation settings in spoiler)

344x512 → upscaled to 1376x2048
00001-1940226755.png 00003-1940226755.png

If your card can't hack it, then Google Colab or Stable pod-type services might be the option for you. That's online, server-based image generation; on some sites you can rent a high-end card by the hour. This means there is nothing stopping you from training your own models and so on, as long as you are willing to pay the hourly fee.
 
Last edited:

Mr-Fox

Well-Known Member
Jan 24, 2020
1,401
3,793
Another example for low-VRAM people.

This is also a challenge for anyone who wants to participate.
The point of the challenge is to be more creative with the prompt and come up with new, innovative solutions within specified limitations and without the usual toys. The basic idea is to emulate the challenge faced by people with an old, weak GPU.
This is why we keep the resolution low and avoid using a bunch of extensions in txt2img.
It's meant to be a learning exercise first and foremost, not a competition.
No one will lynch you if you take small liberties, but it's more fun if everyone tries to stick to the "script".:)

The limitations:

- In txt2img:
Use a low resolution, 344x512 or 512x512.
No ControlNet, After Detailer etc., and no Roop or ReActor.
Postprocessing is allowed.
Face restore is allowed if you really want to use it.
Keep the prompt simple and under 90 tokens, with no more than 2 LoRAs or embeddings in total, preferably none.
You can choose any genre and concept, nude or SFW.

- In img2img:
You are free to use inpainting as much as you wish, and After Detailer in the interest of fixing hands or deformed details etc.
Maybe I'm wrong, but I think it's less memory-demanding when you already have an image to work with.
Keep the prompt in After Detailer somewhat simple as well.
The same limit of LoRAs and/or embeddings (2) applies for After Detailer as for txt2img.
No ControlNet, Roop or ReActor.
Use the SD Upscale script with any upscaler you want, at 2-4x, to finalize the image.

Post both the image from txt2img and the final image from img2img, so we can see the prompt and process.
Give a short description outlining the process and the general concept.
Also share any thoughts or reflections about things you might have discovered and learned.

The challenge will continue as long as someone is still interested.

Remember to have fun.
-------------------------------------------------------------------------------------------------------------------------------------------

In txt2img:
I will expand the prompt a little from before and see what I can achieve within these limits. I avoid using any extensions that add to the memory demand, such as ControlNet or After Detailer, in txt2img. I only use GFPGAN postprocessing, as I don't think it is very demanding.

(prompt and generation settings in spoiler)

In img2img:
I use After Detailer for fixes and enhancement.
Lately I have experimented with using an alternative checkpoint for After Detailer, with very interesting results.
I had to fix a tiny detail on the thumb's fingernail with Photoshop for the first image.
A little "cheating" has never hurt anyone, has it?..:giggle:
Then I turn off GFPGAN postprocessing and all After Detailer models except eyes before upscaling.
I upscale with the SD Upscale script with UltraSharp at 4x to finalize my image.

(prompt and generation settings in spoiler)

344x512 → upscaled to 1376x2048
00032-2421455531.png 00072-2421455531.png
00037-2421455531.png 00074-2421455531.png
 

Jimwalrus

Active Member
Sep 15, 2021
888
3,273
Another example for low-VRAM people.

This is also a challenge for anyone who wants to participate.
The point of the challenge is to be more creative with the prompt and come up with new, innovative solutions within specified limitations and without the usual toys. The basic idea is to emulate the challenge faced by people with an old, weak GPU.
This is why we keep the resolution low and avoid using a bunch of extensions in txt2img.
It's meant to be a learning exercise first and foremost, not a competition.
No one will lynch you if you take small liberties, but it's more fun if everyone tries to stick to the "script".:)

The limitations:

- In txt2img:
Use a low resolution, 344x512 or 512x512.
No ControlNet, After Detailer etc., and no Roop or ReActor.
Postprocessing is allowed.
Face restore is allowed if you really want to use it.
Keep the prompt simple and under 90 tokens, with no more than 2 LoRAs or embeddings in total, preferably none.
You can choose any genre and concept, nude or SFW.

- In img2img:
You are free to use inpainting as much as you wish, and After Detailer in the interest of fixing hands or deformed details etc.
Maybe I'm wrong, but I think it's less memory-demanding when you already have an image to work with.
Keep the prompt in After Detailer somewhat simple as well.
The same limit of LoRAs and/or embeddings (2) applies for After Detailer as for txt2img.
No ControlNet, Roop or ReActor.
Use the SD Upscale script with any upscaler you want, at 2-4x, to finalize the image.

Post both the image from txt2img and the final image from img2img, so we can see the prompt and process.
Give a short description outlining the process and the general concept.
Also share any thoughts or reflections about things you might have discovered and learned.

The challenge will continue as long as someone is still interested.

Remember to have fun.
-------------------------------------------------------------------------------------------------------------------------------------------

In txt2img:
I will expand the prompt a little from before and see what I can achieve within these limits. I avoid using any extensions that add to the memory demand, such as ControlNet or After Detailer, in txt2img. I only use GFPGAN postprocessing, as I don't think it is very demanding.

(prompt and generation settings in spoiler)

In img2img:
I use After Detailer for fixes and enhancement.
Lately I have experimented with using an alternative checkpoint for After Detailer, with very interesting results.
I had to fix a tiny detail on the thumb's fingernail with Photoshop for the first image.
A little "cheating" has never hurt anyone, has it?..:giggle:
Then I turn off GFPGAN postprocessing and all After Detailer models except eyes before upscaling.
I upscale with the SD Upscale script with UltraSharp at 4x to finalize my image.

(prompt and generation settings in spoiler)

OK, does this meet the rules? I didn't go above 4.2GB of VRAM, and then only briefly.
344x512, with hires fix at 1.05 (which used no extra VRAM).
(prompt and generation settings in spoiler)
00138-447803216.png

Upscaled 2x using 4xNMKDSuperscale to 720x1072.
A tiny bit of extra GFPGAN (0.01).
No other post-processing.
02026.png

The VRAM could almost certainly be reduced further using '--lowvram'.

Could I do better with a bit more time? Probably! But yeah, like some other things in life, it's not how big it is, it's what you do with it that counts.
 

Mr-Fox

Well-Known Member
Jan 24, 2020
1,401
3,793
OK, does this meet the rules? I didn't go above 4.2GB of VRAM, and then only briefly.
344x512, with hires fix at 1.05 (which used no extra VRAM).
(prompt and generation settings in spoiler)
View attachment 3271323

Upscaled 2x using 4xNMKDSuperscale to 720x1072.
A tiny bit of extra GFPGAN (0.01).
No other post-processing.
View attachment 3271321

The VRAM could almost certainly be reduced further using '--lowvram'.

Could I do better with a bit more time? Probably! But yeah, like some other things in life, it's not how big it is, it's what you do with it that counts.
Yes. Excellent. I love it.
The "rules" are not carved in stone; they're more like guidelines. The interesting part is to see what you guys can come up with without relying on memory-demanding extensions, while keeping the resolution low and trying to be creative and inventive. I had the idea when trying to give advice to lobotomist, the guy with an Intel card. What would it be like, and what could we achieve, with those kinds of limitations?
:)(y)
 
Last edited:

lobotomist

Active Member
Sep 4, 2017
823
725
The most important aspect is the amount of VRAM. The card needs an absolute minimum of 4GB for generating images, and the chip itself can't be slow as snails either. There are many settings and small things you can do if you suffer from low VRAM.
You can add the argument "--lowvram" to the COMMANDLINE_ARGS line in the "webui-user.bat" file. In the UI itself you can set things to the "lightest" mode. Then keep the resolution and step count low to begin with, until you know what you can achieve with your card. Start with 512x512, or in portrait ratio (2:3) you can go below 512, such as 344x512. Then use the SD Upscale script in img2img to make the image larger; since it uses tiling, you can upscale by 4x, which gets you to 1376x2048. Keep to an easier style or genre, such as anime or manga; by easier I only mean in terms of hardware requirements.

Example:
(prompt and generation settings in spoiler)

344x512 → upscaled to 1376x2048
View attachment 3271011 View attachment 3271012

If your card can't hack it, then Google Colab or Stable pod-type services might be the option for you. That's online, server-based image generation; on some sites you can rent a high-end card by the hour. This means there is nothing stopping you from training your own models and so on, as long as you are willing to pay the hourly fee.
Did you quote me by mistake?
 

Mr-Fox

Well-Known Member
Jan 24, 2020
1,401
3,793
Did you quote me by mistake?
No, I made the post for you. I generated example images so you can see what you might be able to do if you have at least 4GB of VRAM on your Intel card. If your card can't run SD, there are online sites that let you use their computers and rent a high-end graphics card. "Inspired" by this, I even started a challenge to create images with your scenario in mind.
 

lobotomist

Active Member
Sep 4, 2017
823
725
No, I made the post for you. I generated example images so you can see what you might be able to do if you have at least 4GB of VRAM on your Intel card. If your card can't run SD, there are online sites that let you use their computers and rent a high-end graphics card. "Inspired" by this, I even started a challenge to create images with your scenario in mind.
Even the cheapest Intel card that nobody buys has 4GB... You couldn't even take a second to google the VRAM on Intel cards before writing a huge wall of text? Thanks, I guess..
I have 8GB on my A750, which is pretty capable of running Stable Diffusion; my question was mostly because I don't know if all those tools like ComfyUI are Nvidia-only.

Oh, also, the other most common Intel card, the A770, has 16GB of VRAM.
 

Mr-Fox

Well-Known Member
Jan 24, 2020
1,401
3,793
Even the cheapest Intel card that nobody buys has 4GB...
I have 8GB on my A750, which is pretty capable of running Stable Diffusion; my question was mostly because I don't know if all those tools like ComfyUI are Nvidia-only.
Oh, also, the other most common Intel card, the A770, has 16GB of VRAM.
Don't expect anyone to be able to read your mind. If you want to know something specific, then spit it out.

You couldn't even take a second to google the VRAM on Intel cards before writing a huge wall of text? Thanks, I guess..
Why so pissy when people are just trying to be helpful? It's not anyone's job to help you, but we do it anyway.
Don't expect anyone to fall over themselves to answer you in the future if this is how you respond.