[Stable Diffusion] Prompt Sharing and Learning Thread

Mr-Fox

Well-Known Member
Jan 24, 2020
1,401
3,804
Experiments with controlnet:

Before:

View attachment 2650511

After: (why do mine always look like they have tired, baggy eyes?)
View attachment 2650511
I suspect that the puffy eyes have to do with the source material the checkpoint and/or LoRA etc. was trained on.
You can try putting puffy eyes or puffy eyelids in the negative prompt with a weight of 1.2, or more if needed.
Example:
(puffy eyes:1.2) or (puffy eyelids:1.2)
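Side note for anyone new to the syntax: a1111's attention weighting works roughly like this, as I understand it, so double-check the wiki for your version:

Code:
(puffy eyes)      -> attention x1.1
((puffy eyes))    -> attention x1.21
[puffy eyes]      -> attention /1.1
(puffy eyes:1.2)  -> attention x1.2 exactly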
 

devilkkw

Member
Mar 17, 2021
328
1,115
a1111 has put out a new update (v1.3.0).
In this update we get Cross attention optimization options.
I've made a test with all of them.
Settings and times are in the post.

Are you using it? Which one is your favorite?
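For anyone who can't find it: the new selector should live under Settings > Optimizations > Cross attention optimization, and as far as I can tell it gets saved to config.json as something like the line below (the key name is from memory, so treat it as an assumption and check your own config):

Code:
"cross_attention_optimization": "xformers"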
 

raventai

New Member
Jan 15, 2018
14
18
Sorry to interfere as a humble SD novice (and already baffled by all this body-horror AI thing...), but the first post (OP front page) is misleading regarding the way to implement LoRAs in SD; I lost two days trying to reconcile path problems and extension calls because of it. There is no need to use the Additional Networks UI extension: LoRAs are directly supported (or git pull is your friend) and it is a breeze. Thanks for all the effort, however; it is interesting to educate oneself, but I think DAZ is still far more efficient when you have precise Ren'Py needs... (and we have those monster quasi-NASA rigs...).
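In case it saves someone else the two days: with the built-in support you just drop the LoRA file into models/Lora and call it straight from the prompt with the <lora:name:weight> tag, for example (the file name here is only a placeholder):

Code:
photo of a woman, blue eyes, <lora:my_character_v1:0.8>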
 

Mr-Fox

Well-Known Member
Jan 24, 2020
1,401
3,804
Sorry to interfere as a humble SD novice (and already baffled by all this body-horror AI thing...), but the first post (OP front page) is misleading regarding the way to implement LoRAs in SD; I lost two days trying to reconcile path problems and extension calls because of it. There is no need to use the Additional Networks UI extension: LoRAs are directly supported (or git pull is your friend) and it is a breeze. Thanks for all the effort, however; it is interesting to educate oneself, but I think DAZ is still far more efficient when you have precise Ren'Py needs... (and we have those monster quasi-NASA rigs...).
Yes, I agree. Daz is more consistent because you have direct control while SD is always a dice toss; however, SD is light years ahead in visuals and realism. Though with ControlNet and OpenPose, SD is catching up to DAZ in repeatability and consistency. Also, with SD you are not forced to endure endless menus only to tweak one little thing.
 

Sepheyer

Well-Known Member
Dec 21, 2020
1,575
3,776
Yes, I agree. Daz is more consistent because you have direct control while SD is always a dice toss; however, SD is light years ahead in visuals and realism. Though with ControlNet and OpenPose, SD is catching up to DAZ in repeatability and consistency. Also, with SD you are not forced to endure endless menus only to tweak one little thing.
Can't wait for the day when the "generative" part adds a few more loops to understand that you want SD to build a 3D chara off one single 2D image, then dress/undress her, then build a LoRA/whatnot around her and turn her into a proper callable object that can be plugged consistently into scenes created via a similar approach.
 

me3

Member
Dec 31, 2016
316
708
a1111 has put out a new update (v1.3.0).
In this update we get Cross attention optimization options.
I've made a test with all of them.
Settings and times are in the post.

Are you using it? Which one is your favorite?
You might want to hold off on updating, or at the very least give it some thought depending on your setup and usage.
Stuff I've found so far (that might sound kind of small at first but isn't):
Images are dropped into the default temp folder (which has been an issue before too, going by a discussion on their git); there's a setting to change the temp dir, but it doesn't work and things still just get dumped into the default temp.
If you save all images by default, a copy of single images gets put in the usual place (whatever you've got that set as); grids, however, don't, and so far they only show up in tmp for me. (If you're using SSDs you might want to consider the extra pointless writes.)

If you open images from the UI you get the ones stored in temp, highlighting the issue that file= access isn't limited to just subfolders, meaning you can technically gain access to more important things; probably worth keeping that in mind if you've got things running with any kind of remote/public access. (I doubt this particular issue is anything new.)
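If you want to verify where your generations are actually ending up, a quick throwaway Python sketch like this lists the most recently modified PNGs under the system temp dir (the exact gradio subfolder name varies, hence the recursive search):

Code:
import glob
import os
import tempfile

# gradio drops generated files somewhere under the system temp directory;
# print the ten most recently modified PNGs found anywhere beneath it
tmp = tempfile.gettempdir()
pngs = glob.glob(os.path.join(tmp, "**", "*.png"), recursive=True)
pngs.sort(key=os.path.getmtime, reverse=True)
for path in pngs[:10]:
    print(path)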
 

devilkkw

Member
Mar 17, 2021
328
1,115
Every update breaks something. Currently I see that images are stored in the default tmp folder; another problem is being unable to save images as jpg, since the settings are ignored. Maybe wait for the next good update. I updated only because I wanted to try the new cross attention options.
 

modine2021

Member
May 20, 2021
433
1,444
Well, here we go again: another error preventing me from doing anything. Nothing happens after clicking generate. Google gave only a little info, and the lines are nowhere to be found in the .py it said to edit.
RuntimeError: expected scalar type Float but found Half
 

me3

Member
Dec 31, 2016
316
708
I was hoping it might help with some of the problems I've been having (that backfired...).
Been trying to train a person for probably over 1k hours now and I can't make sense of why it's behaving the way it is.

To start at the "easy" end: in the beginning, when testing the training stages, I got either someone clearly of Asian origin or of African origin, both in features and skin tones... eventually the few cases of Asian dropped out completely.
Problem is, the person is, without any doubt, white; even the fact that they have blue eyes should rule out much else, so I can't see why it's happening.
Another problem is that the first 1-2 stages of the training pick up the body shape pretty much perfectly, then beyond that stuff just gets flattened down.

I've tried simple captions, tagging everything, and tagging only specific things; it changes stuff, but nothing seems to affect the ethnicity and body issues. The captions are read, and even without them it should "work".

Having trained using other image sets pretty successfully, meaning you could easily tell it's the same people, I can't see why this is going so horribly wrong...

Suggestions are welcome
 

me3

Member
Dec 31, 2016
316
708
Well, here we go again: another error preventing me from doing anything. Nothing happens after clicking generate. Google gave only a little info, and the lines are nowhere to be found in the .py it said to edit.
RuntimeError: expected scalar type Float but found Half
not sure how you are running things, but usually that happens when you're missing the launch options --no-half or --no-half-vae
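For context, that error is just PyTorch complaining that a float32 ("Float") tensor met a float16 ("Half") one in the same operation. A minimal reproduction outside of webui (nothing to do with your actual model, just the dtype clash that --no-half avoids by keeping everything in float32):

Code:
import torch

layer = torch.nn.Linear(4, 4)               # weights are float32 ("Float") by default
x = torch.randn(1, 4, dtype=torch.float16)  # half-precision ("Half") input

try:
    layer(x)
except RuntimeError as e:
    # a dtype-mismatch error along the lines of the one above
    print(e)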
 

modine2021

Member
May 20, 2021
433
1,444
not sure how you are running things, but usually that happens when you're missing the launch options --no-half or --no-half-vae
Using these. Is it right? I was trying to speed things up a bit.

--xformers --opt-channelslast --disable-safe-unpickle --precision full --disable-nan-check --skip-torch-cuda-test --medvram --always-batch-cond-uncond --opt-split-attention-v1 --opt-sub-quad-attention --deepdanbooru --no-half-vae
 

me3

Member
Dec 31, 2016
316
708
Using these. Is it right? I was trying to speed things up a bit.

--xformers --opt-channelslast --disable-safe-unpickle --precision full --disable-nan-check --skip-torch-cuda-test --medvram --always-batch-cond-uncond --opt-split-attention-v1 --opt-sub-quad-attention --deepdanbooru --no-half-vae
Code:
--xformers 
--opt-split-attention-v1
--opt-sub-quad-attention
I think those 3 can't work together, as the code is set up with conditions so that only one of them actually gets applied.
I'd probably go with xformers.
Since you say you're looking for "speed": if you don't need --medvram you should remove it, as it slows things down quite a bit, and I think --precision full increases VRAM usage, which kills a bit of the point of --medvram.
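Put together, a trimmed webui-user.bat following that advice might look something like the line below. It's only a sketch of one reasonable combination, built from flags already mentioned in this thread, so adjust it to your own card and needs:

Code:
set COMMANDLINE_ARGS=--xformers --no-half --no-half-vae --deepdanbooru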
 

modine2021

Member
May 20, 2021
433
1,444
Code:
--xformers
--opt-split-attention-v1
--opt-sub-quad-attention
I think those 3 can't work together, as the code is set up with conditions so that only one of them actually gets applied.
I'd probably go with xformers.
Since you say you're looking for "speed": if you don't need --medvram you should remove it, as it slows things down quite a bit, and I think --precision full increases VRAM usage, which kills a bit of the point of --medvram.
Adding --no-half fixed it. Thanks for the other suggestions.
 

Mr-Fox

Well-Known Member
Jan 24, 2020
1,401
3,804
I was hoping it might help with some of the problems I've been having (that backfired...).
Been trying to train a person for probably over 1k hours now and I can't make sense of why it's behaving the way it is.

To start at the "easy" end: in the beginning, when testing the training stages, I got either someone clearly of Asian origin or of African origin, both in features and skin tones... eventually the few cases of Asian dropped out completely.
Problem is, the person is, without any doubt, white; even the fact that they have blue eyes should rule out much else, so I can't see why it's happening.
Another problem is that the first 1-2 stages of the training pick up the body shape pretty much perfectly, then beyond that stuff just gets flattened down.

I've tried simple captions, tagging everything, and tagging only specific things; it changes stuff, but nothing seems to affect the ethnicity and body issues. The captions are read, and even without them it should "work".

Having trained using other image sets pretty successfully, meaning you could easily tell it's the same people, I can't see why this is going so horribly wrong...

Suggestions are welcome
Are you trying to train a Lora using kohya ss? If so, the checkpoint you are training on is very important.
Here's the best info source I have found on Lora training:

For my own Lora I tried a few different checkpoints and landed on Elegance. One could of course try the SD1.5 base model, but I read that it's not the best for pose variations.

If, however, you are only talking about generating images using img2img, then put Asian and black or African in the negative and white or Caucasian in the positive.
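For example, something along these lines in the two prompt fields (the terms and weights are just a starting point, not gospel):

Code:
Positive prompt: photo of a caucasian woman, blue eyes
Negative prompt: (asian:1.3), (african:1.3)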

It's much easier to help people if you write more thoroughly what your issue is and the context. ;)
 

me3

Member
Dec 31, 2016
316
708
  • kohya_ss, trying to train a TI.
  • Tried different learning rates and schedulers.
  • Tried with and without regularisation images.
  • Using the SD 1.5 model. I did discover that the SD 1.5 model kohya downloads on its own has an issue; I can't remember exactly what it was, but it was throwing a small loading error which was hard to spot in all the output text.
  • Sample images generated during training, using purely the name as the positive prompt, output (obviously poor and disfigured) likenesses of the training data, so it's clearly learning something. However, when using the TI files from each epoch in a1111, on the same model, there's either a small likeness that gets washed out later on, or a completely wrong "thing" (like the ethnicity bit) that stays constant throughout.
    Which makes it seem like only part of the learned data, or none of it, truly gets written to the TI files, or it's somehow written "wrong".
  • Given the point above, I've been running the training at 4 vectors. It should have been enough, as I've trained several TIs in a1111, one of which I posted an image of already (the first image). Running a training now at 20 vectors just to see, but considering I've trained at 4 just fine, even at 2, the vector count should be enough.
And before you ask, the reason I'm trying a TI and not a LORA is that TIs have one "ability" LORAs don't: LORAs apply themselves to all subjects, but with TIs you can have a group photo of multiple trained people/objects.

=== Updated ===
Using a much higher vector count seems to have improved things. I ran the training at a much higher LR just to see, so overfitting is a very possible cause of the remaining issues. Running a test now at 1/50 of the LR (I said it was much higher :p).
One unexpected side effect of the higher vector count is that the sample images had higher quality. Assuming that's down to how they are generated with each vector, but the images have much more detail and "quality".

Potential lesson to learn: don't listen to all the ultimate/superawesomest/all-you-need-to-know guides and experts telling you that you never need (in this case) a vector count above a low amount...
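If anyone wants to poke at the same knobs, the vector count and learning rate map onto kohya's sd-scripts roughly like the sketch below. The argument names are from memory and the LR/step values are only illustrative, so double-check everything against the script's --help before trusting it:

Code:
accelerate launch train_textual_inversion.py ^
  --pretrained_model_name_or_path="runwayml/stable-diffusion-v1-5" ^
  --train_data_dir="train\img" --output_dir="output" ^
  --token_string="myperson" --init_word="woman" ^
  --num_vectors_per_token=20 ^
  --learning_rate=5e-6 --max_train_steps=3000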
 