[Stable Diffusion] Prompt Sharing and Learning Thread

Sharinel

If my math is right, you were training at around 4.5 sec/it at 768x768, while I am at 30 sec/it at 640x640 and 7-10 sec/it at 256x256. (It is a stretch for me to be training anything on an 8 GB 3050 at all, though.)

It is up and available for

Did I have it wrong? I haven't touched LoRAs, especially for XL, as I was told the base model was 1024x1024. Wouldn't that mean the LoRAs would have to be the same?
 

felldude

Did I have it wrong? I haven't touched LoRAs, especially for XL, as I was told the base model was 1024x1024. Wouldn't that mean the LoRAs would have to be the same?
It is 1024x1024, but the short answer is no, you don't have to train with images of the same size; each image is converted to a latent first. For XL most people will probably train with
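To make the latent point concrete, here is a minimal sketch (assuming the diffusers and torch packages are installed; the 640x640 input is just an example size): the VAE maps any resolution divisible by 8 into the 8x-smaller latent space the UNet actually trains on.

import torch
from diffusers import AutoencoderKL

# Load the SDXL VAE (real model id on Hugging Face).
vae = AutoencoderKL.from_pretrained("stabilityai/sdxl-vae")

# A stand-in 640x640 "training image": (N, C, H, W), values scaled to [-1, 1].
pixels = torch.rand(1, 3, 640, 640) * 2 - 1

with torch.no_grad():
    # Encode to latent space: 8x downscaled, 4 channels -> (1, 4, 80, 80).
    # The trainer sees latents, not raw pixels, so 1024x1024 is not a hard requirement.
    latents = vae.encode(pixels).latent_dist.sample() * vae.config.scaling_factor

print(latents.shape)  # torch.Size([1, 4, 80, 80])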
 

Sepheyer

SDXL in ComfyUI

I am late to the SDXL party because of procrastination. If anyone reading this thread hasn't looked into SDXL, here is a great starter video; just watch the first 8 minutes to see what's up. To sum up: SDXL handles text better, gets better dynamic range, and handles complex composition better. And that's merely out of the box. Anyways, here is the video:



And here is the ComfyUI examples page on how to set up the workflow:



I haven't tested the workflow yet, prolly a task for this week.
 

me3

Use other optimizers so you can run a lot fewer steps. DAdapt/Prodigy work very well; you'll have a much higher learning rate and save a lot of time.
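For anyone wanting to try it, a minimal sketch of what that looks like (this assumes the real prodigyopt package, pip install prodigyopt; the Linear model is just a stand-in for whatever network is being tuned):

import torch
from prodigyopt import Prodigy

model = torch.nn.Linear(16, 16)  # stand-in for the LoRA weights being trained

# Prodigy estimates the step size itself, so lr is left at 1.0 instead of
# being hand-tuned; that adaptivity is what lets it converge in fewer steps.
optimizer = Prodigy(model.parameters(), lr=1.0, weight_decay=0.01)

for step in range(100):
    loss = model(torch.randn(4, 16)).pow(2).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()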
 

me3

.. To sum up: SDXL handles text better, gets better dynamic range, and handles complex composition better. ...
From the SDXL page:
Limitations
  • The model does not achieve perfect photorealism
  • The model cannot render legible text
  • The model struggles with more difficult tasks which involve compositionality, such as rendering an image corresponding to “A red cube on top of a blue sphere”
Something doesn't quite add up here...
 

felldude

Use other optimizers so you can run a lot fewer steps. DAdapt/Prodigy work very well; you'll have a much higher learning rate and save a lot of time.
Adaptive has failed me 4 times, with and without xformers, with and without buckets, at the recommended learning rate.

I'd try reading this, but my GPU is currently at 99% and I'd rather my brain not be at the same level.
 

me3

Adaptive has failed me 4 times, with and without xformers, with and without buckets, at the recommended learning rate.

I'd try reading this, but my GPU is currently at 99% and I'd rather my brain not be at the same level.
With all the hours of testing/training I've run, the only "constant" things I've found are that I REALLY wish I had a better computer to run it on, and that all the settings / "must use" recommendations from YT etc. are basically pointless. The reason being that what works for one or two datasets doesn't work at all for a third. Thinking I'd figured out a setup that worked, as it'd been successful on 2 datasets, I ran it on 2 others: one apparently was a car, and the other went from a blue-eyed blonde to a 55yo black woman...

Despite some claims that IA3 is mainly for style, IA3 with Prodigy works surprisingly well for people, even with simple default values, without captions, simply feeding it images.
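If anyone wants to try that combination, here is a rough sketch of the kohya sd-scripts flags involved (IA3 comes from the LyCORIS package; treat the exact values as assumptions to adapt, not a tested recipe):

accelerate launch train_network.py \
    --network_module=lycoris.kohya \
    --network_args "algo=ia3" \
    --optimizer_type "Prodigy" \
    --learning_rate 1.0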
 

felldude

From the SDXL page: ... Something doesn't quite add up here...

... IA3 with Prodigy works surprisingly well for people, even with simple default values, without captions, simply feeding it images.
One of the best LoRAs I have used was listed as an all-ages LoRA and trained (based on the captions baked into the LoRA) on nothing but hentai images... I won't call it out, but I looked at the settings it used, and what I noticed was weighted captions on every image.
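For anyone unfamiliar: in kohya's sd-scripts, weighted captions are enabled with the --weighted_captions flag and use the same (tag:weight) syntax as prompts, so a single caption file line might look like the following (the tags here are hypothetical, purely for illustration):

1girl, (blue eyes:1.3), (long hair:1.2), smile, (simple background:0.8)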
 

me3

If you've got the settings to share, I've got some very fun and problematic sets to try them on.
Great fun to see what I can turn some poor people into this time; with any luck it might be themselves :p
 

felldude

If you've got the settings to share, I've got some very fun and problematic sets to try them on.
Great fun to see what I can turn some poor people into this time; with any luck it might be themselves :p
This is a good

Some things that stood out to me: a very low learning rate (0.0005),
cosine with restarts and warm-up,
low memory being on (I have this on for SDXL but always had it off for SD 1.5),
Adam8bit instead of Adam full FP16,
20 epochs.

All those settings may be meaningless without the weighted captions.

It could stand to be cleaned up, but here are some of the settings, with examples of a caption:


{"ss_cache_latents":"True","ss_caption_dropout_every_n_epochs":"0","ss_caption_dropout_rate":"0.0","ss_caption_tag_dropout_rate":"0.0","ss_clip_skip":"2","ss_dataset_dirs":""n_repeats\": 4, \"img_count\": 74}}","ss_datasets":"[{\"is_dreambooth\": true, \"batch_size_per_device\": 2, \"num_train_images\": 296, \"num_reg_images\": 0, \"resolution\": [512, 512], \"enable_bucket\": true, \"min_bucket_reso\": 256, \"max_bucket_reso\": 1024, \"tag_frequency\": {\

\"completely nude\": 23, \"sweatdrop\": 6, \"swept bangs\": 5, \"bangs\": 10, \"fingernails\": 1, \"sweat\": 20, \"large penis\": 6, \"naughty face\": 6, \"foreskin\": 1, \"bed\": 9, \"dark-skinned male\": 1, \"interracial\": 1, \"navel\": 23, \"hand on hip\": 2, \"open mouth\": 25, \"standing\": 10, \"full body\": 10, \"cleft of venus\":


img_count\": 74, \"num_repeats\": 4, \"color_aug\": false, \"flip_aug\": false, \"random_crop\": false, \"shuffle_caption\": true, \"keep_tokens.ss_epoch":"10","ss_face_crop_aug_range":"None","ss_full_fp16":"False","ss_gradient_accumulation_steps":"1","ss_gradient_checkpointing":"False","ss_learning_rate":"0.0005","ss_lowram":"True","ss_lr_scheduler":"cosine_with_restarts","ss_lr_warmup_steps":"148","ss_max_grad_norm":"1.0","ss_max_token_length":"225","ss_max_train_steps":"2960","ss_mixed_precision":"fp16","ss_network_alpha":"8","ss_network_dim":"16","ss_network_module":"networks.lora",","ss_noise_offset":"None","ss_num_batches_per_epoch":"148","ss_num_epochs":"20","ss_num_reg_images":"0","ss_num_train_images":"296","ss_optimizer":"bitsandbytes.optim.adamw.AdamW8bit","ss_output_name":"","ss_prior_loss_weight":"1.0","ss_sd_model_hash":"7fcf3871","ss_sd_model_name":"model.safetensors","
 

me3

Some things that stood out to me: a very low learning rate (0.0005), cosine with restarts and warm-up, low memory being on, Adam8bit instead of Adam full FP16, 20 epochs. ...

It could stand to be cleaned up, but here are some of the settings, with examples of a caption: ...
Some "issues".
You say that the learning rate is "very low", which is a bit surprising consider it's 4-5x what a lot of loras i have uses, that includes Mr-Fox's kendra lora.
Captions, you mentioned weighted captions, but there's no parameter set for it, also there's no actual caption example included (as stated), there's just a tag frequency list, not that it really matters.
It's dreambooth, which apparently make quite the difference (never had the option to really compare so can't say much more)
The clip skip suggests it's for anime so might be less of an issue there, but the rank is far too low to be able to fit much data in to it.
Having done multiple runs there's clear signs there's too little "room"
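To put the "too little room" point in numbers: LoRA replaces each adapted weight W (d x k) with W + A @ B, where A is d x r and B is r x k, so trainable capacity grows linearly with the rank r (network_dim). A rough sketch for a typical SD 1.5 attention projection (the 768 sizes are illustrative):

d, k = 768, 768  # illustrative attention projection size in SD 1.5
for r in (8, 16, 64, 128):
    params = d * r + r * k  # trainable params per adapted weight matrix
    print(f"rank {r}: {params:,} params")
# rank 16 gives ~24.6k params per matrix vs ~98k at rank 64:
# 4x less capacity to store faces, outfits, poses, etc.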
 

Sepheyer

I tried SDXL 1.0, and it runs prohibitively slowly on my 6 GB GTX 1660 Ti. The image below took 80 minutes to render, meaning in a day I can render ~18 SDXL images. And this is using the Euler sampler. I can't imagine how long Heun would take. I might run one for lolz.

Not that anyone owes me anything, but this is bullshit, I want to see the manager, I want my fucking money bek.

So, yeah, no.
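For the Heun question, a back-of-envelope estimate (assumption: Heun is a second-order sampler that calls the model twice per step, so roughly double Euler's time at the same step count):

euler_minutes = 80
print(24 * 60 // euler_minutes)  # ~18 images per day with Euler
print(euler_minutes * 2)         # ~160 minutes per image with Heun, same steps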

 

Synalon

I tried SDXL 1.0, and it runs prohibitively slowly on my 6 GB GTX 1660 Ti. The image below took 80 minutes to render, meaning in a day I can render ~18 SDXL images. And this is using the Euler sampler. I can't imagine how long Heun would take. I might run one for lolz.

Not that anyone owes me anything, but this is bullshit, I want to see the manager, I want my fucking money bek.

So, yeah, no.

Did it take that long using ComfyUI?
 

tacoBlade

I tried once and generated something I may never be able to generate again whatsoever, lol. I have started playing with local SD (the billion different models on civitai are sooooooo good: LoRAs, ADetailer, inpainting) but I can never recreate these, especially this one. That blue veil/cloud/magical thing I can never generate again is just so surreal.
 

rogue_69

Daz to Stable Diffusion to FlowFrames. Created an image in Stable. Used it to do a Face Transfer in Daz (used Face Transfer Shapes, which is a must-have for Face Transfer). This got me a basis for the face shape. Rendered a video with the Face Transferred character (with a neutral-looking skin, since I wanted Stable to handle the textures). Used the same prompts for the Mov2Mov video in Stable that I used for the Face Transfer reference image. The combination of the Daz character face shape with the prompts seems to get me a really consistent character. I also rendered hair and clothing as a separate canvas, so I can easily just plop those on her at any time for the video.