[Stable Diffusion] Prompt Sharing and Learning Thread

felldude

Active Member
This is my 10th 2k refinement of Pony.
(Not native, I wish. I can native-tune 1.5 models, but it takes days.)

As the result of multiple 2k trainings, I now have a model that can produce this image by: 1280x1280 text-to-image, then 1280x1280 → 1600x1600 image-to-image at .65 denoise, then 1600x1600 → 2048x2048 at .5.
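If you want to approximate that chain outside ComfyUI, here is a minimal sketch in diffusers (the checkpoint path and prompt are placeholders; the original workflow is in ComfyUI, so treat this as an approximation, not the exact graph):

```python
import torch
from diffusers import StableDiffusionXLPipeline, AutoPipelineForImage2Image

# Stage 1: 1280x1280 text-to-image with the refined checkpoint
base = StableDiffusionXLPipeline.from_single_file(
    "pony_refinement.safetensors", torch_dtype=torch.float16).to("cuda")
image = base(prompt="your prompt here", width=1280, height=1280).images[0]

# Reuse the already-loaded weights for the img2img passes
img2img = AutoPipelineForImage2Image.from_pipe(base)

# Stage 2: upscale to 1600x1600 at .65 denoise
image = img2img(prompt="your prompt here",
                image=image.resize((1600, 1600)), strength=0.65).images[0]

# Stage 3: upscale to 2048x2048 at .5 denoise
image = img2img(prompt="your prompt here",
                image=image.resize((2048, 2048)), strength=0.5).images[0]
image.save("final_2048.png")
```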

The image is 100% AI, not some real image run through img2img at .01 denoise.

[attached image: ComfyUI_00700_.png]

It also does well with concepts it was not directly trained on.

[attached image: ComfyUI_00715_.png]
 

XenoXe

New Member
Bro, the quality is insane. How can I learn to make quality LoRAs like these? I've been learning LoRAs and stuff for like a year now and I'm pretty decent at it, but I've never come across quality this good. Bruh, teach ME!
 

Ludu

New Member
A quick question: what would I do to have the same character in the same pose but wearing different outfits? I tried to do it through img2img inpainting, but it didn't seem to work.
 

felldude

Active Member
The dataset is key. Most of my SD LoRAs use 100 or so images now. 1 or 2 "bad" images may not reinforce bad learning in a set of that size, but 10% will.

For example, I took 100 random images from the old HQ-Faces dataset and substituted 10 of them with the face of the Victoria 3D model. Rather than the 90 outweighing the 10, it biased the LoRA...presumably because the 10 were all similar faces at different angles and expressions. (The CLIP captions were tagged "3D" for those 10 and "realistic" for the 90.)
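One cheap way to catch that kind of imbalance before training is to count tag frequencies across the caption files. A hypothetical sketch, assuming one comma-separated WD14 .txt caption per image in a "dataset" folder:

```python
from collections import Counter
from pathlib import Path

# Count how often each tag appears across the WD14 caption files
caps = list(Path("dataset").glob("*.txt"))
counts = Counter()
for cap in caps:
    counts.update(t.strip() for t in cap.read_text().split(","))

# Tags on more than ~10% of images will steer training, per the rule above
for tag, n in counts.most_common(20):
    print(f"{tag}: {n}/{len(caps)} ({100 * n / len(caps):.0f}%)")
```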

My understanding is that to native fine-tune with Adam, even on a 1.5 checkpoint, you need 32GB of VRAM.
I can native fine-tune with Adafactor or Lion on 1.5, but I have never had a refinement that looked good.
Juggernaut, RealisticVision, and EpicRealism are all clearly based on the same refinement. (I don't know who did the first.)

With XL I am using LoRA merges, as only the big boys with their 8x A100s can native fine-tune.
(You could probably do Lion or Adafactor on a 32GB card, but not Adam.)

Even with the LoRAs, I have around 16GB of training data that I play around with, adjusting the weighting and rank until I get a model that doesn't have catastrophic loss. (PONY already has major loss.)
 

XenoXe

New Member
I appreciate that you're helping me out here. I also have a few questions. I'm not fully understanding the process of native fine-tuning; all I've ever done is train a LoRA from the ground up using custom data on the base model. The process of LoRA merging also intrigues me, because I've never seen a useful guide nor understood how it works. But I know that in cases like yours it makes a top-tier LoRA. I've tested it and found there is no style bleeding or artifacting in the image, it worked well with most Pony-based checkpoints, and on top of that the quality is insane. I want to learn this GODLY technique from you!

Also, I've been focusing mainly on the Pony XL model for now, and I know it has some issues with training, unlike SD 1.5. From my understanding of fine-tuning, there are only about 2 options in the Kohya trainer: Dreambooth fine-tuning and fine-tuning using LoRA, but I've never tested those. I have a plan to train an anime series with a maximum of 6 characters, with all the styles and such, but training it as just a LoRA will definitely overfit one way or another. So I wanted to hear some advice from you on a better solution.
 

felldude

Active Member
Native fine-tuning would be the closest thing to training a checkpoint from the ground up.
Dreambooth training is close to native fine-tuning, only you're creating a LoRA rather than training the entire checkpoint.


Adafactor, Lion, and Prodigy are considered inferior to Adam, however the resources needed are also less.
You can native fine-tune an SD 1.5 model with Lion on only 8GB of VRAM; Adam quadruples that.
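The gap is mostly optimizer state: Adam keeps two fp32 moment buffers per parameter, Lion keeps one. A back-of-envelope sketch, assuming fp32 states and roughly 860M trainable UNet parameters for SD 1.5 (activations, EMA copies, and mixed-precision tricks excluded):

```python
params = 860e6   # approx. trainable UNet parameters in SD 1.5
fp32 = 4         # bytes per value
gib = 2**30

weights = params * fp32 / gib       # master weights, ~3.2 GiB
grads   = params * fp32 / gib       # gradients, ~3.2 GiB
adam    = 2 * params * fp32 / gib   # exp_avg + exp_avg_sq, ~6.4 GiB
lion    = 1 * params * fp32 / gib   # momentum only, ~3.2 GiB

print(f"Adam: {weights + grads + adam:.1f} GiB before activations")
print(f"Lion: {weights + grads + lion:.1f} GiB before activations")
```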

When fine-tuning you also use the EMA weights, versus the non-EMA weights checkpoint used for inference.
(The SD 1.5 EMA checkpoint is almost 8GB.)

So an XL checkpoint with EMA weights ends up around 13GB, something most people can't even load, let alone train with.
(Non-EMA Juggernaut X is only 7.1GB.)

For XL models both those options are out of reach for most people, so I'll focus on the basic LoRA.

Civitai can train a basic LoRA for PONY up to 10,000 steps (50,000 to 80,000 image repeats depending on the batch size limit).
For Pony it uses a batch size of 5, so that allows LoRAs with an image count of 400-800 to train to convergence.
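Putting those figures together (plain arithmetic on the numbers above):

```python
steps, batch = 10_000, 5
repeats = steps * batch          # 50,000 image repeats at batch size 5

for n_images in (400, 800):
    print(f"{n_images} images -> {repeats // n_images} epochs")
# 400 images -> 125 epochs
# 800 images -> 62 epochs
```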

My rig does about 30 secs per iteration at a batch size of 5 on 1024x1024, and it will not train at 2048x2048 at all.
(Thus all but 2 of my XL trainings have been done by Civitai.)

You can train a high-rank LoRA and then adjust it using Kohya. I never merge a LoRA at full weight; it is usually .1-.2.
You can rank a LoRA down if it is overfitting. (You can rank up, but it is not advised.)
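felldude does this with Kohya's tooling; for reference, here is a minimal sketch of the same low-weight merge idea using diffusers instead (file names are placeholders, and `fuse_lora` is the diffusers mechanism, not necessarily what he uses):

```python
import torch
from diffusers import StableDiffusionXLPipeline

# Load the base checkpoint, then bake the LoRA in at low weight
pipe = StableDiffusionXLPipeline.from_single_file(
    "pony_base.safetensors", torch_dtype=torch.float16)
pipe.load_lora_weights("high_rank_lora.safetensors")
pipe.fuse_lora(lora_scale=0.15)   # merge at .1-.2, never full weight
pipe.unload_lora_weights()        # drop the adapter; fused weights remain
pipe.save_pretrained("merged_checkpoint")
```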

So, in short: with Civitai allowing 2k training and around the number of steps needed to refine a checkpoint (100k or so), you can merge a high-quality LoRA into a checkpoint.

Extensive testing should be done beforehand to make sure you're not causing catastrophic loss to the model. I would never merge my tanlines model, as I could not find enough images to make a model that doesn't create a beach scene when you ask for a cityscape.
 

XenoXe

New Member
Speaking of fine-tuning, how do you do the training, like in kohya_ss or OneTrainer etc., to get that native fine-tuning?
 

deadshots2842

Member
Also, can someone tell me how I can uninstall everything I installed for Kohya? It's giving me errors, and I don't want my PC filled with useless files I don't use. Anyone?
 

felldude

Active Member
If you did not use a venv, then the Python packages were copied directly into whichever of the 20 versions of Python you might be using.
You would also need to delete the cache folder, which contains the "install files".
 

felldude

Active Member
Anyone else find Euler draws better hands than dpmpp_2m or 3m? (Especially when doing img2img upscaling.)

[attached image: ComfyUI_00075_.png]
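If you want to A/B this yourself, here is a minimal sketch with diffusers (checkpoint path and prompt are placeholders; EulerDiscreteScheduler and DPMSolverMultistepScheduler are the usual diffusers counterparts of ComfyUI's euler and dpmpp_2m):

```python
import torch
from diffusers import (StableDiffusionXLPipeline, EulerDiscreteScheduler,
                       DPMSolverMultistepScheduler)

pipe = StableDiffusionXLPipeline.from_single_file(
    "checkpoint.safetensors", torch_dtype=torch.float16).to("cuda")

samplers = {"euler": EulerDiscreteScheduler,
            "dpmpp_2m": DPMSolverMultistepScheduler}

for name, cls in samplers.items():
    pipe.scheduler = cls.from_config(pipe.scheduler.config)
    gen = torch.Generator("cuda").manual_seed(42)   # same seed for a fair A/B
    pipe(prompt="portrait, detailed hands", generator=gen,
         num_inference_steps=30).images[0].save(f"hands_{name}.png")
```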
 

rogue_69

Newbie
I've been using KREA AI for the last few months, and they finally released their TOS, which has a no "Lewd/Pornographic" clause in it. Looks like I'm going to have to learn to use ComfyUI. Where would be the best place to start learning? Are there any good starter tutorials someone can suggest?
 

deadshots2842

Member
I found something called pip with about 7 GB of data, is that it?
 

felldude

Active Member
Have you ever tried regularization for the training? I know that it prevents the style-bleed effect, but idk what kind of images are valid for it.
It is a second step on each iteration to "temper" the result. I've heard it helps keep the CLIP data from getting extremely corrupted. Say you had an apple in 50% of your LoRA images that was drawn at an angle for some reason, but you used WD14 and it tagged "apple", so the CLIP is strongly being trained on apple.

Maybe just use -apple in your tagger.
Or, if you have a lot of images, 1,000 or so, and you notice 50-100 instances of "apple" compared to 999 "1girl"s, maybe throw in some regularization photos of an apple, or of 1girl holding an apple.
The issue would be that you need regularization for all subjects.
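The -apple route is the cheap one: either pass the tag to WD14's undesired-tags option or strip it from the caption files afterwards. A hypothetical post-processing sketch (assumes one comma-separated .txt caption per image):

```python
from pathlib import Path

UNDESIRED = {"apple"}   # hypothetical over-represented tag

# Remove the unwanted tag from every caption file in the dataset
for cap in Path("dataset").glob("*.txt"):
    tags = [t.strip() for t in cap.read_text().split(",")]
    cap.write_text(", ".join(t for t in tags if t not in UNDESIRED))
```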

Contrastive Language–Image Pre-training - CLIP

If you have ever had a LoRA that is garbled when the CLIP is used but works fine when it is disconnected, that is likely from bad text encoder training,
and could be from over-describing in the CLIP without regularization.
It could also be from too high a text encoder learning rate, but that is less likely, as most people reduce the TE rate.

There have been claims of teaching an art style with 10 photos of an apple in the style and 10 regularization images of a photo of an apple.
I haven't tried it, but my intuition says it would just be a LoRA that could draw an apple.


I found something called pip with about 7 GB of data, is that it?
Likely, but I would recommend at minimum knowing which version of Python you used for the install (Conda, Windows App Store Python, Python from python.org) before you start deleting folders.

You can also look up where the cache is stored, as it should be a few GB as well.
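pip itself can do both of those: it has a built-in cache subcommand (these are real pip commands; just make sure you invoke them with the same Python install Kohya used):

```python
import subprocess, sys

# Show where pip keeps its cache, then delete the cached wheels
subprocess.run([sys.executable, "-m", "pip", "cache", "dir"])
subprocess.run([sys.executable, "-m", "pip", "cache", "purge"])
```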
 

deadshots2842

Member
Since I can't make Superwoman how I want, I'm thinking of using the ReActor extension. Can anyone tell me if it's worth trying, and does it work on 18+ images? Like, can I swap Scarlett Johansson's face onto anyone?