Don't forget or overlook the awesome LoRA training guide on rentry that I often link to. It was a huge help for me, and it gets updated on a regular basis as new knowledge, tools, and other developments appear. When something new comes out, he usually adds a section about it to the guide and follows up with his conclusions after doing tests.

Bros, I'll be grateful for any corrections/additional tips to put into this post:
---
Troubleshooting LORA Training
So, it took me a few tries to successfully train a LORA: partly because I am a moron, partly because of older hardware.
First, rule out issues with the dataset by using Schlongborn's dataset, included in his LORA training post. That dataset works, and since it has only 20 images, you are guaranteed to waste minimal time while troubleshooting. His post also includes a LORA you can check against as a reference, using a strength of 0.7 for the model and 1.0 for clip. Here is a ComfyUI workflow that you can just plug and play:

[ComfyUI workflow attachment]
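If you'd rather sanity-check the reference LORA outside ComfyUI, here is a minimal diffusers sketch. All file names and the prompt are placeholders, the base model is assumed to be SD 1.5, and note that diffusers exposes a single LoRA scale rather than ComfyUI's separate model/clip strengths:

```python
from diffusers import StableDiffusionPipeline

# Assumed SD 1.5 base model; swap in whatever checkpoint you actually use.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5"
).to("cuda")

# Placeholder file name: point this at the reference LORA from the post.
pipe.load_lora_weights("my_lora.safetensors")

image = pipe(
    "your test prompt here",                # placeholder prompt
    cross_attention_kwargs={"scale": 0.7},  # one scale here, not separate model/clip
    num_inference_steps=25,
).images[0]
image.save("lora_check.png")
```

If the output looks nothing like the reference renders, the problem is on your end, not in the dataset.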
Now, if you train a LORA on that dataset, this is what can go wrong:
Getting a black render - you used "Network Rank (Dimension)" with a value of 1. I am a moron here, because Schlongborn's post says to use 128, but I overlooked it. For some reason 1 is the default in Kohya's September 2023 install, and with all those dials I just missed it. Use at least 128 for this parameter on your initial tries, and do the same for "Network Alpha": make it 128. I don't know whether 128/1 or some such works; I just know that 128/128 works. Why the default is 1/1 is beyond me. Interestingly, this also affects the size of the LORA: 1/1 gives you a ~10 MB file, while 128/128 gives you a ~150 MB LORA.
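If you want to confirm what rank a finished file actually has, here is a quick sketch using the safetensors library. The file name is a placeholder, and the key naming follows Kohya's sd-scripts convention:

```python
from safetensors import safe_open

# "lora_down" weights are shaped (network_dim, in_features), so the first
# dimension reads off the rank: 1 for the broken default, 128 if you set it right.
with safe_open("my_lora.safetensors", framework="pt", device="cpu") as f:
    for key in f.keys():
        if "lora_down" in key:
            print(key, tuple(f.get_tensor(key).shape))
            break  # one layer is enough to see the rank
```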
Getting an unresponsive LORA - i.e. you get images rendered, but you can't tell if it worked, because nothing looks like what you'd expect. That's because the training didn't work out. Here's what's up: while the LORA trains, the console shows a running loss, like this:
[screenshot: training console printing the running loss]
And if you are getting "loss=NaN", the LORA ends up with zeroes for weights. The likely culprit is the "Mixed precision" setting. It should be "no", because your hardware probably doesn't support the fp16 or bf16 options for whatever reason. It might actually support them, but since Kohya uses a bunch of third-party modules, one of those modules may simply misidentify what you have. So set "Mixed precision" to "no" and restart the training: if the loss starts showing an actual number, you have probably fixed the issue. Strangely, "Save precision" set to fp16 is fine.
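For the curious, here is a tiny illustration (plain PyTorch, nothing Kohya-specific) of how half precision can produce the NaNs that then poison the weights:

```python
import torch

# fp16 has a narrow range: tiny values underflow to 0, big ones overflow to inf.
small = torch.tensor([1e-8], dtype=torch.float16)
print(small)       # tensor([0.], dtype=torch.float16)

big = torch.tensor([70000.0], dtype=torch.float16)  # fp16 max is ~65504
print(big)         # tensor([inf], dtype=torch.float16)

# inf - inf is NaN; once a NaN shows up in the loss it propagates into
# every gradient, which is how you end up with a zeroed-out LORA.
print(big - big)   # tensor([nan], dtype=torch.float16)
```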
Verify the LORA. Kohya has a tool for this - you can check either your own LORA or any LORA you downloaded. A bad LORA's output section will look different and will have zeroes all over the place:
[screenshot: checker output for a broken LORA, weights all zeroes]
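If you'd rather not go through Kohya's tool, a rough stand-in is to open the .safetensors file yourself and flag any all-zero tensors (file name is again a placeholder):

```python
from safetensors import safe_open

path = "my_lora.safetensors"  # placeholder: your trained or downloaded LORA
with safe_open(path, framework="pt", device="cpu") as f:
    zeroed = [k for k in f.keys() if f.get_tensor(k).count_nonzero().item() == 0]

if zeroed:
    print(f"{len(zeroed)} all-zero tensors -- this LORA is probably broken")
else:
    print("no all-zero tensors found")
```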
I have not seen anything remotely close to this guide anywhere else. Most people just post their "guide" and abandon it the next minute.