
AI LORAs for Wildeer's Lara Croft (Development Thread)

Mr-Fox

Well-Known Member
Jan 24, 2020
1,401
3,791
So, using WD1.4 for tagging. There's a threshold setting for items and for characters.

Let's see how it treats this photo using different settings of General threshold (Adjust `general_threshold` for pruning tags (less tags, less flexible)) and Character threshold (useful if you want to train with character).

View attachment 2919654

SmilingWolf/wd-v1-4-convnextv2-tagger-v2

General threshold | Character threshold | Tags
0.001.00Too much to be useful: see file 00-10.txt
0.250.75WildeerLaraCroft, 1girl, solo, long hair, breasts, looking at viewer, simple background, brown hair, black hair, gloves, bare shoulders, brown eyes, ponytail, ass, parted lips, black gloves, looking back, fingerless gloves, from behind, nail polish, leotard, lips, shiny skin, thigh strap, blue background, bent over, realistic, kneepits, hand on own ass, mole on ass
0.500.50WildeerLaraCroft, 1girl, solo, long hair, looking at viewer, brown hair, gloves, brown eyes, ponytail, ass, parted lips, looking back, fingerless gloves, blue background, bent over, realistic
0.750.25WildeerLaraCroft, 1girl, solo, long hair, looking at viewer, brown hair, brown eyes, ass, fingerless gloves
1.000.00Too much to be useful: see file 10-00.txt
---------
0.050.05WildeerLaraCroft, 1girl, solo, long hair, breasts, looking at viewer, blush, smile, open mouth, bangs, large breasts, simple background, brown hair, shirt, black hair, gloves, bare shoulders, brown eyes, medium breasts, underwear, standing, panties, ponytail, ass, thighs, parted lips, teeth, sleeveless, black gloves, shiny, looking back, artist name, fingerless gloves, from behind, nail polish, mole, leotard, lips, fingernails, head tilt, gradient, legs, one-piece swimsuit, shiny skin, parted bangs, bare arms, gradient background, tattoo, thigh strap, leaning forward, anus, feet out of frame, cameltoe, watermark, blue background, bent over, tan, blue nails, tanlines, thong, ass grab, realistic, nose, ass focus, partially visible vulva, kneepits, blue leotard, hand on own ass, grabbing own ass, spread ass, anus peek, spanked, mole on ass, slap mark, hands on ass, hands on own ass
0.100.10WildeerLaraCroft, 1girl, solo, long hair, breasts, looking at viewer, bangs, simple background, brown hair, black hair, gloves, bare shoulders, brown eyes, medium breasts, underwear, standing, panties, ponytail, ass, thighs, parted lips, sleeveless, black gloves, shiny, looking back, fingerless gloves, from behind, nail polish, leotard, lips, fingernails, head tilt, shiny skin, tattoo, thigh strap, leaning forward, feet out of frame, cameltoe, blue background, bent over, tan, ass grab, realistic, nose, ass focus, kneepits, hand on own ass, grabbing own ass, spanked, mole on ass, slap mark
0.900.90WildeerLaraCroft, 1girl, solo, ass
0.950.95WildeerLaraCroft, 1girl, solo

So, around 0.10 the tagging is sensitive enough to pick up the slap mark and the leotard, although the setting has to be at 0.05 to pick up the color of the leotard.

PS. Naturally, this is not definitive, just a directional test.
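
To make the two settings concrete, here is a minimal sketch of what thresholding does. The confidence values and the `filter_tags` helper are invented for illustration, and the `>=` comparison is an assumption; the actual WD1.4 tagger code may differ in its details.

```python
def filter_tags(scores, character_tags, general_threshold, character_threshold):
    """Keep each tag whose confidence meets the threshold for its category."""
    kept = []
    for tag, score in scores.items():
        # Character tags are judged against their own threshold.
        threshold = character_threshold if tag in character_tags else general_threshold
        if score >= threshold:
            kept.append(tag)
    return kept


# Hypothetical confidences, made up for this example only.
scores = {
    "1girl": 0.99,
    "solo": 0.97,
    "fingerless gloves": 0.78,
    "leotard": 0.31,
    "slap mark": 0.11,
    "blue leotard": 0.06,
    "lara croft": 0.40,
}
character_tags = {"lara croft"}

loose = filter_tags(scores, character_tags, 0.10, 0.10)
strict = filter_tags(scores, character_tags, 0.95, 0.95)
print(loose)
print(strict)
```

Lower thresholds let weaker guesses through, which is why the tag list grows so quickly near 0.05.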
Interesting stuff. I'm not sure I get it completely. If I got it right, these are settings that affect how the image is interrogated and what is picked up from the image. Or did I misunderstand completely?

Googled it and found some info on this colab page:

There are other models also for image tagging. Did you try any other?
 
  • Like
Reactions: Sepheyer

Sepheyer

Well-Known Member
Dec 21, 2020
1,519
3,581
Interesting stuff. I'm not sure I get it completely. If I got it right, these are settings that affect how the image is interrogated and what is picked up from the image. Or did I misunderstand completely?

Googled it and found some info on this colab page:

There are other models also for image tagging. Did you try any other?
I just tried another WD1.4 captioning model, "SmilingWolf/wd-v1-4-swinv2-tagger-v2", and its output is indistinguishable from this one's.

General threshold | Character threshold | Tags
0.050.05WildeerLaraCroft, 1girl, solo, long hair, breasts, looking at viewer, blush, smile, open mouth, bangs, large breasts, simple background, brown hair, shirt, black hair, gloves, bare shoulders, brown eyes, medium breasts, underwear, standing, panties, ponytail, ass, thighs, parted lips, teeth, sleeveless, black gloves, shiny, looking back, artist name, fingerless gloves, from behind, nail polish, mole, leotard, lips, fingernails, head tilt, gradient, legs, one-piece swimsuit, shiny skin, parted bangs, bare arms, gradient background, tattoo, thigh strap, leaning forward, anus, feet out of frame, cameltoe, watermark, blue background, bent over, tan, blue nails, tanlines, thong, ass grab, realistic, nose, ass focus, partially visible vulva, kneepits, blue leotard, hand on own ass, grabbing own ass, spread ass, anus peek, spanked, mole on ass, slap mark, hands on ass, hands on own ass
0.900.90WildeerLaraCroft, 1girl, solo, ass

The 0.05/0.05 settings gave me pretty much the same group of tokens,
and 0.90/0.90 gave an exact match.
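
A quick way to check "pretty much the same" is to diff the two captions as sets. This is only a sketch: the helper names are hypothetical and the two short captions are placeholders, not the real tagger output.

```python
def parse_caption(caption):
    """Split a comma-separated caption into a set of stripped tags."""
    return {tag.strip() for tag in caption.split(",") if tag.strip()}


def compare_captions(caption_a, caption_b):
    """Return the tags unique to each caption and their Jaccard similarity."""
    tags_a, tags_b = parse_caption(caption_a), parse_caption(caption_b)
    union = tags_a | tags_b
    jaccard = len(tags_a & tags_b) / len(union) if union else 1.0
    return {
        "only_a": sorted(tags_a - tags_b),
        "only_b": sorted(tags_b - tags_a),
        "jaccard": jaccard,
    }


# Placeholder captions standing in for the output of the two models.
convnext_caption = "WildeerLaraCroft, 1girl, solo, long hair"
swin_caption = "WildeerLaraCroft, 1girl, solo, ponytail"

result = compare_captions(convnext_caption, swin_caption)
print(result)
```

A Jaccard score of 1.0 means an exact match; anything lower shows which tags one model found and the other did not.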
 
  • Red Heart
Reactions: Mr-Fox

Sepheyer

Well-Known Member
Dec 21, 2020
1,519
3,581
So with LORA version 1.0 being a failure, I decided to start a new LORA training with a tighter image set. The v1 image set was ~150 images, and for v2 I decided to go with exactly 20, because this somewhat worked in the past on two character LORAs I trained. First I eliminated all images where the chara's face wasn't looking directly into the camera, which left me with ~80 images. Then I pruned another set of images where the chara was lying down or had her back turned towards the camera or wtf. Then I pruned what I thought was bad quality (all of them kinda are when downscaled to 512x512 tbh), and got left with these twenty images:

imageset.png
For the text files I ran WD1.4 at 0.10 getting this for the first image, as an example:
Hmm, I never reviewed the captioning, so I am only now realizing it had questionable tokens such as: web address, inverted nipples, lip biting. On the next run I'll need to manually get rid of these.

I trained using Photogen 3.4, Kohya settings attached in the 002.zip file.

The 512x512 images are rather trashy, compared to v1:
but 1152x1728 images show promise. The Wildeer's Lara likeness has rather disappeared, but the images are great. It's gonna take me a while to generate a bunch of these tho, so I am posting the second one generated:

a_16877_.png

Imma post this new (version 2) LORA to Civitai once I've generated a few more images for the gallery, so it might take up to 24 hours.
 
  • Like
Reactions: Mr-Fox

Mr-Fox

Well-Known Member
Jan 24, 2020
1,401
3,791
This is good work. It takes many tries while learning before you get a good Lora. A lot of patience is needed. Don't be too hasty to call it a day. I have made several posts about the importance of editing the captions manually. Get rid of tags such as "watermark" or other things that you don't want. Of course you need to make sure there aren't any watermarks in the images to begin with. For the image quality, in Photoshop after resizing, use Smart Sharpen to bring back details and fidelity, and use the Camera Raw filter to improve light and colors etc.
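
The manual caption cleanup can be partly scripted. Below is a rough sketch, assuming captions are comma-separated tag lists like the ones posted above; `clean_caption` and the `UNWANTED` set are hypothetical names, and the blocklist is just an example.

```python
def clean_caption(caption, unwanted):
    """Drop unwanted tags from a comma-separated caption, keeping tag order."""
    unwanted_lower = {tag.lower() for tag in unwanted}
    kept = []
    for tag in caption.split(","):
        tag = tag.strip()
        if tag and tag.lower() not in unwanted_lower:
            kept.append(tag)
    return ", ".join(kept)


# Example blocklist; extend it with whatever the tagger got wrong.
UNWANTED = {"watermark", "web address", "artist name"}

example = "WildeerLaraCroft, 1girl, solo, watermark, brown hair, web address"
cleaned = clean_caption(example, UNWANTED)
print(cleaned)
```

The same function could be run over every caption .txt file in the training folder, but it is still worth reading the results, since a blocklist only catches the tags you already know about.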
 
  • Like
Reactions: Sepheyer

me3

Member
Dec 31, 2016
316
708
With regards to choosing models, it's recommended to avoid pruned versions, so if possible download the full version if it exists.
Regarding picking images, too much eye makeup and/or shadowing can cause a lot of issues, so that's worth keeping in mind.
 
  • Like
Reactions: Mr-Fox

Sepheyer

Well-Known Member
Dec 21, 2020
1,519
3,581
Next version is up on Civitai: . "Official name" is 1.0.2.

Imma train the next bunch of LORA with the same settings but using Zovya's Photoreal v2 - that might be what caused the v1 to look so Wildeer. We'll know in a few days.

a_16904_.png
It just dawned on me that Lara in a red leotard is so fucking Natasha from the Red Alert.
 
  • Love
Reactions: Mr-Fox

Mr-Fox

Well-Known Member
Jan 24, 2020
1,401
3,791
Next version is up on Civitai: . "Official name" is 1.0.2.

Imma train the next bunch of LORA with the same settings but using Zovya's Photoreal v2 - that might be what caused the v1 to look so Wildeer. We'll know in a few days.

View attachment 2921917
It just dawned on me that Lara in a red leotard is so fucking Natasha from the Red Alert.
I wouldn't mind fucking Natasha.. badun tss.. :sneaky::D

I made something also.

Classic Tomb Raider Angelina.png
 
  • Red Heart
Reactions: Sepheyer

me3

Member
Dec 31, 2016
316
708
Since my knowledge of how this is "meant" to look and what specific details we're looking for is limited, I only really have the images to go by, and they are quite varied in features.
Which, if any, of these faces are close to the goal? Going by the training images the lower jaw seems too "strong"/wide, lips seem too large and the face as a whole is a bit too "round".
Columns are the lora weights, rows are a selection of epochs. Ignore the blue hair, as that's just there to check the flexibility: the final epoch was at 10200 steps, so i wanted to see if it was completely burned.
grid.jpg
 

Sepheyer

Well-Known Member
Dec 21, 2020
1,519
3,581
Since my knowledge of how this is "meant" to look and what specific details we're looking for is limited, I only really have the images to go by, and they are quite varied in features.
Which, if any, of these faces are close to the goal? Going by the training images the lower jaw seems too "strong"/wide, lips seem too large and the face as a whole is a bit too "round".
Columns are the lora weights, rows are a selection of epochs. Ignore the blue hair, as that's just there to check the flexibility: the final epoch was at 10200 steps, so i wanted to see if it was completely burned.
View attachment 2926798
The 000030-1.0:1.0 nails it.
 

felldude

Member
Aug 26, 2017
467
1,430
Don't know if it's been pointed out, but the trigger: WildeerLaraCroft
will blend a lara with deer antlers, esp. at high values such as (WildeerLaraCroft:1.4)

This is due to the text encoder having a form of auto spell correct, as I spell background wrong most of the time, like back+round
(The correction list is longer than the word list in some cases)

You could be training the text encoder against a deer lara blend with that trigger.

Also, if the character tagger detects Lara croft by itself without prompting, you will likely be training the U-net against images of Lara croft even if you don't caption or train the text encoder.
 
  • Like
  • Haha
Reactions: Mr-Fox and Sepheyer

Sepheyer

Well-Known Member
Dec 21, 2020
1,519
3,581
Don't know if it's been pointed out, but the trigger: WildeerLaraCroft
will blend a lara with deer antlers, esp. at high values such as (WildeerLaraCroft:1.4)

This is due to the text encoder having a form of auto spell correct, as I spell background wrong most of the time, like back+round
(The correction list is longer than the word list in some cases)

You could be training the text encoder against a deer lara blend with that trigger.

Also, if the character tagger detects Lara croft by itself without prompting, you will likely be training the U-net against images of Lara croft even if you don't caption or train the text encoder.
This post shows in great detail how 1.0.2 LORA was created: link.
 

Sepheyer

Well-Known Member
Dec 21, 2020
1,519
3,581
The model Clarity 2 seems to work really well with this 1.0.2 LORA. I keep running thru the list of the models I have, and for this LORA I started to frequent Clarity 2.

Here is the LORA at 0.5/0.5 strength.

a_17095_.png
 

felldude

Member
Aug 26, 2017
467
1,430
This post shows in great detail how 1.0.2 LORA was created: link.
I read it and the post before. I'm not sure if you left it in the captions or not, but I can say for sure that across a batch of 50 images, antlers or something antler-like came across in 80% of them.

It was especially noticeable in a custom merge I use but I tested in juggernaut and a few others as well.

ComfyUI_00003_.png

It's possible that if you use that trigger across all images you will successfully train the text encoder not to auto correct to wild deer. However, it is a fact that it is doing that without a lora.
 
  • Like
Reactions: Mr-Fox

me3

Member
Dec 31, 2016
316
708
The 000030-1.0:1.0 nails it.
Unfortunately that seems to be a massive case of seed and prompt hitting a very very limited sweetspot. Generating A LOT of images with the same conditions doesn't provide the same face.

It's rather annoying because throughout the whole training there are elements that got picked up extremely well.
Like the thigh tattoo: you can clearly see there are symbols in the top band, and it's not being mistaken for a strap/belt or some sort of fishnet stocking, which the taggers like to call it. Same with her back/side tattoo, you clearly see it's some kind of text.

Generating hundreds of face images across all the epochs, the "wrong" facial features it picks up show up fairly early on and remain consistent throughout the training, and that's the case in pretty much all of the different trainings. It's not from the images as it's not a feature that is represented there, and it's not due to captioning as i've tried multiple variations including no captions at all.

I'm generating images on the model used for training with just the trigger (not using the loras obviously) and it seems to be producing a consistent face, which means it exists in the model already. As i'm writing i'm running a check to see if it's the trigger fully or if it's a case of partial matching/fixing.
Generally when i've been doing "non just screwing around" trainings i've checked if the choice of triggers have been "safe", but since i've yet to have that happen i didn't bother this time around.
It seems my trigger word gets treated as a "match" to laracroft, but NOT lara croft, and looking back at some of the earlier grids the first "barely learn anything yet" epoch image has a similarity. So it seems like the training is being contaminated by this...

Since i was already running this kind of test, i did one for WildeerLaraCroft as well... and it seems to give fairly close matching images to LaraCroft.

Just noticed the post about merging deer/antlers. So far i've not seen that happen; you get a "mangled" face etc, but you get that with most things when drastic weight is involved. So this issue could be down to the prompt interpreter, of which there are many, or some kind of extension relating to how the prompt is being parsed etc.
 
  • Like
Reactions: Mr-Fox

Sepheyer

Well-Known Member
Dec 21, 2020
1,519
3,581
I read it and the post before. I'm not sure if you left it in the captions or not, but I can say for sure that across a batch of 50 images, antlers or something antler-like came across in 80% of them.

It was especially noticeable in a custom merge I use but I tested in juggernaut and a few others as well.

View attachment 2926994

It's possible that if you use that trigger across all images you will successfully train the text encoder not to auto correct to wild deer. However, it is a fact that it is doing that without a lora.
Good guess about Wildeer > Deer by the text encoder. But, I mean, for me that LORA starts breaking down at around 0.9 strength and you have it at 1.4, which is wow.

But I know so little about how LORAs work, and other than what I offered in the post about how it was made, I don't have the skills to troubleshoot. Your guess is as good as mine.
 
  • Like
Reactions: Mr-Fox

felldude

Member
Aug 26, 2017
467
1,430
Good guess about Wildeer > Deer by the text encoder. But, I mean, for me that LORA starts breaking down at around 0.9 strength and you have it at 1.4, which is wow.

But I know so little about how LORAs work, and other than what I offered in the post about how it was made, I don't have the skills to troubleshoot. Your guess is as good as mine.
I can show you your text encoder training at work
(Realistic Vision with Seed 1)

Wildeer with no lora

ComfyUI_00032_.png

Wildeer with 1.0 Clip Only no model (Lara Croft Wildeer 1.0.2)


ComfyUI_00034_.png

Wildeer with 1.0 Lora Clip and Model (Lara Croft Wildeer 1.0.2)


ComfyUI_00033_.png

The 1.4 was not with your lora, it was testing the phrase without a lora to see if the text encoder would do anything with it.
For Stable Diffusion Web UI users, think (((Wildeer)))
I may be off on the number of () but you get the idea
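
The number of parentheses can be estimated. This sketch assumes the AUTOMATIC1111 web UI convention that each pair of ( ) multiplies attention by 1.1; other front ends may parse prompts differently, and both function names are hypothetical.

```python
import math

# Assumption: one pair of parentheses multiplies attention by 1.1.
PAREN_FACTOR = 1.1


def paren_weight(depth):
    """Effective weight of a token wrapped in `depth` pairs of parentheses."""
    return PAREN_FACTOR ** depth


def parens_for_weight(weight):
    """Nesting depth whose effective weight is closest to the target weight."""
    exact = math.log(weight) / math.log(PAREN_FACTOR)
    lower, upper = math.floor(exact), math.ceil(exact)
    return min((lower, upper), key=lambda depth: abs(paren_weight(depth) - weight))


depth = parens_for_weight(1.4)
print(depth, round(paren_weight(depth), 4))
```

Under that assumption three pairs come to about 1.33 and four pairs to about 1.46, so (Wildeer:1.4) falls between the two.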

That is why I always add FD before any of the lora trigger words that I use, and also test them just in case.
Such as FDGlory instead of just Glory
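
Before generating test images, a crude text check can flag triggers that contain ordinary words. This is only a substring heuristic with a hand-picked word list; it is not the CLIP tokenizer, and `embedded_words` and `COMMON_WORDS` are hypothetical names.

```python
def embedded_words(trigger, vocabulary, min_length=4):
    """Return vocabulary words that appear inside the trigger, ignoring case."""
    lowered = trigger.lower()
    return sorted(
        word for word in vocabulary
        if len(word) >= min_length and word.lower() in lowered
    )


# Tiny hand-picked word list for illustration; a real check would use a larger one.
COMMON_WORDS = {"deer", "wild", "lara", "croft", "glory"}

hits = embedded_words("WildeerLaraCroft", COMMON_WORDS)
print(hits)
```

A hit is only a reason to run the no-lora generation test on the trigger, not proof that the model will react to it.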


Taking the term (WildeerLaraCroft:.5) with no Lora

Seed 1 will generate this image

ComfyUI_00040_.png
Your lora is training against that image if the trigger word is used.

And with the lora's model and clip values both at 1 we will get

ComfyUI_00041_.png
 
  • Like
Reactions: Mr-Fox

Sepheyer

Well-Known Member
Dec 21, 2020
1,519
3,581
I can show you your text encoder training at work
(Realistic Vision with Seed 1)

Wildeer with no lora

View attachment 2927077

Wildeer with 1.0 Clip Only

View attachment 2927079

Wildeer with 1.0 Lora Clip and Model

View attachment 2927083

The 1.4 was not with your lora, it was testing the phrase without a lora to see if the text encoder would do anything with it.
For Stable Diffusion Web UI users, think (((Wildeer)))
I may be off on the number of () but you get the idea

That is why I always add FD before any of the lora trigger words that I use, and also test them just in case.
Such as FDGlory instead of just Glory
Ohh, we are probably talking about two different contexts here and they do not match. By default I thought you meant why this thread's LORA gives antlers at 1.4 strength. But it was FDMaya-Morena LORA that generated that picture here and it wasn't this thread's LORA. Then I really don't know, I'd think the Maya-Morena's creator could shed light, but I don't really know.
 
  • Like
Reactions: Mr-Fox

felldude

Member
Aug 26, 2017
467
1,430
Just noticed the post about merging deer/antlers. So far i've not seen that happen; you get a "mangled" face etc, but you get that with most things when drastic weight is involved. So this issue could be down to the prompt interpreter, of which there are many, or some kind of extension relating to how the prompt is being parsed etc.
I agree about the mangled face but it was done at that high of a value to show what the model was blending.
Here is the same seed same prompt at a lower value in Realistic vision

WildeerLaraCroft:1.1

ComfyUI_00046_.png

LaraCroft:1.1

ComfyUI_00047_.png

Both produce an image of lara croft without any lora, but if you run a batch of images with WildeerLaraCroft they will almost all have strange artifacts.

Ohh, we are probably talking about two different contexts here and they do not match. By default I thought you meant why this thread's LORA gives antlers at 1.4 strength. But it was FDMaya-Morena LORA that generated that picture here and it wasn't this thread's LORA. Then I really don't know, I'd think the Maya-Morena's creator could shed light, but I don't really know.
That lora was connected but not active. The value was 0 and 0
I posted a whole breakdown using your lora above.



I'm the creator and I wasn't satisfied with the results
 
  • Like
Reactions: Mr-Fox