I have made a few TIs (available for download). Some people run them at a high learning rate to start with, then turn it down massively as it starts to produce good results. Frankly, those always look a bit off or rushed to me.
I have a fairly standard 'recipe' for a photorealistic-type TI:
- At least 60 images. Good ones. Varied poses, clothes and, especially, expressions. Try to get a face in profile or two. Some full body, some close-up faces. Curate them well and ensure you manually crop and resize them to 512x512. Don't use automated tools, they're all shit! MS Paint is fine. Ideally, have over 100 images. Don't upscale shit photos, go looking for a better quality original instead. All this takes time. It matters though. Don't bake a cake with mouldy flour and rotten eggs.
- Some guides recommend the mirroring option to double the number of images. This is fine if you're training on an object, but how many people have a truly symmetrical face? You'll just make them look a bit weird. Don't use this option.
- Use a completely unambiguous name and token for your embedding e.g. for Kylie Minogue I used "KYL13M". That way SD won't try to use its default Kylie training which looks nothing like her. No chance of any influence.
- A learning rate of 0.0015. I don't vary it, seems OK for me. It's pretty low so you shouldn't overtrain. Overtraining is mostly an issue for the impatient who try to rush it.
- 12-20 Vectors per token. No less than 12 for a person, ensure you have at least 5 images per vector. Don't bother with more than 20. I once tried 40, the law of diminishing returns kicked in.
- Use SD1.5 as the base model for training to give optimum applicability across other models.
- Set 'subject' instead of 'style'.
- Set a good prompt in the txt2img tab - one you would probably use when generating the finished product. Avoid weightings though, keep it all unweighted. Avoid LoRAs or other embeddings too if you can. Set it to preview (and save an embedding) every 500 or 1000 steps. Give it a fixed seed so you're always generating previews on the same prompt and seed.
- Set the steps limit to 30,000.
- Hit 'Train embedding', open the preview images folder and find something else to do for several hours(!)
- Your previews will be terrible, SD1.5 is actually an awful model for photorealistic people. You should still be able to find the point where it really starts to settle down though, usually around 17k steps or more.
- Copy the relevant embeddings (the .pt files, suffixed "-n" where n is the number of steps) for the previews that look best into the main \embeddings folder.
- Set up the X/Y/Z plot script with Prompt S/R to switch from KYL13M-18000 to KYL13M-19000 to KYL13M-20000 etc. Run a quick test in your model of choice, NOT SD1.5 itself: a couple of images for each draft embedding, with fixed seeds so every draft is tested on the same couple of seeds.
- Compare the results to see which is best. Remember, if you've spent long enough curating the images, the not-Artificial Intelligence in your brain will also be trained on the subject! You should inherently know which are closest to the real deal.
- Run some more test images to check. Try a few really wacky prompts (including nudity) to ensure you've not baked in clothes, poses, expressions or background items.
- Run some good SFW images off as previews for Civitai and upload it there for everyone to download.
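
If you'd rather not do the arithmetic by hand, here's a rough Python sketch of two bits of the recipe: picking a vector count from the number of images (12-20 vectors, at least 5 images each) and building the list of draft names for the Prompt S/R box. The function names are made up for illustration and aren't part of any tool. I'm also assuming the Prompt S/R box takes a comma-separated list with the text to search for first, so check that against your own setup.

```python
# Hypothetical helpers for the recipe above; the names are illustrative only.

MIN_VECTORS = 12        # "No less than 12 for a person"
MAX_VECTORS = 20        # "Don't bother with more than 20"
IMAGES_PER_VECTOR = 5   # "at least 5 images per vector"


def vectors_for(image_count: int) -> int:
    """Largest vector count the image set supports, capped at MAX_VECTORS."""
    supported = image_count // IMAGES_PER_VECTOR
    if supported < MIN_VECTORS:
        needed = MIN_VECTORS * IMAGES_PER_VECTOR
        raise ValueError(
            f"{image_count} images is too few; need at least {needed}"
        )
    return min(supported, MAX_VECTORS)


def draft_names(token: str, first: int, last: int, every: int) -> list[str]:
    """Draft embedding names for steps first..last inclusive, e.g. KYL13M-18000."""
    return [f"{token}-{steps}" for steps in range(first, last + 1, every)]


def prompt_sr_field(names: list[str]) -> str:
    """Join the names into one comma-separated string (assumed S/R format)."""
    return ", ".join(names)


if __name__ == "__main__":
    print(vectors_for(60))
    print(prompt_sr_field(draft_names("KYL13M", 18000, 20000, 1000)))
```

For the swap to do anything, the test prompt itself has to contain the first name in the list (KYL13M-18000 here), since that is the text being searched for and replaced.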