Before you get too invested in Automatic1111, do try ComfyUI. Do what's called a "portable install" by clicking "direct link to download" on the ComfyUI GitHub page.
The biggest selling point is that upscaling can be done as latent -> latent -> latent, rather than the latent -> image -> latent -> image round trip that A1111 offers. I think this alone is ComfyUI's killer feature.
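ComfyUI wires this up as a node graph, but just to illustrate the idea of staying in latent space, here's a rough sketch using the diffusers library instead of either UI (the model IDs, prompt and step counts are examples, not anything from this thread):

```python
import torch
from diffusers import StableDiffusionPipeline, StableDiffusionLatentUpscalePipeline

# Base txt2img pipeline plus a latent-space upscaler.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
upscaler = StableDiffusionLatentUpscalePipeline.from_pretrained(
    "stabilityai/sd-x2-latent-upscaler", torch_dtype=torch.float16
).to("cuda")

prompt = "a watercolor lighthouse at dusk"

# latent -> latent: ask for latents instead of a decoded image...
low_res_latents = pipe(prompt, output_type="latent").images

# ...and hand them straight to the upscaler, decoding to pixels only once at the end.
image = upscaler(prompt=prompt, image=low_res_latents, num_inference_steps=20).images[0]
image.save("upscaled.png")
```

The decode/re-encode step in the middle of the A1111 flow simply never happens here.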
On top of that, every image created with ComfyUI already contains the workflow that was used to create it, so you can load the image back into ComfyUI and immediately get the workflow that produced it.
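That embedded data can also be read back outside the UI with a few lines of Python, e.g. with Pillow (the filename is just an example; ComfyUI typically stores its graph as JSON text chunks named "prompt" and "workflow", while A1111 uses a single "parameters" chunk):

```python
from PIL import Image

img = Image.open("ComfyUI_00001_.png")  # any ComfyUI (or A1111) output PNG

# PNG text chunks end up in img.info; this is where the workflow/parameters live.
for key, value in img.info.items():
    if isinstance(value, str):
        print(f"--- {key} ---")
        print(value[:300])  # print just the start of each chunk
```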
Only you can eventually know whether you prefer apples or oranges, despite one crowd saying apples are it and another crowd saying oranges are it. So you kinda won't know until you try both.
Also, how can I take any picture and just convert it into an AI picture? Let's say I find a real picture I like and want SD to replicate it as closely as possible, just in the style of the chosen checkpoint or even a LoRA, how do I do this?
img2img doesn't seem to work; I always get weird results no matter the denoising setting, and it's always adding stuff.
I understand that you are new to SD and excited to try all kinds of things. If MJ is Windows and DALL-E is Apple, then SD is Linux. It will take some time for you to figure everything out, so try to focus on one thing until you have a good grasp on it before you explore new workflows and concepts etc. The token merging you saw in the override settings is something you inherited from my setup; it's for speeding up the generation iterations. You can set this in settings/optimization. The VAE needs to be placed in the folder I told you, models/VAE.
Then you can select the VAE in settings. Yes, it's likely that you need to apply and restart if something isn't working correctly.
You activate and deactivate hires fix simply by pressing the drop-down arrow of the hires fix window. I find that most people don't really understand upscaling and how it works in SD. Hires fix is more than simply upscaling: it adds "hires steps", so it creates new pixels that give more detail. The same is true for the SD Upscale extension in the img2img tab. "Normal" upscaling in img2img and the Extras tab is without the SD Upscale extension; it only upscales and doesn't add new pixels, it only enlarges what is already there. The result is a larger image, but with loss of sharpness and detail.

My preferred workflow: I work on the prompt with a random seed (-1), and when my prompt is done I generate images until I find a good seed that gives a good image. Then I set that seed as my static seed by pressing the green recycle button, so it displays the specific seed. I then activate hires fix via the drop-down arrow and generate the image again, this time with hires fix on. This is all I do most of the time.

If you want to go further, you can upscale again in the img2img tab with the SD Upscale extension, but more is not always better. Even the SD Upscale extension can result in loss of detail if you already used hires fix; it's diminishing returns. I would instead recommend using Photoshop to "upscale" the second time: go to Image Size, double it, and make sure resampling is activated. This results in a larger image with the composition and details intact. You won't gain detail as with hires fix or the SD Upscale extension, but you also won't lose any detail or composition, which can happen when you upscale a second time with SD. These are rules of thumb, not set in stone; it depends on the scenario.
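The "double it in Photoshop with resampling" step can also be done with a couple of lines of Python, if you prefer. A minimal sketch with Pillow (the filename is made up):

```python
from PIL import Image

img = Image.open("00042-1234567890.png")  # an example output file

# Plain resampling: twice the resolution, but no new detail is invented,
# unlike hires fix / SD Upscale which run extra diffusion steps.
doubled = img.resize((img.width * 2, img.height * 2), Image.LANCZOS)
doubled.save("00042-1234567890_2x.png")
```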
Extra tip:
When you wish to share a prompt or generation data in this thread, it's better to simply go to the output folders and upload the PNG file with intact generation data instead of using screenshots. If you wish to refer to a specific setting, then of course a screenshot is good. With your PNG file I can simply load the data with the PNG Info tab and help you much faster that way, if the topic is troubleshooting prompts or settings.
Hang on, are you actually typing "public hair" - 'cos there's your problem!
If you are typing "pubic", try modifiers such as "(pubic hair:1.4)" and/or "(pubes:1.5)".
That being said, bushy pubes are a perennial blindspot with SD.
Try using MSPaint (or drawing program of your choice) to add a block of the desired colour in the approx size and position before then going to Inpaint and telling it to draw pubes where that block of colour is.
Tbh I don't bother with inpaint most of the time; I try to get it right from the start instead. But if you wish to use inpaint, the seed is important, the denoising is important, and so is the masked-content method, meaning whether you use "fill" or "original" etc. "Original" uses the same latent noise as the original image, while "fill" adds new noise. Most of the time it's best to use a random seed, but sometimes you want to reuse the same seed. Don't use CFG 30, it will not give a better result. Set it to 6-9; that's enough most of the time, and only rarely or in specific scenarios is it any use to go higher, and even then I would recommend not going above 16-ish. If you still don't get the result from the prompt, then something else is at fault. Upload the PNG file with intact generation data and it will be easier for us to help you further with inpainting.
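To make those knobs concrete, here's a rough inpainting sketch with the diffusers library rather than the A1111 UI (model ID, filenames and prompt are just examples, and it assumes a diffusers version whose inpaint pipeline accepts a strength argument):

```python
import torch
from PIL import Image
from diffusers import StableDiffusionInpaintPipeline

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting", torch_dtype=torch.float16
).to("cuda")

image = Image.open("original.png").convert("RGB")
mask = Image.open("mask.png").convert("L")   # white = area to repaint

result = pipe(
    prompt="pubic hair",          # describe only what belongs in the masked area
    image=image,
    mask_image=mask,
    guidance_scale=7.0,           # CFG in the 6-9 range, not 30
    strength=0.75,                # denoising strength: lower stays closer to the original
    num_inference_steps=30,
).images[0]
result.save("inpainted.png")
```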
Drawing a block like that gives SD something to modify, meaning it doesn't need to drag the detail out of nowhere, so you get results closer to what you want. You don't have to be any good at drawing; just a triangle in the colour you want is usually enough. You can also get away with a lower denoising strength, which reduces the risk of eldritch horrors.
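If you don't feel like opening MSPaint, a few lines of Pillow can slap the colour patch on for you before you send the image to inpaint (coordinates and colour are placeholders):

```python
from PIL import Image, ImageDraw

img = Image.open("original.png").convert("RGB")
draw = ImageDraw.Draw(img)

# A crude brown triangle roughly where the new detail should go;
# it just gives SD something to latch onto during inpainting.
draw.polygon([(250, 400), (310, 400), (280, 460)], fill=(92, 64, 51))
img.save("original_blocked.png")
```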
To know and understand all the terms we might use or mention, explore this awesome glossary by none other than the eminent, magnificent Jimwalrus.
After some tries, and some months, I've switched my training entirely to kohya_ss.
I'm experimenting now, and the power of LoRA training is really good, and faster: I reach about 2.8 it/s during training, about 3x faster than A1111 textual inversion.
By the way, I read all the tutorials Mr-Fox suggested and started experimenting.
So after some experiments I want to share and discuss some training techniques.
But first I'll show my process for training a shirt, this shirt:
1) Dataset
5 images are only the shirt in different colours (to make the LoRA more varied).
2 images have a subject wearing the shirt.
2) Caption
After many tests: the caption text is really important, and keeping it similar for every image gives better results. Also, using the caption text in the prompt later in SD gives you the result you want.
In my case the caption text is:
cloth1 blue shirt with a cat eating ramen on white background
and the same text, changing only the colour of the shirt, for the other 4 images.
And for the images with a subject:
cloth1 a man wearing black shirt with a cat eating ramen
cloth1 is a word I used to specify this shirt; it seems useless but I need to test more.
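As a small sketch of how those captions can be laid out on disk for kohya_ss (assuming the common setup where each image gets a .txt caption file with the same name; folder name, file names and repeat count here are made up):

```python
from pathlib import Path

# "10_cloth1" follows kohya's "<repeats>_<name>" folder convention (made-up numbers).
dataset = Path("train/10_cloth1")
dataset.mkdir(parents=True, exist_ok=True)

captions = {
    "shirt_blue.png": "cloth1 blue shirt with a cat eating ramen on white background",
    "shirt_red.png":  "cloth1 red shirt with a cat eating ramen on white background",
    "worn_black.png": "cloth1 a man wearing black shirt with a cat eating ramen",
}

# One caption .txt next to each image so the trainer can pick it up.
for image_name, caption in captions.items():
    (dataset / image_name).with_suffix(".txt").write_text(caption, encoding="utf-8")
```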
3) Error
My error was not cleaning the watermark off the original images; something to keep in mind for next time.
Why train clothes? I decided to train clothes because, reading around, I saw people say it's really difficult, so for me starting from something difficult is a way to understand training better.
What you see here is the best LoRA I've achieved, but before this I made some really bad LoRAs for testing: sometimes they ignored my prompt, sometimes they did nothing. After many experiments I found better settings that give me the result I want, but doing better is still possible.
Just experimenting and growing my training knowledge.
4) Conclusion and questions
Training on the standard SD v1.5 model is what many tutorials say to do, but it didn't give me good results. I trained all my LoRAs on my personal checkpoint and got better results. From what I've seen, pruned models and models with layer errors are not good for training.
Also, about steps and epochs: 100 steps / 1 epoch seems enough and gives me a balanced result; 2 epochs comes out overtrained.
50 steps / 2 epochs seems the same as 100 steps / 1 epoch. (Am I right?)
I tested these settings on people as well:
Since A1111 was really the standard UI for all experimental work done with SD for "so long" (those long months, I mean, the whole tech is barely over a year old), it can still be the case that some cutting edge plugins and processes come out for A1111 first, but even that is starting to change since the public adoption of ComfyUI by the official SD team when SDXL came out.
But as someone who's rarely happy with the results from the original prompt generation, and always wants to do a few more steps to get to what I'd call a final image, working with ComfyUI is just a dream come true. Comfy is faster, lighter and more versatile, and if your process requires taking each image through a dozen steps of improvements, detailing, upscaling, etc. before you're finally happy with it, Comfy will let you set up all those steps ONCE and then just generate good images every time, instead of manually moving through tabs for upscaling, inpainting, more upscaling, and on and on...
Your captioning is wrong for the images where the subject is wearing the shirt.
You need to tell the AI that the person is wearing the clothing via your trigger word, not just "a shirt"; for example (just an illustration), something like "a man wearing cloth1 with a cat eating ramen" rather than "wearing black shirt".
It's probably much better to just use images of people wearing the clothing anyway, since it's very unlikely that you'll use it in any other situation.
Clothing shouldn't be hard to train, as it's just a matter of having good images of something that stays the same. The tricky part comes in if there's a TYPE of clothing you want to train that has a lot of variation. In your case it's very simple, as you want it to learn just one simple t-shirt.
What would be trickier is training something like Xmas jumpers, where you'd have multiple variations of patterns you'd want the AI to both learn and be able to create variations of by itself.
Either your LoRA or the model used for the images has a horrible issue with oversaturating. It's very obvious in the image of the girl, but it's pretty clear in the other ones as well. Given the images below, it seems to be the model that's performing very badly.
And no, steps (as in image count x number of repeats) and epoch values aren't interchangeable in that way.
50 steps x 10 epochs is in no way the same training as 100 steps x 5 epochs.
There are a lot of things that happen at the end of each epoch: things get "reset/restarted", things get "written", etc.
Not sure how to best explain this...
The more you cycle through your images per epoch, the more detail gets picked up, and that "learned detail" gets used for further learning and more "finetuning" in the next epoch, to either correct wrongs or improve.
While it might not be 100% accurate, "loss" can be considered a representation of the difference between what the AI expected and the actual value.
So lower loss suggests the AI predicted things closer to what actually happened.
So for most things you want the majority of your total step count to come from images x repeats and a relatively low epoch count. Mostly you probably shouldn't need more than 5-10 epochs, the exception generally being styles.
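A made-up example of how the numbers multiply out (batch size 1, using the 7 training images from the shirt example above):

```python
images = 7        # 5 shirt-only + 2 worn
batch_size = 1

def total_steps(repeats: int, epochs: int) -> int:
    # In kohya_ss, each epoch walks through every image "repeats" times.
    return images * repeats * epochs // batch_size

print(total_steps(repeats=100, epochs=1))  # 700
print(total_steps(repeats=50,  epochs=2))  # 700 -> same total step count,
                                           # but not the same training: epoch boundaries differ
```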
Yes, there is something wrong with your settings. I know this because I made exactly the same mistake when I started out as well.
The "Inpaint Area" section needs to have 'whole picture' selected, not 'only masked'. I couldn't get my head round why it didn't work for a while but basically the prompt at the top of the page is applying to the whole picture and you have then told it that it just needs to apply to the masked section (in Mask Mode area). That's why you are getting a tiny picture-within-a-picture, because it is trying to create a picture of "purple haired girl with huge boobs sitting on the road flashing her vagina" in just the few pixels you have masked.
Actually what I find works quite well is to delete the prompt entirely and type in what you want to see in the masked area - in this case "shaved vagina" or something similar. You don't need the full prompt in inpaint, just the parts that apply to the section you are inpainting.
Oh man, I'd really love it if someone wants to make a Simpsons SDXL LoRA. SDXL has some knowledge of the style, but not enough. Croc has some nice high-res work that could be used.
As far as I know, "whole picture" changes the whole picture, while "only masked" restricts the change to the chosen area. I tried taking all the prompts away and writing only a single prompt, but it still never did what I wanted.
"Whole picture" does change the entire picture, but since you have selected "Inpaint masked" it only applies the change to the part that you have selected.
You can either choose Doggettx or install xformers; I personally prefer xformers. The A1111 wiki isn't consistent with info on this so I'm not 100% sure, but if you want to install xformers you need to edit your webui-user.bat and add --xformers to the command-line arguments (roughly as in the snippet below). It should then install xformers automatically, and you can select xformers in the cross attention optimization menu.
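The relevant lines of webui-user.bat end up looking roughly like this (the rest of the stock file is left out):

```
set COMMANDLINE_ARGS=--xformers

call webui.bat
```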
Also, I can see you have an earlier version of torch; I have torch 2.0.1. But that topic I'll leave to someone more capable than me, I'm just a noob.