
Your noob to expert guide in making great AI art.

5.00 star(s) 1 Vote

wal01

New Member
Jan 24, 2019
7
5
Hello OP!
Are there any good FAQs you use for the features you didn't cover in the tutorial? I really miss ADetailer and several other things from WebUI, and I want to know if there's something similar in Invoke. Inpainting and upscaling are the best things about it, but it's a pain to inpaint several faces when you make a bunch of images and want to save more than one.
 

NoTraceOfLuck

Member
Game Developer
Apr 20, 2018
310
517
Hello OP!
Are there any good FAQs you use for the features you didn't cover in the tutorial? I really miss ADetailer and several other things from WebUI, and I want to know if there's something similar in Invoke. Inpainting and upscaling are the best things about it, but it's a pain to inpaint several faces when you make a bunch of images and want to save more than one.
There is no automatic ADetailer in Invoke like in WebUI. The way to get the same functionality is to (1) inpaint mask the face, and then (2) reduce the size of the bounding box (similar to what I showed in the bounding box part of my tutorial).

It requires more steps, but you also get more flexibility over what exactly you want to upscale. If you only want to upscale the face, you can inpaint mask the face. If you want to upscale the entire character or body, you can inpaint mask the entire body. Or you can upscale things that aren't the character, such as background details.
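For the curious, what ADetailer automates is basically "find the face, build a mask, inpaint just that region". Here's a rough sketch of that idea in plain Python using OpenCV's bundled face detector; it's not an Invoke feature, and the file names and detector choice are just illustrative (the Haar cascade works poorly on anime-style faces, for instance):

Python:
import cv2
import numpy as np
from PIL import Image

# Load the render and look for faces with OpenCV's bundled Haar cascade.
img = cv2.imread("render.png")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
)
faces = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

# Build a white-on-black mask over each detected face; this is the mask you
# would otherwise paint by hand in Invoke before the inpainting pass.
mask = np.zeros(gray.shape, dtype=np.uint8)
for (x, y, w, h) in faces:
    cv2.rectangle(mask, (x, y), (x + w, y + h), 255, thickness=-1)

Image.fromarray(mask).save("face_mask.png")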
 

osanaiko

Engaged Member
Modder
Jul 4, 2017
3,067
5,869
quick question, can I render 4k images? Invoke only lets me do 1536x1536.
Yes, but you have to type the numbers in yourself; the sliders don't go that high.
And there's a good reason they are limited: diffusion models are only trained on source images of a certain size. To create output larger than that, the wrapper scripts need to make repeated separate calls and then combine the outputs. This often leads to weird artifacts at the small scale along the seams, and also weirdness in the overall image composition.

A better way to get a 4096x4096 image is to create it at 1024x1024 and then use a separate AI upscaler pass to increase the resolution while keeping high detail. Not sure about Invoke, but Automatic1111 and Comfy both have upscaling built into their base functionality.
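To make the two-pass idea concrete, here's a rough sketch using the diffusers library rather than any of those UIs. The model name, sizes and strength value are illustrative assumptions, and a very large final resolution still needs plenty of VRAM or a tiled upscaler:

Python:
import torch
from PIL import Image
from diffusers import StableDiffusionXLPipeline, StableDiffusionXLImg2ImgPipeline

prompt = "a castle on a cliff at sunset, highly detailed"

# Pass 1: generate at the resolution the model was actually trained for.
base = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")
img_1024 = base(prompt=prompt, height=1024, width=1024).images[0]

# Pass 2: naive resize, then a gentle img2img pass to re-add fine detail.
img_big = img_1024.resize((2048, 2048), Image.LANCZOS)
refiner = StableDiffusionXLImg2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")
img_2048 = refiner(prompt=prompt, image=img_big, strength=0.3).images[0]
img_2048.save("upscaled_2048.png")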
 

FemdomWifeGame

Active Member
Game Developer
Jan 24, 2021
893
2,281
Hello,

I wanted to thank you for this wonderful tutorial that got me started. In just a couple of days, I got all the info I needed.

As a thank-you, I wanted to give back a bit of my own knowledge. Since AI is only good if you can get consistent results, I almost immediately started toying with LoRAs and creating custom characters.

So, about creating your own LoRAs and your own characters, here is my take.

There is a thing called Invoke training:


It's the counterpart of Invoke, dedicated to training your own LoRAs so you can then use them in Invoke.

I'm not going to walk everybody through everything step by step. This explanation is not enough on its own, and you'll need to watch some basic tutorials about Invoke training to get the most out of what I'm going to explain below.

The tool is easy to use: install Python on your computer, run it from the command line, and very quickly you get your own Invoke training UI:

1750777589394.png

It requires a dataset, and I think that is basically the most important part of creating LoRAs.

You'll need a set of images. You can generate them via Invoke. My tip is to lean heavily on ChatGPT to get a working prompt for your character; don't bother trying to deal with the tags yourself.

If you don't like the result, just iterate with ChatGPT until you have a positive and a negative prompt that gets you a consistent character to your liking.

==== 1750778691735.png
Example of Invoke gallery generated via a ChatGPT prompt.

====

In your Invoke training UI, you'll need to:

- Give the path to your model (the one you actually use in Invoke).
- Give the path to your dataset config file (see below)
- Give some meta parameters for the model.

Now, when training a model, it will try to learn the key features of your dataset (here, our character), but it needs information about how to isolate that character from the noise (the noise being the background, the clothes if the character can change outfits, the pose, etc.).

To do that, you need to label your dataset accordingly. Here is a Python script I wrote that does that for you: you create folders whose names are the tags for the images inside them, you fill those folders with images, change the path in the script, and run it.
It will create a "dataset.jsonl" file, whose path you can paste into the Invoke training UI.

====

1750778455131.png
Example of folder structure.

====

1750778563833.png
Example of folder content.

====

Python:
import os
import json

# Set the folder you want to scan
root_folder_path = "<yourpath>/invoke-training/datasets/<yourDatasetRootFolder>"  # Change this to your actual folder path
output_file = "dataset.jsonl"
character_tag = "<yourCharacterTag>"  # The keyword you will use to call your character in Invoke

file_entries = []

# Each sub-folder name doubles as the caption tags for the images inside it
for root_folder_entry in os.listdir(root_folder_path):
    entry_path = os.path.join(root_folder_path, root_folder_entry)
    print(root_folder_entry)

    if os.path.isdir(entry_path):
        # List all PNG files in the folder
        for filename in os.listdir(entry_path):
            file_path = os.path.join(entry_path, filename)
            if os.path.isfile(file_path) and filename.endswith(".png"):
                # Image path is relative to the dataset root; caption is "<character tag>, <folder tags>"
                entry = {
                    "image": root_folder_entry + "/" + filename,
                    "text": character_tag + ", " + root_folder_entry,
                    "mask": None,
                }
                file_entries.append(json.dumps(entry) + "\n")

# Write one JSON object per line (JSONL)
with open(os.path.join(root_folder_path, output_file), "w", encoding="utf-8") as f:
    for entry in file_entries:
        f.write(entry)

print(f"{len(file_entries)} files processed and written to {output_file}")
The script.
Change the path and the character tag ("<yourCharacterTag>") at the top of the script. You'll use that tag as the keyword to identify your character in Invoke later. Each line of the generated dataset.jsonl will look roughly like {"image": "<folder>/<image>.png", "text": "<yourCharacterTag>, <folder tags>", "mask": null}.

You'll need about 20+ images, and you need those images to show your character in multiple setups. The reason is because if you have only 1 setup (only standing, only with a grey background, etc.) then that setup is not isolable from the character itself. We want to make sure that the only our key features are present in every images, in our case: the character face, the character colors and the character shape.

Everything else changes from image to image.
Everything else is captioned with tags (very important).

Execute the Python script.
Give the path to the created dataset.jsonl to your Invoke training UI.
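Optionally, before you start training, you can sanity-check that every image referenced in dataset.jsonl actually exists. Here's a small sketch (the path is the same placeholder as in the script above):

Python:
import json
import os

dataset_root = "<yourpath>/invoke-training/datasets/<yourDatasetRootFolder>"

missing = 0
with open(os.path.join(dataset_root, "dataset.jsonl"), encoding="utf-8") as f:
    for line in f:
        entry = json.loads(line)
        # Image paths in the file are relative to the dataset root folder.
        if not os.path.isfile(os.path.join(dataset_root, entry["image"])):
            print("missing:", entry["image"])
            missing += 1

print(f"check done, {missing} missing files")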

About the meta-parameters:

- AdamW with a learning rate of 0.0005.
People will bullshit about how the learning rate depends on your dataset and how you need to test many values. In our case, we don't care: it's a LoRA dataset of about 20-40 images, and 0.0005 will do for you.

- The LoRA rank dim: about 32. It's basically the size of the model. Too small and it won't abstract your character correctly; too large and it will overfit the data.
Same thing, don't bother looking for the perfect value. We have a well-defined task (learn a character from 20-40 images) and 32 will do for you. You can lower it a bit if you want, it works down to 16, but I prefer higher values.

- The number of epochs is not important; put 100, but create a checkpoint every epoch. You'll only use the latest, but by checkpointing every epoch you can stop the training whenever you feel like you're done.

- Don't bother with validation; put a high number for the number of epochs per validation. Validation won't work well on a LoRA like this. I'll explain why below.

- Other parameters are irrelevant for us.

Once you're set, you can push the training button. The logs are in the console you used to start the Invoke training UI (it took me a while to notice that, lol).
Same thing, check the basic tutorials about it. Essentially, it will create checkpoints in the "output" folder.

1750778945046.png

You can use those files as LoRAs in Invoke, just as you would any other LoRA.

Okay, now the big deal.
How do you actually use that LoRA?

The LoRA is not going to learn your character perfectly. It's a mini-model that you trained locally; as a standalone, it's shit.

====
Example of image relying only on the LoRAs.
f3435587-55a0-474f-aac1-701c52f46ac6.png

So now you'll say: but that's not the initial character.
No, it's not. LoRAs are dumb shits, and getting the right dataset requires a lot of trial and error. This checkpoint is also under-fitted; the longer you leave your model training, the better your chances of getting a good checkpoint.

Now, for this post, I'll keep this checkpoint, since I didn't have time to fine-tune more. It's not a big deal.

You can adjust the character by adding some of the original tags. The major problems are usually the hair and eye colors: since they don't take up much space in the image, the model's loss on them is negligible and it doesn't learn them well.

The colors are bad too, and the contrast is low. That's because the model is averaging the colors to minimize the overall error across all images. It's hard and slow to do much better if you're a newbie, but the point is: it doesn't matter.

We'll use the base model to fix all of that for us.

====

So what's the secret to making it work? It's to layer your LoRA into an already generated image.
The image generated with your full model will ground the quality.

So first:
  1. Generate a stand-alone environment image.
  2. Using an additional raster layer, paint a big spot of skin color where you want your character to be.
  3. Generate a character WITHOUT your LoRA, but using tags similar to the ones you used to generate the character, both negative and positive prompt (girl, no extra limb, super res, whatever).

    You can create two regional hints at the same spot to get both a positive and a negative prompt; one regional hint can't hold both, and I don't know why they did it that way.
    Technically you can create the character and the environment all in one go, but I prefer to do it in two steps, especially if I want multiple characters. But that's really up to you in the end.
  4. Now you have a high-quality setup with an environment and a character doing what you want, placed how you want. But it's not the correct character. Damn.
  5. So here is the trick: you put an inpaint mask on the character and a regional area with your character keyword. Don't redo the whole character at once! First do only the face, for example, then only the clothes. Working step by step will help you keep the art and quality consistent.

    Each time you validate a step, keep the result as a raster layer.
    Important trick: put a very, very low weight on your LoRA, like 0.2, and redo the generation several times. The generated part will tend toward your character, but the quality will mostly come from the main model instead of your dumb-shit LoRA-level quality. (See the code sketch right after this list for the same idea expressed outside the Invoke UI.)
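Here's a minimal sketch of that low-weight-LoRA trick using the diffusers library, for anyone who wants to see it outside the Invoke UI. The model, the LoRA file name, the mask image and the 0.2 scale are all illustrative assumptions, not what Invoke does internally:

Python:
import torch
from PIL import Image
from diffusers import AutoPipelineForInpainting

pipe = AutoPipelineForInpainting.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

# Load the character LoRA trained above (path is a placeholder).
pipe.load_lora_weights("output/my_character_lora.safetensors")

scene = Image.open("scene_with_generic_girl.png")   # the grounded image
face_mask = Image.open("face_mask.png")             # white = repaint this area

# Low LoRA scale: nudge only the masked region toward the character while the
# base model carries most of the quality. Repeat, feeding the best result back
# in as the new scene and raising the scale a little each time.
result = pipe(
    prompt="<yourCharacterTag>, face, detailed",
    image=scene,
    mask_image=face_mask,
    strength=0.5,
    cross_attention_kwargs={"scale": 0.2},
).images[0]
result.save("scene_step1.png")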
Here is an example of the full workflow:


Environment only, LoRAs is off.

1750782500083.png

Grounding for a character lying on the bed. The LoRA is still off.

There are:
1 mask painted only on the bed.
1 positive prompt.
1 negative prompt.
1 raster layer for the body shape on the bed.

1750788918546.png

I tweaked the prompt a bit and emptied the environment prompt, as it was creating too many tags for the model to weight correctly. In the end I just used the main prompt for everything, but regional prompts would have achieved the same result.

1750788966134.png

OK, so now we have a positioned character; we just want to swap her for our LoRA character. We'll reduce the mask to her face, and we'll put a regional hint on her head with our character tag.
I think the layer order is important, so put your new regional layer above the one describing the overall character setup.

I replayed the raster step a bit because the base body was too thin (a result of my first stick drawing). Re-generating with the new girl as the raster layer improves her over time.

1750789036452.png

Now we can enable our LoRA (finally), with a weight of 0.2-0.7. We want the image to move toward your character, slowly but steadily, part by part.

This is a whole game of playing with the LoRA weight, as well as with the CFG scale. To be honest, just try values: sometimes you'll want to push hard toward your character, sometimes you'll only want to nudge it a bit. It's all up to you. I did not need to touch the prompt itself, except for hair and eye color; it's mostly about the LoRA now.

1750789133982.png
1750789159901.png
1750789171572.png
1750789182606.png

Once we are there, we can move to the other body parts.

1750789226360.png
1750789235634.png

Legs:
1750789255073.png
1750789266060.png

You can reinforce the character by adding some of the original tags (relying more on the base model), and you can discard the bad generations; you don't have to keep them all.

Now the cleanup of the artefacts: we'll mask the area around the character, touching the character as little as possible, and redraw it using the initial environment prompt.

1750789377594.png

1750789565954.png
Final result.

So, it's not perfect by any means, firstly because I made these only for this post and I didn't want to regenerate a hundred times until it's perfect, and also because I'm using an underfitted checkpoint.

Also, the character is small, and my training set didn't have any "far away" examples, so the model doesn't really know what she is supposed to look like at that scale. That can be improved by adding more examples to the dataset.

But the goal was to explain the workflow I developed, from OC idea to rendering the character into a concept, so here we are.

Going further, here are things I still need to look into but that I know can help:

- Learning the various noise schedulers (samplers). The one we usually use destroys the image completely and regenerates it almost from scratch, which is not always ideal for small adjustments. I believe some schedulers may be less brutal and help preserve more of the original parts in the new image.
- Learning about CFG scale. Apparently it controls how strongly the model adheres to the prompt versus doing its own thing. This may help to force the initial character setup within an already existing environment. (A rough sketch of both knobs follows below.)
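For reference, here's roughly what those two knobs look like in diffusers terms (Invoke exposes the same concepts in its UI; the model name and values here are illustrative):

Python:
import torch
from diffusers import StableDiffusionXLPipeline, EulerAncestralDiscreteScheduler

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

# 1) The "noise generator" is the scheduler/sampler; swapping it changes how
#    each step removes noise, and some are gentler for small adjustments.
pipe.scheduler = EulerAncestralDiscreteScheduler.from_config(pipe.scheduler.config)

# 2) CFG scale is guidance_scale: higher values force stricter prompt
#    adherence, lower values let the model improvise more.
image = pipe("<yourCharacterTag>, lying on a bed", guidance_scale=7.0).images[0]
image.save("cfg_example.png")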

edit:

I trained the checkpoint overnight, so now, still applying exactly what I explained above about training the LoRA, here is the result of an invoke using only the LoRA:

1750834968215.png

So from here on, that character can safely be re-applied in actual invokes using the layered method explained above.
 
Last edited:

NoTraceOfLuck

Member
Game Developer
Apr 20, 2018
310
517
Great post!

Something you might want to try is resizing the bounding box so that it only covers the area around the character. This helps quite a bit to improve the quality of "far away" characters.

I had a similar scene. This is what my image looked like without and with changing the bounding box size. Lowering the size gives you some extra detail on the character:


Original image (left) / Bounding Box Adjusted (right)
1750793921211.png 1750793960190.png

The details on how to do this are in this part of my original post: https://f95zone.to/threads/your-noob-to-expert-guide-in-making-great-ai-art.256631/post-17104572
 

FemdomWifeGame

Active Member
Game Developer
Jan 24, 2021
893
2,281
Great post!

Something you might want to try is resizing the bounding box so that it only covers the area around the character. This helps quite a bit to improve the quality of "far away" characters.

I had a similar scene. This is what my image looked like without and with changing the bounding box size. Lowering the size gives you some extra detail on the character:


Original image (left) / Bounding Box Adjusted (right)
View attachment 4976578 View attachment 4976580

The details on how to do this are in this part of my original post: https://f95zone.to/threads/your-noob-to-expert-guide-in-making-great-ai-art.256631/post-17104572
Ah, neat trick. I didn't go that far; basically, for each part of your post I ended up trying things by myself. So when I reached the LoRAs, well... it went a bit down the rabbit hole from that point onward haha.

I need to read the rest now :p
 