Your noob to expert guide in making great AI art.


wal01

New Member
Jan 24, 2019
Hello OP!
Are there some good FAQs you use for features you didn't cover in the tutorial? I really miss ADetailer and several other things from WebUI, and I want to know if there's something similar in Invoke. Inpainting and upscaling are the best things in it, but it's a pain to inpaint several faces when you make a bunch of images and want to save more than one.
 

NoTraceOfLuck

Member
Game Developer
Apr 20, 2018
There is no automatic ADetailer in Invoke like in WebUI. The way to get the same functionality is to 1) inpaint mask the face, and then 2) reduce the size of the bounding box (similar to what I did in my tutorial on the bounding box).

It requires more steps, but you also get more flexibility over what exactly you want to upscale. If you want to upscale only the face, you can inpaint mask the face. If you want to upscale the entire character or body, you can inpaint mask the entire body. Or you can upscale things that aren't the character, such as background details.
 

osanaiko

Engaged Member
Modder
Jul 4, 2017
Quick question: can I render 4K images? Invoke only lets me do 1536x1536.
Yes, but you have to type the numbers in yourself; the sliders don't go that high.
And there's a good reason why they are limited: the diffusion models are only trained on source images of a certain size. To create output larger than that, the wrapper scripts need to make repeated separate calls and then combine the outputs. This often leads to weird artifacts, both at the small scale along the seams and in the overall image composition.

A better way to get a 4096x4096 image is to create it at 1024x1024 and then use a separate AI upscaler pass to increase the resolution while still keeping high detail. Not sure about Invoke, but Automatic1111 and Comfy both have upscaling built into their base functionality.
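If you ever want to do that upscale pass outside of a UI, here is a rough sketch of the idea using the diffusers library and Stability's x4 upscaler. This is my own example, not something Invoke or A1111 ships, and large inputs need a lot of VRAM, so you may need tiling or a smaller source image:

Python:
import torch
from PIL import Image
from diffusers import StableDiffusionUpscalePipeline

# Load Stability's 4x upscaler (needs a CUDA GPU to be practical).
pipe = StableDiffusionUpscalePipeline.from_pretrained(
    "stabilityai/stable-diffusion-x4-upscaler", torch_dtype=torch.float16
).to("cuda")

# Generate your image at the model's native resolution first (e.g. 1024x1024),
# then run it through the upscaler with a short prompt describing the content.
low_res = Image.open("my_1024_render.png").convert("RGB")
upscaled = pipe(prompt="anime girl on a beach, detailed", image=low_res).images[0]
upscaled.save("my_4096_render.png")  # 4x in each dimension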
 

FemdomWifeGame

Active Member
Game Developer
Jan 24, 2021
Hello,

I wanted to thank you for this wonderful tutorial that got me started. In just a couple of days, I got all the info I needed.

As a thanks, I just wanted to repay you with a bit of my knowledge. Since AI is only good if you can get consistent results, I almost immediately started toying with LoRAs and creating custom characters.

So about creating your own LoRAs and your own characters, here is my take.

There is a thing called Invoke training:


It's the counterpart of Invoke, dedicated to training your own LoRAs, which you can then use in Invoke.

I'm not going to walk everybody step by step through everything. This explanation alone is not enough, and you'll need to watch some basic tutorials about Invoke training to get the most out of what I'm going to explain below.

The thing is easy to use: install Python on your computer, use the command line, and very quickly you get your own Invoke training UI:

1750777589394.png

It requires a dataset, and I think that is basically the most important part of everything there is to know about creating LoRAs.

You'll need a set of images. You can generate them via Invoke. My tip is to lean heavily on ChatGPT to get a working prompt for your character; don't bother trying to deal with the tags yourself.

If you don't like the result, just iterate with ChatGPT until you have a positive and negative prompt that gets you a consistent character to your liking.

==== 1750778691735.png
Example of Invoke gallery generated via a ChatGPT prompt.

====

In your Invoke training UI, you'll need to:

- Give the path to your model (the one you actually use in Invoke).
- Give the path to your dataset config file (see below).
- Give some meta-parameters for the model.

Now, when training a model, it will try to learn the key features of your dataset (here, our character), but it will need info about how to isolate that character from the noise (the noise being: the background, the clothes if the character can change outfits, the position, etc.).

To do that, you need to label your dataset accordingly. Here is a Python script I wrote that does that for you. You create folders whose names are the tags for the images in the folder, you fill those folders with images, change the path in the script, and run the script.
It will create a "dataset.jsonl" file, whose path you can paste into the Invoke training UI.

====

1750778455131.png
Example of folder structure.

====

1750778563833.png
Example of folder content.

====

Python:
import os
import json

# Set the folder you want to scan
root_folder_path = "<yourpath>/invoke-training/datasets/<yourDatasetRootFolder>"  # Change this to your actual folder path
output_file = "dataset.jsonl"
character_tag = "<yourCharacterTag>"  # The keyword you'll use to call your character in Invoke later

file_entries = []

for root_folder_entry in os.listdir(root_folder_path):
    entry_path = os.path.join(root_folder_path, root_folder_entry)
    print(root_folder_entry)

    if os.path.isdir(entry_path):
        # Each sub-folder name is used as the caption tags for the images it contains
        for filename in os.listdir(entry_path):
            file_path = os.path.join(entry_path, filename)
            if os.path.isfile(file_path) and filename.endswith(".png"):
                # One JSONL line per image: image path relative to the dataset root,
                # caption = character tag + folder tags, no mask.
                entry = {
                    "image": root_folder_entry + "/" + filename,
                    "text": "%s, %s" % (character_tag, root_folder_entry),
                    "mask": None,
                }
                file_entries.append(json.dumps(entry) + "\n")

# Write the dataset.jsonl next to the image folders
with open(os.path.join(root_folder_path, output_file), "w", encoding="utf-8") as f:
    f.writelines(file_entries)

print(f"{len(file_entries)} files processed and written to {output_file}")
The script.
Change the path and the character tag: "<yourCharacterTag>". You'll use that tag as the keyword to identify your character in Invoke later.

You'll need about 20+ images, and those images need to show your character in multiple setups. The reason is that if you have only one setup (only standing, only with a grey background, etc.), then that setup is not separable from the character itself. We want to make sure that only our key features are present in every image; in our case: the character's face, the character's colors, and the character's shape.

Everything else changes from image to image.
Everything else is captioned with tags (very important).

Execute the Python script.
Give the path to the created dataset.jsonl to your Invoke training UI.
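If you want to sanity-check the file before you point the trainer at it, a quick read-back like this (plain Python, using the same placeholder path as the script above) shows exactly what the trainer will see:

Python:
import json

# Placeholder path: same dataset root you used in the script above.
dataset_path = "<yourpath>/invoke-training/datasets/<yourDatasetRootFolder>/dataset.jsonl"

with open(dataset_path, "r", encoding="utf-8") as f:
    entries = [json.loads(line) for line in f if line.strip()]

print(f"{len(entries)} entries found")
for entry in entries[:3]:
    # Each line should pair a relative image path with your character tag + folder tags.
    print(entry["image"], "->", entry["text"])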

About the meta-parameters:

- AdamW with a learning rate of 0.0005.
People will bullshit about how the learning rate is dependent on your dataset and how you need to test many values. In our case, we don't care: it's a LoRA dataset of about 20-40 images, and 0.0005 will do for you.

- The LoRA rank dim: about 32. It's basically the size of the model. Too small and it won't abstract your character correctly; too large and it will overfit the data.
Same thing, don't bother looking for the perfect value; we have a consistent task (learn a character out of 20-40 images) and 32 will do for you. You can lower it a bit if you want, it will work down to 16, but I prefer higher values (see the sketch after this list for what the rank means in terms of model size).

- The number of epochs is not important; put 100, but create a checkpoint every epoch. You'll only use the latest, but by creating a checkpoint every epoch, you can stop the training whenever you feel like you're done.

- Don't bother with validation; put a high number for the number of epochs per validation. Validation won't work on a LoRA, and I'll explain below why.

- Other parameters are irrelevant for us.
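About that rank dim, here is a back-of-the-envelope way to see what it actually controls (my own illustration, not anything from the Invoke training docs): a LoRA adds two small matrices of rank r per targeted layer, so its size grows linearly with the rank.

Python:
def lora_params_per_layer(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters a LoRA adds to one linear layer of shape (d_out, d_in)."""
    # LoRA learns two low-rank factors: A (rank x d_in) and B (d_out x rank).
    return rank * d_in + d_out * rank

# Illustration with a 1280x1280 projection (a common size in SD-era attention blocks):
for rank in (8, 16, 32, 64):
    print(f"rank {rank:>2}: {lora_params_per_layer(1280, 1280, rank):,} params per layer")
# Doubling the rank doubles the LoRA's size: more room to memorize the character,
# but also more room to overfit a 20-40 image dataset.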

Once you're set, you can push the training button. The logs are in the console that you used to start the Invoke training UI (took me a while to notice it, lol).
Same thing, check the basic tutorials about it. Essentially it will create checkpoints in the "output" folder.

1750778945046.png

You can use those files as LoRAs in Invoke, as you would any other LoRA.

Okay, now the big deal.
How do you actually use that LoRA?

A LoRA is not going to learn your character perfectly. It's a mini-model that you trained locally. As a standalone, it's shit.

====
Example of image relying only on the LoRAs.
f3435587-55a0-474f-aac1-701c52f46ac6.png

So now you'll say: but that's not the initial character.
No, it's not. LoRAs are dumb shits, and getting the right dataset requires long trial and error. This checkpoint is also under-fitted; the longer you let the training run, the better your chances of getting a good checkpoint.

Now, for this post, I'll keep this checkpoint, since I didn't have time to finetune more. It's not a big deal.

You can adjust the character by adding some of the original tags. The major problems are usually the hair and eye colors: since they don't take up a lot of space in the image, the model's loss on them is negligible and it doesn't learn them well.

The colors are bad too, and the contrast is low. That's because the model is averaging the colors to minimize the overall error across all images. It's hard and slow to do much better if you're a newbie, but here's the thing: it's not important.

We'll use the base model to fix all of that for us.

====

So what's the secret to making it work? It's to layer your LoRA into an already generated image.
The image generated with your full model will ground the quality.

So first:
  1. Generate a stand-alone environment image.
  2. Using an additional raster layer, paint a big spot of skin color where you want your character to be.
  3. Generate a character WITHOUT your LoRA, but using tags similar to the ones you used to generate the character, both negative prompt and positive prompt (girl, no extra limb, super res, whatever).

    You can create a regional hint twice at the same spot to get both a positive and a negative prompt. One regional hint can't have both a positive and a negative prompt; I don't know why they did it that way.
    Technically you can create the character with the environment all in one go, but I prefer to do it in two steps, especially if I want multiple characters. But that's really up to you in the end.
  4. Now you have a high-quality setup with an environment and a character doing what you want, placed how you want. But it's not the correct character. Damn.
  5. So here is the trick: you put an inpaint mask on the character and a regional area with your character keyword. Don't redo the whole character at once! First do only the face, for example. Then only the clothes. Going step by step will help you keep the art and quality consistent.

    Each time you validate a step, keep the result as a raster layer.
    Important trick: put a very, very low weight on your LoRA, like 0.2, and redo the generation several times. The generated part will tend toward your character, but the quality will mostly come from the main model instead of your dumb-shit LoRA-level quality.
Here is an example of the full workflow:


Environment only, the LoRA is off.

1750782500083.png

Grounding for a character lying in the bed. The LoRA is still off.

There are:
1 mask to paint only on the bed.
1 positive prompt.
1 negative prompt.
1 raster layer for the body shape on the bed.

1750788918546.png

I tweaked the prompt a bit and emptied the environment prompt, as it was creating too many tags for the model to weight correctly. As a result, I just used the main prompt for everything, but regionals would have achieved the same result.

1750788966134.png

OK, so now we have a positioned character; we just want to swap her with our LoRA. We'll reduce the mask to her face, and we'll put a regional hint on her head with our character tag.
I think the layer order is important, so put your new regional layer above the one describing the overall character setup.

I also replayed a bit with the raster layer because the base figure was too thin, a side effect of my first stick drawing. Replaying with the new girl as the raster layer improves her over time.

1750789036452.png

Now we can finally enable our LoRA, with a weight of 0.2-0.7. We want the image to move toward your character, slowly but steadily, part by part.

This is a whole game of playing with the weight of the LoRA, as well as with the CFG scale. To be honest, just try values: sometimes you'll want to push hard toward your character, sometimes you'll just want to nudge a bit. It's all up to you. I did not need to touch the prompt itself, except for hair and eye color. It's mostly about the LoRA now.
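Just to illustrate the "low LoRA weight" idea in code, here is roughly what it looks like in the diffusers library (placeholder paths; this is not Invoke's API, and the exact way to pass the scale can vary between diffusers versions):

Python:
import torch
from diffusers import StableDiffusionXLPipeline

# Placeholder paths: your base checkpoint and the LoRA file from the "output" folder.
pipe = StableDiffusionXLPipeline.from_single_file(
    "path/to/your_base_model.safetensors", torch_dtype=torch.float16
).to("cuda")
pipe.load_lora_weights("path/to/your_character_lora.safetensors")

# A low LoRA scale (~0.2-0.7) nudges the output toward the character while
# letting the base model carry most of the image quality.
image = pipe(
    prompt="<yourCharacterTag>, girl, lying on a bed, best quality",
    negative_prompt="bad quality, bad anatomy",
    cross_attention_kwargs={"scale": 0.3},  # roughly equivalent to Invoke's LoRA weight
).images[0]
image.save("character_test.png")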

1750789133982.png
1750789159901.png
1750789171572.png
1750789182606.png

Once we are there, we can move to the other body parts.

1750789226360.png
1750789235634.png

Legs:
1750789255073.png
1750789266060.png

You can reinforce the character by adding some of the original tags (relying more on the base model), and you can discard the bad generations; you don't have to keep them all.

Now the cleanup of the artefacts: we'll mask all the area around the character, touching the character as little as possible, and redraw using the initial environment prompt.

1750789377594.png

1750789565954.png
Final result.

So, it's not perfect by any means, firstly because I did these only for this post and I don't want to regenerate 100 times until it's perfect, and because I'm using an under-fitted checkpoint.

Also, the character is small, and my training set didn't have "far away" examples, meaning the model doesn't really know what she is supposed to look like at that scale. That can be improved by adding more examples to the dataset.

But the goal was to explain the workflow I developed, from OC idea to rendering the character into a concept, so here we are.

Going further, things I still need to look into but that I know can help:

- Learning about the various noise generators. The one we usually use destroys the image completely and regenerates it almost from scratch, which is not always ideal for small adjustments. I believe some generators may be less brutal and help preserve more of the original parts in the new image.
- Learning about CFG scale. Apparently it controls how strongly the model adheres to the prompt versus doing its own thing. This may help to force the initial character setting within an already existing environment.
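For that last point, in script-based tools the CFG scale usually shows up as a guidance_scale parameter: higher values force the model to follow the prompt, lower values let it improvise. A small diffusers-flavored sketch (placeholder path, not Invoke's own setting names):

Python:
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_single_file(
    "path/to/your_base_model.safetensors", torch_dtype=torch.float16
).to("cuda")

# Compare a few CFG values: low = more freedom, high = stricter prompt adherence
# (and often harsher, more saturated results). Typical values sit around 4-8.
for cfg in (3.0, 7.0, 12.0):
    image = pipe(
        prompt="girl lying on a bed, cozy bedroom, best quality",
        negative_prompt="bad quality, bad anatomy",
        guidance_scale=cfg,
    ).images[0]
    image.save(f"cfg_{cfg}.png")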

edit:

I trained the checkpoint overnight, so now, still applying exactly what I explained about training the LoRA, here is the result of an Invoke using only the LoRA:

1750834968215.png

So from here on, that character can safely be re-applied in actual invokes using the layered method explained above.
 

NoTraceOfLuck

Member
Game Developer
Apr 20, 2018
Great post!

Something you might want to try is resizing the bounding box so that it only covers the area around the character. This helps quite a bit to improve the quality of "far away" characters.

I had a similar scene. This is what my image looked like without and with changing the bounding box size. Lowering the size gives you some extra detail on the character:


Original image ------------------------------------------------------------------------- Bounding Box Adjusted
1750793921211.png 1750793960190.png

The details on how to do this are in this part of my original post: https://f95zone.to/threads/your-noob-to-expert-guide-in-making-great-ai-art.256631/post-17104572
 

FemdomWifeGame

Active Member
Game Developer
Jan 24, 2021
Ah, neat trick. I didn't go that far; basically, for each part of your post, I ended up experimenting by myself. So when I reached the LoRAs, well... it went a bit down the rabbit hole from that point onward, haha.

I need to read the rest now :p
 

FemdomWifeGame

Active Member
Game Developer
Jan 24, 2021
OK so I've been toying around a bit.

The BBox tool is indeed very helpful for details, hands, eyes, etc.

Another tool I discovered that is amazing is the noise amount, here:

1750858282444.png

Then in your inpaint mask:

1750858294550.png

This thing allows you to say "don't reset what's inside the mask completely by replacing it with 100% noise; instead, add only 37% noise".

So you can adjust an image 5% of noise at a time, or 50%, etc., depending on how much you expect the image to be modified. This is excellent for clothes, hair, iterating on positions, etc.

Notably, it helps a lot with the monochrome patches of color from a hand-painted raster layer. Just iterate with 50% noise at a time; that's how I did the legs' positions below.
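For reference, outside of Invoke this same knob is usually called "denoising strength"; here is a rough diffusers sketch of the idea (placeholder paths, not the Invoke API):

Python:
import torch
from PIL import Image
from diffusers import StableDiffusionXLImg2ImgPipeline

pipe = StableDiffusionXLImg2ImgPipeline.from_single_file(
    "path/to/your_base_model.safetensors", torch_dtype=torch.float16
).to("cuda")

init_image = Image.open("current_render.png").convert("RGB")

# strength is the fraction of noise added before re-denoising:
# 0.05 barely nudges the image, 0.5 reworks it heavily, 1.0 starts from scratch.
result = pipe(
    prompt="girl lying on a bed, adjusted leg position, best quality",
    image=init_image,
    strength=0.37,
).images[0]
result.save("adjusted_render.png")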

Here are some results:

1750864867723.png

There are problems, things I could clean up. I don't really care about going too far at the moment; it's just to learn the main concepts.
 

NoTraceOfLuck

Member
Game Developer
Apr 20, 2018
Those are some very impressive images! Almost doesn't look like AI!
 

FemdomWifeGame

Active Member
Game Developer
Jan 24, 2021
Thanks! To be honest, it's all thanks to your tutorial. I would never have been able to figure it out on my own.

Here is the finalized version. I'll use it in my game as side-content.

mia.png

I cleaned up a lot of inconsistencies. I may have missed some still, but honestly, I think it's good enough.
 

Kameronn77x

Newbie
Jul 6, 2020
I'm literally brand new to all of this, but the two of you just blew my mind with both how complicated and how simple this is when it's explained with normal wording and analogies. Like, I just saw the guide and clicked out of curiosity, but I think I'll play with it this weekend now. Best of luck to both of you on your games!
 

goblingodxxx

New Member
Apr 14, 2025
Hey all, this is a guide I have wanted to make for a long time. I have learned so much about AI art while creating my game and figured it was time to share the knowledge.

Disclaimer: This guide is OPINIONATED! That means, this is how I make AI art. This guide is not "the best way to make AI art, period." There are many MANY AI tools out there and this guide covers only a very small number of them. My process is not perfect.

Hardware Requirements:
  • The most important spec in your PC when creating AI art is your GPU's VRAM. It really doesn't matter how old your GPU is (though newer ones will be faster), the limiting factor on what you can and cannot do with AI is almost always going to be your GPU's VRAM.
  • This guide may work with as little as 4gb of VRAM, but in general, it is recommended that you have at least 12gb, with 16gb being preferred.

No hardware? No problem:
  • If you do not have a good GPU, or just want to try some things out before buying one, the primary tool that I use in this tutorial offers a paid online service. It is the exact same tool; it just runs on the website and costs money per month.
  • You can check it out here:

GPU Buying Guide:
  • Buying Nvidia will be the most headache free way to generate AI art, though it is generally possible to make things work on AMD cards with some effort. This guide will not cover any steps needed to make things work on AMD GPUs, though the tools I use all claim to support AMD as well.

On a tight budget: Used RTX 4060 Ti (16gb VRAM). This card is modern, reasonably fast, and has 16gb of VRAM.
Middle of the road: RTX 5070 Ti (16gb VRAM). This has the 16gb of VRAM, but will be significantly faster than a 4060 Ti.
High VRAM on a budget: Used RTX 3090 (24gb VRAM). If you want 24gb of VRAM to unlock higher resolutions and the possibility of video generation, the RTX 3090 is the most reasonable option.
Maximum power: RTX 5090 (32gb VRAM). If you have deep pockets, the RTX 5090 has the most VRAM of any consumer card and is much faster than the RTX 4090.

The RTX 4090 is a great card, but prices are extremely high right now. If you can find a deal, that's another good buy.



Installation and Setup:
  • The tool I will use in this tutorial is called Invoke. It has both a paid online version, and a free local version that runs on your computer. I will be using the local version, but everything in this tutorial also works in the online version.
    • Website:
  • These steps specifically are how to install the local version. If you are using the online version, you can skip all of these steps.

  1. Download the latest version of Invoke from here:
    1. View attachment 4879763
  2. Run the file you downloaded. It will ask you questions about your hardware and where to install. Continue until it is installed successfully
  3. If you have a low-VRAM GPU (8gb or less), follow these additional steps to greatly improve speed:
  4. Click Launch


Now, you will get a window like this:

View attachment 4879783


Understanding Models

Now, the most important part of AI generation: selecting a model. What is a model? I will spare you the technical details, most of which I don't understand either. Here's what you need to know about models:

  1. Your model determines how your image will look.
    1. If you get an anime model, it will generate anime images
    2. If you get a realism model, it will generate images that look like a real photograph
  2. Each model "understands" different things.
    1. One model might interpret the prompt "Looking at camera" as having the main character in the image make eye contact with the viewer
    2. A different model might interpret the prompt as having the main character literally look at a physical camera object within the scene

Your base model is the most important thing in determining how your images will look. Here are some links to some example models (note, there are thousands and thousands of models available.)

Anime Models
    • This is a popular anime model.
    • This is also an anime model, however it produces a different style of illustration from the other model.
    • This anime model produces images in more of a '3D style'

Realism Models
    • This is the most popular realism model. However, I will have a section below specifically on Flux which covers some things you will need to know before using it.
    • While realism models don't technically have different 'styles' like anime does, it is important to note that different realism models produce different styles of realism. Some models might be better at creating old people. Some might produce exclusively studio photography style images. Some might produce more amateur style images of lower quality.


Generating Your First Image

Alright, with all that new knowledge in your head, I will provide a recommended model for the remainder of this tutorial.

We will use which is a very popular anime model that is based on Illustrious.

To download this, you will require an account on Civitai. Civitai is the primary space in which users in the AI community share models. Create an account and then continue on with this tutorial.

After you've created an account, to install this model, right-click here, and click 'Copy Link'

View attachment 4879958


Now, go back to Invoke and click here:

View attachment 4879960

Then, paste the link here, and click Install:

View attachment 4879968


Most models are around 6gb, however Flux is around 30gb.

When it is done, you will see it here:

View attachment 4879973


Now go back to the canvas by clicking here:

View attachment 4879976


You will see the model has been automatically selected for you. But if you chose to install other models too, you can select the model here:
View attachment 4879980


Now, enter these prompts:

  • Positive Prompt
    • masterpiece, best quality, highres, absurdres, hatsune miku, teal bikini, outdoors, beach, sunny, sand, ocean, sitting, straight on, umbrella, towel, feet
  • Negative Prompt
    • bad quality, worst quality, worst aesthetic, lowres, monochrome, greyscale, abstract, bad anatomy, bad hands, watermark
View attachment 4879997


And click 'Invoke'

Congratulations! You have made your first image:

View attachment 4880004


Now, you can create great AI art using only what you've seen so far and you're free to stop and experiment here. However, this is only the beginning of what you can do with AI.
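(Side note for anyone who prefers scripting: the same positive/negative prompt pair can also be run outside Invoke with the diffusers library. A rough sketch with a placeholder checkpoint path, assuming an SDXL/Illustrious-based model; this is not how Invoke works internally:)

Python:
import torch
from diffusers import StableDiffusionXLPipeline

# Placeholder path: point this at the anime checkpoint you downloaded from Civitai.
pipe = StableDiffusionXLPipeline.from_single_file(
    "path/to/downloaded_model.safetensors", torch_dtype=torch.float16
).to("cuda")

image = pipe(
    prompt=(
        "masterpiece, best quality, highres, absurdres, hatsune miku, teal bikini, "
        "outdoors, beach, sunny, sand, ocean, sitting, straight on, umbrella, towel, feet"
    ),
    negative_prompt=(
        "bad quality, worst quality, worst aesthetic, lowres, monochrome, greyscale, "
        "abstract, bad anatomy, bad hands, watermark"
    ),
).images[0]
image.save("first_image.png")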


In part 2, I will start to get into more tools and options you have available.
Is it free or paid?
 

jonUrban

New Member
Jan 19, 2020
If you're talking about Invoke, the OP says you can run the community edition locally for free; just download and install.
As for the checkpoints (models), they're free. The Stable Diffusion community (r/StableDiffusion) pretty much eschews non-FOSS software for local generation; online generation is a different matter for obvious reasons.
 