
Generating AI images with StableDiffusion - Beginner Guide | and with img2vid

5.00 star(s) 2 Votes

leerlauf

Newbie
Dec 13, 2019
42
15
85
Thanks for the tutorial, it was awesome and helped a lot!

I just started tinkering myself and have a couple of questions.

Is it better to have fewer prompts, or the most detailed prompt possible?

What are the best settings to tone down to speed up image generation while mitigating the quality loss of the generated pictures? I want to dabble until I find the characters I like.

What process do authors use when generating AI images to consistently get the same character's face?
For example, when I find a character I like through the AI generations, how can I efficiently make variants of it to use in a story?

If this has already been answered, just tell me where I can search. Thanks!
From my own experience it is far more important to have the correct prompts (the ones that correspond most closely to the training data) rather than a lot of them. You might still need a lot of prompts depending on the number of concrete details you want, but for simpler and more straightforward tasks it's best to keep things simple, and for really complex stuff you might just need to use Photoshop and a subsequent image-to-image generation.

When it comes to character consistency, your best bet is creating a character Lora. There are already a bunch of free character Loras around that you could use, and with certain styles you can get a consistent character just from prompting (many checkpoints can already do popular anime/cartoon characters because of their training data), but if you have something very specific in mind you might just need to train a Lora yourself, or commission somebody to do it for you.
 

idontjudgebro

Newbie
Jul 9, 2024
56
138
119
From my own experience it is far more important to have the correct prompts (the ones that correspond most closely to the training data) rather than a lot of them. You might still need a lot of prompts depending on the number of concrete details you want, but for simpler and more straightforward tasks it's best to keep things simple, and for really complex stuff you might just need to use Photoshop and a subsequent image-to-image generation.

When it comes to character consistency, your best bet is creating a character Lora. There are already a bunch of free character Loras around that you could use, and with certain styles you can get a consistent character just from prompting (many checkpoints can already do popular anime/cartoon characters because of their training data), but if you have something very specific in mind you might just need to train a Lora yourself, or commission somebody to do it for you.
Thanks for the answers, I appreciate it.

Is creating a Lora complicated? It seems like it, given the way you wrote that we can commission someone to do it.
 

leerlauf

Newbie
Dec 13, 2019
42
15
85
Thanks for the answers, I appreciate it.

Is creating a Lora complicated? It seems like it, given the way you wrote that we can commission someone to do it.
I haven't created a Lora myself yet, but from what I have heard it's not too complicated. It might take you some trial and error to get it right. The first step is to create some good training data, meaning a bunch of pictures of the character you want in different poses and from different angles. Use Photoshop and img2img to fix these pics up until they look like they depict the same character (I've heard you need about 30 images for a character Lora, but again, that's just me quoting some guides I read online; I haven't done it myself yet).

You can also look around the commission section and see if you can find an AI creator who can train a Lora for you, if you don't feel up for it yourself.
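If you want to script that img2img variation pass instead of doing it all by hand, here is a minimal sketch. It assumes an AUTOMATIC1111 webui running locally with the --api flag (the same flag used later in this thread); the file names, tags and denoising values are just example placeholders.
Code:
# Sketch: batch-generate character variations for Lora training data
# via the AUTOMATIC1111 img2img API (webui started with --api).
import base64
import requests

API = "http://127.0.0.1:7860"  # default local webui address
init = base64.b64encode(open("my_character.png", "rb").read()).decode()

# Lower denoising strength stays closer to the original picture.
for i, strength in enumerate([0.3, 0.45, 0.6]):
    payload = {
        "init_images": [init],
        "prompt": "1girl, red hair, green eyes, standing, from side",  # example tags
        "negative_prompt": "lowres, bad anatomy",
        "denoising_strength": strength,
        "steps": 25,
        "seed": -1,
    }
    r = requests.post(f"{API}/sdapi/v1/img2img", json=payload, timeout=600)
    with open(f"variation_{i}.png", "wb") as f:
        f.write(base64.b64decode(r.json()["images"][0]))

Run it a few times with different prompts (poses, angles, outfits) and you can quickly build up the roughly 30 images the guides mention.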
 

idontjudgebro

Newbie
Jul 9, 2024
56
138
119
I haven't created a Lora myself yet, but from what I have heard it's not too complicated. It might take you some trial and error to get it right. The first step is to create some good training data, meaning a bunch of pictures of the character you want in different poses and from different angles. Use Photoshop and img2img to fix these pics up until they look like they depict the same character (I've heard you need about 30 images for a character Lora, but again, that's just me quoting some guides I read online; I haven't done it myself yet).

You can also look around the commission section and see if you can find an AI creator who can train a Lora for you, if you don't feel up for it yourself.
Thanks for the helpful tips again, they really help.

Another question: I want to add text messages to my novel, but I'm struggling to adapt the size to the UI.

I'm using "yet another phone for renpy".

Is there a default config? Furthermore, the default textbox in the zip file on the official git is only a small one, so what is written just spills out of the box. How can I fix that?
 

Psan2022

Member
Mar 8, 2022
111
168
167
#1 Addition to generating images with Stable Diffusion
(How to make videos from your images)

So you have been generating images with Stable Diffusion, but you want to make short videos from them? I have a little something for you. This method only uses 6 GB of VRAM, so you could theoretically use a weaker graphics card.

What you need:
1. Download FramePack ->
2. A good graphics card and/or processor is highly recommended, but a weaker one should also work as long as it has at least 6 GB of VRAM.

Installation:
1. Download the archive under the Installation section. Unpack the zip folder and put the unpacked folder somewhere with a lot of free space. You need at least 40 GB of free space later on!
2. Once the folder is in the desired path, click on update.bat. It will now download everything it needs to function. That may take a while, since it downloads about 30-40 GB of data.
3. After it has updated, you can close the cmd window and open run.bat.
4. It will open a browser window (similar to Stable Diffusion).


Overview:
[screenshot: FramePack web UI overview]

UI:

It is very similar to what you know from Stable Diffusion:
- In the top left you put in your desired picture.
- In the top right you get your end product.
- In Prompt you give a description of what you want.
- You can choose a video length of up to 120 seconds (2 minutes).
- In Seed you can put in whatever you want to randomize your output.
- Steps determines the number of sampling steps of the generation (the more you choose, the longer it takes to generate).
- Distilled CFG Scale is how much freedom you give your PC to decide the outcome.
- MP4 Compression determines the image quality of your video.

How to generate:
It is quite straightforward.
1. You generate a picture with Stable Diffusion or with another txt2img generator (see the scripted sketch after this list).
2. You put your image into the upper left corner (you can also put in photos and other pictures).
3. Set the prompt to describe what is going on in the image.
4. Select a length for the video.
5. Press Generate.
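For step 1, here is a minimal sketch of creating the starting image through the AUTOMATIC1111 txt2img API instead of the browser UI; the prompt and output file name are just examples, and FramePack itself is still driven through its own page as described above.
Code:
# Sketch: generate the FramePack input image via the AUTOMATIC1111
# txt2img API (webui started with --api).
import base64
import requests

payload = {
    "prompt": "a woman dancing on a beach at sunset, photorealistic",  # example prompt
    "negative_prompt": "lowres, blurry",
    "width": 768,
    "height": 768,
    "steps": 30,
    "cfg_scale": 7,
}
r = requests.post("http://127.0.0.1:7860/sdapi/v1/txt2img", json=payload, timeout=600)
with open("framepack_input.png", "wb") as f:
    f.write(base64.b64decode(r.json()["images"][0]))
# Drop framepack_input.png into the top-left image slot and describe the motion in Prompt.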

Here is what I got with just 2 prompts:


It could use some work, but for a quick and dirty method it does not seem bad at all.
You can find your outputs in:
YourDirectory\framepack_cu126_torch26\webui\outputs

I hope you found that little excursion as interesting as I have.
 

lordbolton

Newbie
Sep 8, 2017
33
28
197
copied from Civitai
Wan Self Forcing Rank 16 (Accelerator)



I cannot believe how well this lora works: it cuts video generation time down to about 3 minutes per video, where before it was 25 minutes.
 

pussydestroya

Newbie
Mar 7, 2018
45
59
161
#1 Addition to generating images with Stable Diffusion
(How to make videos from your images)

...
Any fix for AMD users?
It seems to only run with Nvidia drivers/GPUs.
 

Psan2022

Member
Mar 8, 2022
111
168
167
Any fix for AMD users?
It seems to only run with Nvidia drivers/GPUs.
Well, there is an AMD fork of FramePack. I can't say how well it works, or if it works at all, but I reckon it's worth a try if you have an AMD GPU.
 

Fawakes

New Member
Sep 9, 2025
1
0
1
I'm from Brazil, sorry for my bad English via Google Translate, but why do my images look like this?

Stable Diffusion checkpoint: v1-5-pruned-emaonly.safetensors [6ce0161689]
Model: face_yolov8s.pt

[two attached example images]


Help Please
 

leerlauf

Newbie
Dec 13, 2019
42
15
85
I'm from Brazil, sorry for my bad English via Google Translate, but why do my images look like this?

Stable Diffusion checkpoint: v1-5-pruned-emaonly.safetensors [6ce0161689]
Model: face_yolov8s.pt

Help Please
Well, you're using a plain 1.5 checkpoint. Try pony prompts and a pony lora; run the same prompt with some of those instead.
 

its_not_real

Member
Game Developer
May 14, 2023
110
308
179
Yeah... That model was the absolute first model they released for SD 1.5, years ago.

Check out SDXL models (like the pony one mentioned above), or if you have an xx90 card (lots of memory) maybe even check out FLUX...
 

Psan2022

Member
Mar 8, 2022
111
168
167
I'm from Brazil, sorry for my bad English via Google Translate, but why do my images look like this?

Stable Diffusion checkpoint: v1-5-pruned-emaonly.safetensors [6ce0161689]
Model: face_yolov8s.pt

Help Please

Well, you're using a plain 1.5 checkpoint. Try pony prompts and a pony lora; run the same prompt with some of those instead.
^ This, and I assume you are using too high a resolution for an SD 1.5 checkpoint. Characters will duplicate, grow extra limbs, and look grotesque like that when the resolution is too high. If you need a bigger resolution, generate at the native size and use an upscaler after generating the image.
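As a concrete illustration, here is a minimal sketch that keeps the base generation at SD 1.5's native 512 px and lets the webui's built-in hires fix handle the upscale, via the AUTOMATIC1111 txt2img API; the prompt, scale and upscaler name are just example choices, pick whatever you have installed.
Code:
# Sketch: generate at 512px, then let hires fix upscale to 1024px
# (AUTOMATIC1111 webui started with --api).
import base64
import requests

payload = {
    "prompt": "portrait of a woman, detailed face",  # example prompt
    "width": 512,                  # native SD 1.5 size avoids duplicated limbs
    "height": 512,
    "steps": 25,
    "enable_hr": True,             # hires fix: second upscaling pass
    "hr_scale": 2,                 # 512 -> 1024
    "hr_upscaler": "Latent",
    "denoising_strength": 0.5,
    "hr_second_pass_steps": 15,
}
r = requests.post("http://127.0.0.1:7860/sdapi/v1/txt2img", json=payload, timeout=600)
with open("upscaled.png", "wb") as f:
    f.write(base64.b64decode(r.json()["images"][0]))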
 

JupiterSoda19

Newbie
May 22, 2025
20
14
22
Character Generation WebUI

Syncs with Automatic1111 to manage characters, prompts, and generation sessions.

Quick Start
- Install Python 3.11 and launch Stable Diffusion with the API enabled (you have to add --api to the CLI args in webui-user.bat):
Code:
@echo off
set PYTHON=
set GIT=
set VENV_DIR=
rem expose the webui API so external tools can connect to it
set COMMANDLINE_ARGS=--api
call webui.bat
- Copy the contents of the rar into the StableDiffusion folder.
- Run `custom\run_webui.bat` to create the venv, install deps, and open Streamlit on ` `.
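Once the webui is up, a quick way to confirm the --api flag took effect is to hit the standard AUTOMATIC1111 endpoint that lists checkpoints (a minimal sketch; the URL assumes the default local port):
Code:
# Sketch: verify the webui API is reachable before launching the tool.
import requests

r = requests.get("http://127.0.0.1:7860/sdapi/v1/sd-models", timeout=10)
r.raise_for_status()
for m in r.json():
    print(m["model_name"])  # checkpoints visible to the API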

UI Highlights
- Prompt generator with style/rating/complexity/gender controls.
- Generation queue running Base → Pose → Clothing stages with live progress.
- Session dashboard plus Organize Gallery button to recover files for failed runs.


I made this tool to manage my generations, group them into my own folders, and track some of the generations I make. It is made for PonyXL tags. Accessing the URL, you can create prompts that are cleaned up for the model, randomize and generate prompts, and then use them to generate images following this thread's tutorial. If you want to organize the files that were generated, there is a gallery cleaner section under the Session tab.
[two UI screenshots attached]
 

leerlauf

Newbie
Dec 13, 2019
42
15
85
Are there better choices than going with ComfyUI?
You can use Forge if you want something with a less complicated UI. It's not necessarily better than ComfyUI - in fact it is probably lagging behind when it comes to the integration of a couple of things - but it is a lot less of a hassle to set up and use. I personally find myself switching back and forth between Comfy and Forge depending on the task I need to do.
 

its_not_real

Member
Game Developer
May 14, 2023
110
308
179
Are there better choices than going with ComfyUI?
Depends what you mean by "better".
You can look up (I have not used it myself; I do all my images in Forge).
I see no reason to switch from Forge, and I have no need to risk my system by downloading and running un-vetted Python code (workflows and such). Forge works with SD 1.5, SDXL models (like Pony and Illustrious), FLUX and Chroma. The dev of Forge is also the creator of ControlNet, etc.
The only thing I use Comfy for is video generation.
 
Mar 3, 2025
64
93
27
I've been playing around with Perchance image generator, which is frustrating as it doesn't really remember or learn. It works on my Android tablets, though, so I can play with it while traveling. Despite its limitations, I've managed to produce some decent stuff.
 

giqui

Conversation Conqueror
Compressor
Nov 9, 2019
6,977
47,793
883
I've been playing around with Perchance image generator, which is frustrating as it doesn't really remember or learn. It works on my Android tablets, though, so I can play with it while traveling. Despite its limitations, I've managed to produce some decent stuff.
The problem with using online tools to generate images is that not everything can be generated; guidelines prevent any prompts that bring up anything related to nudity or sex. I use , which comes with PyTorch 2.4. It works very well on my simple graphics card, a 6 GB RTX 3050. NVIDIA cards are the most compatible for working with AI. It is easy to set up, comes with many additional tools, and they fit perfectly into it. In addition, there are hundreds of models available on CIVITAI.

I ran a test on the website, typing: "A woman walking naked on the beach." As expected, the guidelines blocked it.

[screenshot of the blocked prompt]
 
Mar 3, 2025
64
93
27
The problem with using online tools to generate images is that not everything can be generated; guidelines prevent any prompts that bring up anything related to nudity or sex. I use , which comes with PyTorch 2.4. It works very well on my simple graphics card, a 6 GB RTX 3050. NVIDIA cards are the most compatible for working with AI. It is easy to set up, comes with many additional tools, and they fit perfectly into it. In addition, there are hundreds of models available on CIVITAI.

I ran a test on the website, typing: "A woman walking naked on the beach." As expected, the guidelines blocked it.
I've managed to get it to do some pretty explicit stuff, but it takes some word manipulation. This seems to be a field where literary artistry is more useful than graphic artistry.

[five example images attached]
 