Sharinel
Active Member
- Dec 23, 2018
I'm running a 4090 as well and tbh I don't think it has the VRAM to do 10-20 second videos. At the moment I'm producing 5-second videos at 768x768 (or an equivalent ratio) that take around 4 minutes each. I've attached the json file I use; of the LoRAs on the list only the top one is needed, the rest are NSFW LoRAs for specific things.

I've been using ComfyUI for quite a while. I've taught myself how to create (simple) workflows to generate images, upscale images, use ControlNet to reproduce existing poses or even animations with pre-defined images, do face replacements and all that, all with SD(XL) models. It's pretty amazing what you can do, and I love it! I even replaced my NVIDIA RTX 2080 Ti with an NVIDIA RTX 4090 to circumvent the 2080 Ti's limited VRAM. But that's where I stopped teaching myself new things. I don't know anything about Pony, Illustrious, Qwen or anything like that; I'm just seeing your posts and I'm... wow!
And now there's Image-2-Video and Text-2-Video using Wan 2.1/2.2, which I'm very interested in, but I'm totally lost! So here I am asking you guys if you can help me out.
First, I started with the official Wan templates like "Wan 2.2 14B Text to Video" or "Wan 2.2 14B Image to Video", which can be found in the "Templates" section of the ComfyUI menu. I've downloaded the models for them, and while those workflows do work, they seem slow and "limited" when it comes to the length and resolution of the video. And because you'll usually want to create a couple of videos with the same prompt to pick the best result, this can take many hours, maybe even a couple of days, to get the video you're looking for. And it's only about 5 seconds long...
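For anyone wondering where the ~5-second ceiling comes from: as far as I understand it, the 14B Wan models generate at 16 fps and expect a frame count of the form 4k+1 (because of the VAE's 4x temporal compression), so the usual 81-frame default works out to about 5 seconds. A quick back-of-the-envelope sketch (the 4k+1 rule and the 16 fps figure are my understanding, not official numbers — check your model's docs):

```python
# Rough duration math for Wan video generation.
# Assumption: frame counts must be of the form 4*k + 1 (Wan's 4x
# temporal VAE compression, as I understand it) at a native 16 fps.
FPS = 16  # assumed native frame rate of the Wan 2.x 14B models

def valid_frame_count(seconds: float) -> int:
    """Nearest frame count of the form 4*k + 1 for a target duration."""
    target = round(seconds * FPS)
    k = round((target - 1) / 4)
    return 4 * k + 1

for secs in (5, 10, 20):
    frames = valid_frame_count(secs)
    print(f"{secs:>2}s target -> {frames} frames (~{frames / FPS:.2f}s)")
# 5s -> 81 frames; 10s -> 161; 20s -> 321, i.e. 4x the compute of a 5s clip
```

So a 20-second clip is roughly four times the frames (and VRAM for latents) of the 5-second default, which is why the templates feel capped.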
Next I was looking for some "optimized" workflows people share online, and first I found a set of workflows on civit.ai (You must be registered to see the links), which I've been trying out. I do like the included "WAN 2.2 I2V" workflow, because it seems faster and has more options, but I still feel limited when it comes to the resolution and length of the video, because it uses ".safetensors" models, which use a lot of VRAM. I can still get 5-second videos at a decent resolution, or a longer video at a poor resolution.
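To put some rough numbers on why the full .safetensors checkpoints stress even a 24 GB card: a 14B-parameter model in fp16 needs roughly 26 GB for the weights alone, before latents, activations, text encoder and VAE. The sketch below is back-of-the-envelope only — the bytes-per-weight figures for the GGUF quant levels are approximate averages, not exact:

```python
# Approximate weight memory for a 14B-parameter video model at
# different precisions. GGUF bytes-per-parameter values are rough
# averages (quant blocks carry scale overhead), not exact figures.
PARAMS = 14e9

precisions = {
    "fp16 (.safetensors)": 2.0,
    "fp8": 1.0,
    "GGUF Q8_0": 1.07,    # ~8.5 bits/weight, approximate
    "GGUF Q4_K_M": 0.57,  # ~4.5 bits/weight, approximate
}

for name, bytes_per_param in precisions.items():
    gb = PARAMS * bytes_per_param / 1024**3
    print(f"{name:<22} ~{gb:.1f} GB of weights")
```

Which is why the fp16 checkpoint alone barely fits a 4090, while a Q4-ish GGUF leaves most of the 24 GB free for more frames or higher resolution.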
Then I thought I might go for GGUF models instead, because from what I understand they use less VRAM, but they are "compressed" and therefore might take longer. I don't mind waiting a couple of minutes for results if I can use more frames (= longer videos) or a higher resolution than with the "default" workflow. So I found another workflow (You must be registered to see the links), which is very impressive: it uses GGUF, has a bunch of options, and after downloading all the missing nodes and models (as well as fixing a "bug" in the workflow itself), it produces decent results within a couple of minutes.

I've been able to create a few videos of 20+ seconds (at 24 FPS) with a resolution of 480x800. But as soon as I add action prompts for the camera or the subject in the picture (btw: no additional LoRAs are involved), the video gets blurry (it looks like a double or even multiple exposure, in photography terms) or it just doesn't follow the prompt. For example, if the prompt says "the camera slowly zooms in toward the woman's face", it zooms in for about 3 seconds, then zooms back out and repeats those steps until the end of the clip — even if I add something like "at second 5, the camera stops completely and remains entirely static for the rest of the video. there is no zooming, panning, or movement after this point — the frame stays locked on her face."
So here are my questions:
- What's your overall workflow to create a 10-20+ second high-resolution video based on your imagination/prompt? The resulting video should be produced in a couple of minutes (5-15 minutes at most, not hours).
- What Text-2-Image workflow do you use to create your starting image?
- What's your Image-2-Video workflow to produce a 10-20+ second video at a decent (720p) resolution?
- What's your workflow to upscale the video to an HD resolution (1280p or even 1440p)?
- What prompt (or LoRA) do you use to consistently "control" camera movements (zoom in, zoom out, stay static at a close-up, etc.)?

Any help is highly appreciated. I would love to end up with 3-4 workflows in total (1: create a starting/ending image for the video; 2: create an at least 10-20+ second video with "precise" camera movement; 3: upscale the video to at least 1280p).
TL;DR: if you share your workflows to create a 20+ second video with precise camera (and subject) actions, or can point me in the right direction for further research, I will be in your debt forever!
The good thing about this workflow is that you get a final image which you can then use to kick off the next video, and it also uses interpolation to increase the fps/video size. If you've been downloading the models already, you probably have these.
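On the interpolation point: generating at the model's native frame rate and then interpolating (e.g. with a RIFE-style node) is far cheaper than generating the extra frames directly. A quick sketch of the arithmetic — the node names in ComfyUI vary, but the formula below is the generic one for n-times interpolation (the 81-frame/16 fps starting point is just a typical Wan 2.x example, not something from this workflow specifically):

```python
# Frame-interpolation math: n-times interpolation inserts (n - 1) new
# frames between each adjacent pair, so N frames become (N - 1)*n + 1.
def interpolated(frames: int, factor: int) -> int:
    """Frame count after factor-x interpolation (generic formula)."""
    return (frames - 1) * factor + 1

native_frames, native_fps = 81, 16  # assumed typical Wan 2.x clip
out_frames = interpolated(native_frames, 2)
print(f"{native_frames} frames @ {native_fps} fps "
      f"-> {out_frames} frames @ {native_fps * 2} fps "
      f"(same ~{native_frames / native_fps:.1f}s clip)")
# 2x interpolation: 81 frames -> 161 frames, smoother motion, no extra diffusion steps
```

That's how a clip sampled at 16 fps can end up as smooth 24 or 32 fps output without paying the diffusion cost for those in-between frames.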
Here is an example of the output
View attachment Deadwood Vibes Video 02.mp4