Hey Sepheyer, sorry to "ping" you, but I've just seen this post from you, and as I'm currently trying to get something like this working right now, I thought I'd just ask if you're willing to share your workflow on how you achieve that?
So here's what I'm trying to achieve: take any DAZ-rendered scene from a game and turn it into a more photo-like style.
I've watched so many ComfyUI tutorials in the past couple of days, and I remember seeing one in which a node was used to generate a "positive prompt" for any loaded image (but of course I can't find that specific tutorial anymore). I don't want to enter any prompt manually; I just want to load any rendered (or even drawn?) image, extract a "positive prompt" from it, and then feed it into a sampler using a photorealism model to basically generate a "real" copy of that image. I might need to add a combined/concatenated positive prompt in case I need to add some specifics, but I want to avoid that as much as possible.
I did see a lot of tutorials doing quite the opposite (turning a photo into a drawing or pencil sketch), but I haven't found a single one doing it the other way round. Well, I found one workflow on CivitAI that looked promising, but it was based on text-2-image and not image-2-image...
If you could share any details on how you would do that, that would be awesome!
Sure, here's my HS2-IRL babe converter. Ping me again in a week or so, it should get better still, as this is massively WIP and has two or three bugs. Still, completely fit for purpose as of right now.
And here's yet another question: it's quite easy and fun to replace faces in images, and I'm literally blown away by how easy it was. I replaced some faces in memes with some friends' faces while streaming to them on Discord, and we had so much fun.
But the method I used so far only replaces the actual face, not the hair. Is there any way to replace the face and the hair in an image, or do I have to use the mask feature to achieve that?
Again, thanks a lot in advance to whomever has some advice. And sorry if those are still noob-ish questions; I've searched through this thread (and other tutorials) but wasn't able to find a working solution...
But in theory it should work. I just learned about it, so I have my hopes up. I attached the workflow that selects the prompted item, but I haven't gotten around to doing swaps - it's roughly the same across those three files.
Can you post your face swap w/f? It should inform me how you think - whether in pixel space or in latents.
Unfortunately I haven't saved my workflow, but I will try to recreate it and then post it. Thanks a lot for your workflows though! I will test them as soon as I get some more time.
// Edit: I was able to separate the hair from the model using your workflow, but I'm also still trying to figure out how to replace it in the original image...
Going by the timestamps on the first and last image, it took over 15 hours to generate all the images, so next time I'll try something shorter.
So the basic idea was to take a video/clip of someone "dancing", split that into frames and process those to get the poses. In this case that gave >1400 pose images; some had to be cleaned out due to bad capturing or other issues, leaving slightly below 1400.
Then I'd use those pose images to generate a character and hope things worked out. Since the pose is all you really care about, you can keep a fixed seed, but it still causes quite a bit of variance in the output, so trying to keep a simple background is a bit of a challenge. In hindsight I probably should have used a lora of a character with a completely fixed look, including outfit, but as this was just intended as a concept/idea test it doesn't matter that much.
I'd intentionally set things up so that I had each pose as a separate file, keeping every generated image in the same numerical order and not depending on counters/batch processing, in case things broke or I had to redo some specific image.
Looping through each pose is simple enough anyway and not loading everything into memory also helps with potential OOM issues.
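For what it's worth, the per-file loop amounts to something like this - a minimal sketch where the folder names and the generate step are placeholders, since the actual rendering happens inside the workflow:

```python
from pathlib import Path

POSE_DIR = Path("poses")      # assumed folder of cleaned-up pose frames
OUT_DIR = Path("generated")   # outputs keep the same numeric filenames
OUT_DIR.mkdir(exist_ok=True)

def generate_image(pose_path: Path, out_path: Path) -> None:
    # Placeholder for the actual generation step (ComfyUI workflow, API
    # call, etc.) - swap in whatever renders one frame from one pose.
    print(f"would render {pose_path} -> {out_path}")

# One pose file at a time, so nothing big sits in memory, and skip frames
# that already exist so a crash or a redo only costs the missing ones.
for pose_path in sorted(POSE_DIR.glob("*.png")):
    out_path = OUT_DIR / pose_path.name
    if out_path.exists():
        continue
    generate_image(pose_path, out_path)
```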
To save time, and to test LCM while I was at it, I used just a normal SD1.5 model with an LCM lora, so images were generated in just 5 steps, same with face restore. In this case that lora did a fair job.
So after merging all the frames back together and some processing, I had a ~42 sec 60 fps clip of an AI woman roughly moving in the expected way, with some additional arm swinging and head warping due to not fixing enough of the pose images and the prompt.
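The merge step itself is just ffmpeg; a sketch assuming the generated frames are numbered frame_00001.png, frame_00002.png, ...:

```python
import subprocess

# Stitch the numbered frames back into a 60 fps clip. The frame pattern
# and output name are assumptions; -pix_fmt yuv420p keeps the result
# playable in most players.
subprocess.run([
    "ffmpeg",
    "-framerate", "60",
    "-i", "generated/frame_%05d.png",
    "-c:v", "libx264",
    "-pix_fmt", "yuv420p",
    "output_60fps.mp4",
], check=True)
```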
I can't post the file in full on the forum due to size, so I'm adding a downscaled 24 fps version and 2 full-size images. There are odd jumps/cuts and movements due to frames having to be cut out of the poses or bad generations. This wasn't a test of how perfect it would be, but of "will it work", so I didn't bother fixing all those things. And tbh, with 1400 poses and the same number of images, I'd rather not go over all of them multiple times just for this type of test.
There are at least some sections that aren't too bad.
Credit to Mr-Fox for his Kendra lora; he usually has a download link in his sig. Though I can't really say these images do her justice, but any PR is good PR, right...
So cool. I'm very happy to see my lora being used. Those skips and jumps make me think of old silent black-and-white movies, Charlie Chaplin etc. You could say it's an artistic choice and retro vintage style. There, fixed..
There are many different workflows and tools that have been trending during the year with animation as the focus and goal. There is a standalone piece of software called Ebsynth. If I understand it correctly, you only need to generate key frames and the software does the rest.
This link is only meant as an example; there might be much better guides and tutorial videos out there: [link requires forum registration]
The software homepage: [link requires forum registration]
An amazing img2img + Ebsynth video I found. Fair warning, it's pretty freaky: [link requires forum registration]
A knowledge resource and guide of sorts, with examples and different methods involving various software and extensions: [link requires forum registration]
Another video example of a person dancing: [link requires forum registration]
video2video guide on CivitAI: [link requires forum registration]
I know there are online communities dedicated only to animation and video making with Stable Diffusion, but I could not find the link or remember where I saw it.
So I've been playing around with some ideas on how to keep the background (and potentially other elements) static for animations.
This does seem like an option IF I can work out how to keep the moving element in the correct position. I.e. having someone walk across the image would mean you'd have to know the exact location of the pose for each frame, and so far I've not found a way to keep the pose's alignment within the "pose image", as it gets lost when removing the background. Anyway, one step at a time - that's how the AI does it too.
Just thought I'd share these as an example of how you can layer things like you would in something like Photoshop, and that your final image doesn't have to have "every pixel filled" - you can have empty/transparent sections. You can use this to make things "poke out the side", break out of frame in comic book style, or make non-square images.
These images use 3 different models: one SD1.5 model for the "workshop" background, Cindy uses a different SD1.5 model with a lora and a controlnet pose added in, and the sign at the bottom uses an SDXL model just to have a hope of it giving readable text.
The workflow is an absolute mess, but it's basically stacking image composites.
I'll see if I can make a cleaner version, but it's not that complex to add to things: create image A, create image B, use a composite node to combine them.
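For anyone who'd rather do the stacking outside ComfyUI, the same idea in Pillow (the filenames and offsets are just examples, and the character and sign layers are assumed to already have transparency):

```python
from PIL import Image

# Layers are pasted back-to-front; alpha_composite respects each layer's
# transparency, so empty areas stay see-through and elements can overlap
# or poke past the frame.
background = Image.open("workshop_background.png").convert("RGBA")
character = Image.open("cindy_cutout.png").convert("RGBA")
sign = Image.open("sign_sdxl.png").convert("RGBA")

canvas = background.copy()
canvas.alpha_composite(character, dest=(120, 40))                     # place the character layer
canvas.alpha_composite(sign, dest=(0, canvas.height - sign.height))   # sign along the bottom
canvas.save("composited.png")
```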
Might be useful for ppl planning to do AI games, to reuse backgrounds/rooms and position characters in them, who knows...
The key in that conversion workflow is the controlnet setting - how strong you want it and for how many steps it should be applied (between 0-100% of the steps). Both settings at maximum will get you an "exact" replica, while with lower settings ComfyUI will use more of the text prompt rather than the controlnet and thus will "dream" more.
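For anyone more comfortable reading code than node graphs, the same two knobs exist in the diffusers library as controlnet_conditioning_scale and control_guidance_start/end - a rough sketch, not the actual WF, and the model IDs are just examples:

```python
import torch
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
from diffusers.utils import load_image

# Example checkpoints - swap in whatever model/controlnet you actually use.
controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-canny", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
).to("cuda")

control_image = load_image("daz_render_edges.png")  # preprocessed control image

image = pipe(
    prompt="photo of a woman, natural skin texture, film grain",
    image=control_image,
    controlnet_conditioning_scale=0.6,  # "strength": 1.0 follows the control tightly, lower dreams more
    control_guidance_start=0.0,         # apply the controlnet from the first step...
    control_guidance_end=0.7,           # ...but release it for the last 30% of the steps
).images[0]
image.save("converted.png")
```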
Here's the same WF as before, but with looser constraints on what the image should be:
Thanks again for your workflow -- I've tried it, but it's not doing what I want it to do: change a DAZ render into a "photo".
I've tried tuning the strength of the controlnet nodes, but it's doing all kinds of stuff, because (I assume) you still have to manually add a lot of prompts describing what you see in the original picture to get a somewhat decent result.
Here's a quick idea of how I want my workflow to look:
I might need to add more nodes to the workflow to downscale/upscale/inpaint/outpaint the images etc., but you might get the basic idea of what I'm trying to achieve. The issue is the red node, which I know exists, but I don't know what it's called and which custom nodes I'll have to install to get that specific node.
Any chance you can attach the file as an image rather than keeping it as a thumbnail (or pastebin the image, please)? I keep getting the tiny-tiny thumbnail for some reason, even though for a second it opens up fullscreen at normal resolution. Prolly I messed up my browser somehow with recent updates, and I'm dreading the idea that I'll have to reinstall it.
Meanwhile, here is the source cartoon that I used to convert into those photos I posted above, so in theory this WF should handle DAZ.
Also, for establishing context - does this image look realistic to you? Because this is honestly what I call photo / realistic. If you prefer more than that, then sorry, I've got nothing for the time being.
Huh? You should be able to click on the thumbnail to get the full picture, and then you should be able to download it... But never mind, I'll edit my previous post to attach the image instead of the thumbnail.
Yes, it does. But maybe this will help as well: someone posted pictures in The Genesis Order thread; here are just two examples (first the original picture from the game, and then the "real" version of the same picture):
As you can see, the main picture composition (like the background, pose, lighting etc.) is still pretty much unchanged; it has just been turned into a more "realistic" picture. This is pretty much what I'm trying to achieve.
Here's the workflow and the image it converts DAZ to.
Lots of things can be changed with the prompt - milfiness, clothes, etc. BTW, the WF looks a tad different - results of today's efforts. Finally, enable the "styles" group - for lols if nothing else.
I'll run the WF for a few more iterations and will add more images to this post if I come across any decent ones.
That's pretty decent, I like it! But looking at your workflow, you did change your prompts quite a bit compared to your previous workflow, and I'm trying to avoid that. I really, really need to find that node that auto-generates prompts from any given image -- I guess I'll have to go through my YT and browser history trying to find that thing...
You're looking for something like an interrogation node. That'll analyze the image for you and create something you can use as a prompt. Double-click on the background in ComfyUI and search for wd14. You might need to install a node pack called ComfyUI WD 1.4 Tagger; it should be easy to find in the manager.
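(The WD14 tagger is a ComfyUI custom node, but the underlying idea is plain image interrogation. As a rough stand-in outside ComfyUI - using BLIP captioning via transformers instead of WD14 tags, so treat it as an illustration of the idea only:)

```python
from transformers import pipeline

# An image-captioning model playing the "interrogation node" role:
# it turns a loaded image into text that can be reused as a positive prompt.
captioner = pipeline("image-to-text", model="Salesforce/blip-image-captioning-base")

result = captioner("daz_render.png")           # local path or URL
positive_prompt = result[0]["generated_text"]
print(positive_prompt)                         # feed this into the sampler's positive conditioning
```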
Quick question: Is there a way to unload stuff (like models, loras etc.) from your running queue so it doesn't consume memory anymore? Or is that something that happens automatically?
I'm trying to replace a face in a video (I still have to find a way to replace the hair as well, but that's something I'm still working on...). I'm loading an image of a face and using a photorealistic model/checkpoint + conditioning + a sampler to make it more appealing and matching to the video. When done, I'm using the resulting face as input for "ReActor" to replace a face in the video (about ~1:25 minutes, roughly 2.1k frames at 25 fps). But I'm getting an out-of-memory error midway through the queue, and I was wondering if the model I loaded to redefine the face is still loaded and could somehow be "unloaded" to free up some memory (as it's not needed anymore to remodel the face for every single frame -- it's a once-per-run step).
I'm not sure if that would solve the problem, and while typing this I realized I could probably test it with two separate workflows: first create the face I want to use as a replacement, and then in another workflow use that (saved) face to replace the face in the video... But anyway, the question stands: can I somehow free up memory during the process, does this happen automatically, or do I need to split my single workflow into multiple workflows?
I've shortened the video to ~5 seconds / ~125 frames to test it, and my "single workflow" works perfectly fine for that. It's just longer videos that give me that out-of-memory error.
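For reference, the closest thing I know from plain PyTorch is manually deleting the model and clearing the CUDA cache - a sketch of what I mean (ComfyUI presumably handles its own caching, so this may not map 1:1):

```python
import gc
import torch

# Dummy stand-in for a checkpoint that is only needed for the one-off
# face-prep pass (a big module so the example actually occupies VRAM).
face_model = torch.nn.Linear(8192, 8192).to("cuda")

# ... run the one-off face-prep step with face_model here ...

# Drop every reference, collect Python garbage, then release cached VRAM.
del face_model
gc.collect()
torch.cuda.empty_cache()
print(f"{torch.cuda.memory_allocated() / 1024**2:.0f} MiB still allocated")
```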
// EDIT: Btw -- what GPUs are you guys using, and how much memory do you have?
I'm currently using an RTX 2080 Ti (11 GB of VRAM) on my main PC with 32 GB of memory, and I'm still trying to figure out if/how I can use my 2nd PC (GTX 1080 with 8 GB of VRAM and 32 GB of system memory) over the network...
I had a go at replicating the "render to real" setup I had in A1111, and tbh I think I got something very wrong with a bunch of controlnet nodes... anyway.
This is done just with the prompt from wd14; I added a negative prompt to "disallow" renders/cgi etc., but that's it. The rest is just the AI doing a bunch of stuff. I'd hoped to make it a simple "drop an image and hit go", but unfortunately it seems things are far more touchy than hoped. Strengths seem to be very impacted by the model, so atm it's gonna need quite a bit of tweaking. Different models put different things on the shirt, but they all agree it has to be green for some annoying reason, and at one point the AI decided that grass was a bad thing and turned it all to concrete.
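The non-controlnet core of it is basically img2img with the wd14 tags as the positive and a "no renders/cgi" negative; roughly, in diffusers terms (the model ID, tags and strength value are just examples):

```python
import torch
from diffusers import StableDiffusionImg2ImgPipeline
from diffusers.utils import load_image

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

init_image = load_image("render.png")        # the game render to "realify"
wd14_tags = "1girl, green shirt, outdoors"   # whatever the tagger produced

image = pipe(
    prompt=wd14_tags,
    negative_prompt="3d render, cgi, cartoon, anime, doll",  # "disallow" the rendered look
    image=init_image,
    strength=0.55,        # how far the model is allowed to drift from the original
    guidance_scale=7.0,
).images[0]
image.save("realified.png")
```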
Seems there's a long road to go still...