[Stable Diffusion] Prompt Sharing and Learning Thread

osanaiko

Engaged Member
Modder
Jul 4, 2017
2,687
4,987
Appreciate the feedback. Thank you. When I have some free time, i'll give a better go at the prompts.
Here is what I got with the JuggernautXL model and this prompt, basically yours with the periods changed to commas and a couple of extra terms:

"Tifa, final fantasy tifa, gigantic breasts, naked, At pool, During the day, Tifa is wet, dripping, sweating, Hair wet, resort, sunbathing people in background, 3d anime style", using Euler A, 20 Steps

00012-1173108244.png
 
  • Like
Reactions: Sepheyer

razfaz

Member
Mar 24, 2021
200
212
Hello Thread/People,

[Question]
- Why do only a few people use 3D scenes with humanoid 3D models to create their own Stable Diffusion datasets to train their own models?
 

osanaiko

Engaged Member
Modder
Jul 4, 2017
2,687
4,987
As far as I understand:

There are a number of base "diffusion models" created by commercial entities at a cost of millions of dollars in GPU time. Stability AI released Stable Diffusion for "free", and all the hobbyist stuff started from there.

Techniques were then found to "finetune" a model, building on an existing base diffusion model to specialize its output or improve some aspect of it. This requires a significant effort in image curation and labeling, and then still needs a large hardware investment (for example, 80 hours on 8x A100s).
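For the curation/labeling side, here's a minimal sketch of one common layout, assuming the Hugging Face "ImageFolder with captions" convention (a metadata.jsonl next to the images). The filenames and captions are made up for illustration:

```python
# Minimal sketch of a labeled finetuning dataset, assuming the Hugging Face
# ImageFolder convention: images plus a metadata.jsonl mapping file -> caption.
# Filenames and captions below are hypothetical.
import json
from pathlib import Path

dataset_dir = Path("train_data")  # contains img_0001.png, img_0002.png, ...
captions = {
    "img_0001.png": "a woman standing by a resort pool, photorealistic",
    "img_0002.png": "close-up portrait, soft studio lighting, 3d anime style",
}

# One JSON object per line; training scripts read the "text" field as the caption.
with open(dataset_dir / "metadata.jsonl", "w") as f:
    for file_name, text in captions.items():
        f.write(json.dumps({"file_name": file_name, "text": text}) + "\n")
```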

Then there is LoRA creation (which to me is still somewhat mysterious), which allows an additional focusing of the end-stage diffusion model's output. This requires a few hundred labelled images and can theoretically be done with a beefy home setup (e.g. 4x 3090s).
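The inference side of LoRA, at least, is straightforward. A minimal sketch using diffusers, assuming you already have a trained LoRA file; the LoRA path and trigger word are placeholders:

```python
# Minimal sketch of applying an already-trained character LoRA at inference time.
# The base checkpoint id is real; the LoRA file path and trigger word are placeholders.
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
).to("cuda")

# Attach the low-rank adapter weights on top of the base model's attention layers
pipe.load_lora_weights("path/to/my_character_lora.safetensors")  # hypothetical path

image = pipe(
    "mychar standing at a resort pool, 3d anime style",  # "mychar" = hypothetical trigger word
    num_inference_steps=20,
    cross_attention_kwargs={"scale": 0.8},  # one way to set LoRA strength in diffusers
).images[0]
```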

- Why do only a few people use 3D scenes with humanoid 3D models to create their own Stable Diffusion datasets to train their own models?
I guess because it's a lot of effort to get the training images sorted out, and you need significant hardware / time to learn how to set up and run the training using cloud GPU resources.

And for what benefit? You'll get a LoRA that can make your (e.g.) PonyXL output look fairly close to a specific Daz-rendered character. But you'll still have all the normal diffusion problems: hands, faces, everything looking weird, inconsistent clothing and backgrounds. It will take plenty of time and post-work to get good results from your model.

If you have the skills to make a good set of training images with e.g. Daz, you are probably good enough to just use that process for ALL your images. Even put them through a light-touch diffusion img2img process to get the AI look, if that's what you want.
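For that last step, a minimal img2img sketch with diffusers; low strength keeps the Daz render's composition and mostly restyles the surface. Model id and filenames are placeholders:

```python
# Minimal sketch of a "light touch" img2img pass over a Daz render.
# Low strength keeps the render's composition and mostly restyles the surface.
import torch
from diffusers import StableDiffusionXLImg2ImgPipeline
from diffusers.utils import load_image

pipe = StableDiffusionXLImg2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
).to("cuda")

init_image = load_image("daz_render_0001.png")  # hypothetical filename
image = pipe(
    prompt="anime style, clean lineart, soft shading",
    image=init_image,
    strength=0.3,  # low denoise = "light touch"; higher values drift further from the render
    num_inference_steps=20,
).images[0]
image.save("daz_render_0001_ai.png")
```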
 
  • Like
Reactions: razfaz

razfaz

Member
Mar 24, 2021
200
212
Yeah, exactly, that's why I asked. Finetuning is fine but has the issues you mentioned.
I guess we have to live with the current pipeline till training becomes cheaper.
 
  • Like
Reactions: osanaiko

Sepheyer

Well-Known Member
Dec 21, 2020
1,605
3,838
Here's Alibaba's AI chat/image generator Qwen - supposedly the very most bestest as of today:



Requires a sign-up; I don't have the energy rn, so no idea if it is free to use as advertised.
 

osanaiko

Engaged Member
Modder
Jul 4, 2017
2,687
4,987
Here's Alibaba's AI chat/image generator Qwen - supposedly the very most bestest as of today:



Requires a sign-up; I don't have the energy rn, so no idea if it is free to use as advertised.
Funnily enough, it refuses to let me make the image I really wanted...

1738227849852.png

1738228051711.png
 
  • Like
Reactions: Sepheyer

Sepheyer

Well-Known Member
Dec 21, 2020
1,605
3,838
Funnily enough, it refuses to let me make the image I really wanted...

View attachment 4497026

View attachment 4497027
Oh, yeah, there are a bunch of taboo no-go zones: it can't draw Xi, or anything disputed from China's recent history. Kinda like asking ChatGPT whether one can be proud of being white, or why three buildings fell while there "were" only two planes.

If you want tanks on a highway, try Abrams tanks on an Iraqi highway.
 
  • Haha
Reactions: osanaiko