[Stable Diffusion] Prompt Sharing and Learning Thread

me3

Member
Dec 31, 2016
316
708
We did get your point. What I pointed out was that you are treating what are really just creative choices made by Sepheyer as a technological limitation or a problem with the process, and they were very clear about what those choices were. Stable Diffusion doesn't have a problem making those images look more similar to the Honey Select source images; it's perfectly capable of achieving that, but they didn't aim to make those images as similar as you would like.

You can't just point out what you don't like about their images and complain that it doesn't fit what you would like to see happen. If there is something you'd like to accomplish you should give it a try yourself, and if you're having issues reaching specific goals we can help you get there. But pointing out "problems" on other people's images because they don't fit your goals is pretty pointless, because they are not trying to achieve what you want to achieve.
No, you DIDN'T get my point, you missed it by miles, which you further illustrated with this post.
I don't have a problem with Sepheyer's images or choices; my point isn't about that at all, which is again why I've repeatedly stated you missed it and why I said I wouldn't waste ppl's time with it.
But you wrongfully accusing me of having a problem with Sepheyer's work makes me have to bring things up again.
If you'd bothered checking you'd see that I'd actually liked the post, and I don't go around doing that unintentionally ;)
 
  • Like
Reactions: Mr-Fox

Sepheyer

Well-Known Member
Dec 21, 2020
1,582
3,806
This seems to have the same "issues" as other "convert to real" methods. It converts the pose, the background is fairly alike, and the character is wearing pretty much the same clothing, hair etc. However, the face isn't really all that close.
The face your rendered character has isn't in any way an "unrealistic anime" shape or features, yet when you look at the "real" version it hasn't even kept the basic shape of the face. The face is more rounded and "shorter", the chin is different, eyes, lips; some of this could be prompting related, sure, but AI is meant to be good at reading faces (scanners/cameras etc.), yet for things like this it doesn't seem to keep even the proportions "correct", which is exactly what is used for comparing faces.
If you overlay the original cartoon and the photorealistic... mmm... what would be a good name for the output of the workflow? Re-print? Aight, so the photorealistic re-print - they match as well as fingerprints do. Naturally, not all re-prints give good fits, but I'm sure there's a hit ratio that can be improved on.

ezgif-1-266359ac9b.gif
 
Last edited:
  • Like
Reactions: Mr-Fox

hkennereth

Member
Mar 3, 2019
239
784
No, you DIDN'T get my point, you missed it by miles, which you further illustrated with this post.
I don't have a problem with Sepheyer's images or choices; my point isn't about that at all, which is again why I've repeatedly stated you missed it and why I said I wouldn't waste ppl's time with it.
But you wrongfully accusing me of having a problem with Sepheyer's work makes me have to bring things up again.
If you'd bothered checking you'd see that I'd actually liked the post, and I don't go around doing that unintentionally ;)
Sure, because there is clearly some hidden meaning in the sentence "the face isn't really all that close" that isn't addressed by everything we said... and that sentence also doesn't show that even if you like the images, you DO have ONE issue with them. That isn't an accusation, I'm just pointing out your own words. :rolleyes:

If you mean something other than what you said, please, do feel free to clarify, but we can only take you at your word. And your words were, and I am indeed paraphrasing from the quoted sentence above and the rest of your post, that you think "the processed images don't look enough like the source ones", which was thoroughly explained as to WHY that happens.

I'm not trying to be antagonistic (though perhaps a bit frustrated); I would in fact love to help you get the results you seem to want, because I have been doing this for about as long as the tech has been available, but I can't do that unless you are clear about what you want and ask questions without pre-assumptions such as the ones we discussed above. Saying "you just don't get me" doesn't help anyone involved.
 
  • Like
Reactions: Mr-Fox and Sepheyer

Sepheyer

Well-Known Member
Dec 21, 2020
1,582
3,806
[Long post warning]
The confusing and "laughable" nature of training guides.

Having read an ever-increasing number of "guides" and posts/comments on guides regarding picking images, which optimizers to use, captioning, learning rates and other settings, the only things I've really learned are hundreds of things that don't work (in a large number of situations) and that most of these guides/instructions are just pointless.

I can't really decide if much of this is down to general ignorance and ppl simply having no idea what's going on, or if it's intentional. I've seen many cases where ppl claim that this and that works amazingly and that they can use the same setup for everything with perfect results. They then link to something meant to show off this amazing work, and very often it shows loras etc. of some known character and it looks nothing like them. Other times there is an actual likeness, so you might think this setup actually does work and you give it a go; after all, it's meant to work for everything. Big shock, it doesn't seem to work. So you download one of the loras and check the metadata, just to confirm the settings.
This generally has a few possible outcomes:
  1. The settings do match - fair enough, as there are loads of outside variables affecting things (more on that later)
  2. They don't match, often by a lot, and they don't even match other loras from the same person
  3. The metadata is missing, which is generally something done intentionally, either through editing or by extracting the lora from something else. Both strongly suggest they used other training methods, Dreambooth being a common explanation.
#2 can potentially be explained by "evolving tools" or by having gained new insight, but shouldn't that also have been updated in the guide?

If you compare guides you also come across conflicting claims. E.g. you have two guides with the same basic rates, optimizers, number-of-image recommendations etc., yet one of them says it should take ~1000 steps to be perfect and the other says 4-5000. Assuming one of them is correct, the other will be either very undertrained or overtrained... both can't really be right :(

A lot of guides are posted in places where ppl can give feedback, often with improvements/suggestions of their own, or "corrections". Great, since this means there's more data to work with. What does get a bit suspicious, though, is when the authors respond to "praise" and to chances to advertise things they can "profit" from (i.e. youtube videos etc.), but completely ignore the issues raised. Adding these things together, you almost get the impression that there are ppl intentionally posting things that are misleading or lacking, to make ppl fail and/or repeatedly have to review the instructions, while their own work gets "propped up" and they profit in terms of downloads and views... hmmm, nah, ppl can't be that petty and self-centered, right...

Since I figured out how I could actually do training on my very limited setup, I've been trying to find some fairly basic and consistent way to get ppl at least a fair bit of the way. BUT, as I mentioned early on, guides seem pointless, so I guess I'll rant about it instead and ppl can just skip over it...

HOWEVER, what might be more useful is knowing the things that can screw things up, so ppl don't needlessly waste months trying to figure them out and spend hours upon hours burning out their GPU...

  1. There seems to be an issue with kohya_ss and training SD1.5. Exactly when it started is a bit uncertain, since the last working version for some is from April, for others from June. Personally, the last version I've gotten to work and that trains fairly well is from June. There also seems to be some disagreement about what is causing it. Some claim it's related to newer versions of bitsandbytes, but I've updated that on my working version and there's no real difference. The issue also affects other optimizers, so that can't be the only cause. But it's worth considering if you've got issues training. The latest version also seems to have broken SDXL, but that might be fixed fairly fast since XL seems to be the main priority.
  2. If you do follow a guide, keep in mind that not only do your images and captions make a difference, but so do the versions of the tools you are using. That includes the different libs they depend on.
  3. Relating to #1 and #2, if you have something that works for you, be VERY careful about updating. Yes, you can just go back to a previous commit, but then you'll have to keep a close eye on the requirements too.
  4. Regarding updating/downgrading, it might seem as simple as just running the setup/requirements install again and you're good to go, but that's not always the case. Just last night I decided to update a single dependency for kohya_ss. Start it up, things get downloaded, it says it's updated... and no change... hmmm. I manually do the install, pip says it's already installed, I check the version in the correct folder and it seems to match. Start again and nope, still no change... Forced reinstall and still no change. Deleted the lib in question, installed it again, and finally it's actually working.
    So despite things seemingly being updated for you, it might not really be working as it should, so it might be worthwhile clearing out some/all of those python libs once in a while. The folder in question is generally ./venv/; it might cause you some downloading and waiting while it all reinstalls, so in some cases it might be enough to just delete the lib in question. (A small version-check sketch follows this list.)
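For item 4, a quick sanity check is to ask the venv's own interpreter what it actually loads instead of trusting the installer output. This is only a minimal sketch (the package names are just examples, not a required list); run it with the venv's python, e.g. ./venv/Scripts/python.exe on Windows or ./venv/bin/python on Linux:

```python
# Minimal sketch: print the version and install location of a few packages
# as seen by THIS interpreter (run it with the venv's python, not the system one).
# The package list is only an example; swap in whatever you actually updated.
import importlib.metadata as md
import importlib.util

for pkg in ("bitsandbytes", "torch", "xformers"):
    try:
        version = md.version(pkg)
        spec = importlib.util.find_spec(pkg)
        location = spec.origin if spec else "installed but not importable"
        print(f"{pkg}: {version} -> {location}")
    except md.PackageNotFoundError:
        print(f"{pkg}: not installed in this environment")
```

If the version printed here doesn't match what pip claims, that's the "updated but nothing changed" situation described above, and deleting the lib (or the whole ./venv/) and reinstalling is the blunt fix.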
A small tip for anyone that made it to the end: just because you're training one concept, it doesn't mean you need to keep all the images in the same folder and give everything the same priority. Splitting them into different folders with different repeats can be useful at times ;) (see the sketch below)
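To illustrate that tip: kohya_ss-style training scripts read the repeat count from the folder name prefix ("<repeats>_<name>"), so a layout like the sketch below makes the script sample the close-up shots three times as often as the misc ones. The concept name, folder names and counts here are made up purely for the example.

```python
# Sketch of a dataset layout where different folders get different repeat counts.
# kohya_ss-style scripts treat the number before the underscore as "repeats",
# so images in 15_mychar_closeups get sampled 3x as often as those in 5_mychar_misc.
# All names below are illustrative placeholders.
from pathlib import Path

root = Path("train_data/mychar")
for folder in ("15_mychar_closeups", "10_mychar_portraits", "5_mychar_misc"):
    (root / folder).mkdir(parents=True, exist_ok=True)

# point the trainer's train_data_dir at train_data/mychar and it will pick up
# the per-folder repeats on its own
print(*(p.name for p in sorted(root.iterdir())), sep="\n")
```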

Random image added to catch some ppl's eyes; I don't really have a stop sign or light to make ppl stop and wait, so she'll have to do.
And yes, she's looking at you...
View attachment 3015973

(Edited because formatting broke :( )
a_01677_.png

Dude, so:
- sunscreens do cause cancers rather than protect from cancers
- eating more often is highly recommended because the portions are smaller, but in fact it gives insulin resistance and leads to what doctors diagnose as type 2 diabetes
- organic food is not organic at all
- alcohol is carcinogenic even though "they" say it ain't
- central banks don't fight inflation, they cause it
- the elected government is a mere front, the actual policies are made by people who are never elected
- jet fuel doesn't melt steel beams
- Yellen just said we can afford 2 wars even though USA can't
- there are no grapes at all in Costco's $10 "wine"
- gold is not a pet rock
- [self-censoring myself on another 20 topics I could bring up off the top of my head]

and you are worked up today cause some idiots somewhere post false guides? I mean yeaa, that's the world we live in.
:cool:

---
Edit: added the pic, since fuck long posts without pics.
 
Last edited:

me3

Member
Dec 31, 2016
316
708
Dude, so:
- sunscreens do cause cancers rather than protect from cancers
- eating more often is highly recommended because the portions are smaller, but in fact it gives insulin resistance and leads to what doctors diagnose as type 2 diabetes
- organic food is not organic at all
- alcohol is carcinogenic even though "they" say it ain't
- central banks don't fight inflation, they cause it
- the elected government is a mere front, the actual policies are made by people who are never elected
- jet fuel doesn't melt steel beams
- Yellen just said we can afford 2 wars even though USA can't
- there are no grapes at all in Costco's $10 "wine"
- gold is not a pet rock
- [self-censoring myself on another 20 topics I could bring up off the top of my head]

and you are worked up today cause some idiots somewhere post false guides? I mean yeaa, that's the world we live in.
:cool:
No, I was actually trying to help ppl avoid going around looking at guide after guide after guide, but I just won't bother.
 
  • Like
Reactions: Mr-Fox

me3

Member
Dec 31, 2016
316
708
Sure, because there is clearly some hidden meaning in the sentence "the face isn't really all that close" that isn't addressed by everything we said... and that sentence also doesn't show that even if you like the images, you DO have ONE issue with them. That isn't an accusation, I'm just pointing out your own words. :rolleyes:

If you mean something other than what you said, please, do feel free to clarify, but we can only take you at your word. And your words were, and I am indeed paraphrasing from the quoted sentence above and the rest of your post, that you think "the processed images don't look enough like the source ones", which was thoroughly explained as to WHY that happens.

I'm not trying to be antagonistic (though perhaps a bit frustrated); I would in fact love to help you get the results you seem to want, because I have been doing this for about as long as the tech has been available, but I can't do that unless you are clear about what you want and ask questions without pre-assumptions such as the ones we discussed above. Saying "you just don't get me" doesn't help anyone involved.
Having asked how to try and explain this, I have no idea if it'll help or just fuel the fire...
I'd start by pointing out that "this" is not something I consider a positive; if anything I'd file it firmly in the category of failings on my part.

So, to try and illustrate:
You know those (generally) black and white images that have some kind of dual imagery; some ppl can see both, some see just one or the other.
Or the "noise" images that are meant to "pop out" some kind of 3D image if you look at them for a while; some ppl see it, some don't.
Or when you see some kind of shape in clouds etc. (pareidolia) and you point it out and others don't see it.

This is "how" i know you miss the point, you saw the words, but not what ever "it" is. And tbh you're probably better off for it as it's fucking annoying.
As i see this in a lot of AI images and only AI images. If you've ever looked at something or someone at some point and you've known "something" was off, something was different/wrong/whatever, but you can figure out "what", there's just something and you then spend the rest of the day/week failing to figure out wtf it is. Now imagine this happening going on with AI images and how often you're likely to be dealing with those when generating...It's specially happens when there's some kind of comparison involved, source image compared to generation.
Comparison grids can be a nightmare.

When I post images it isn't "look at the pretty thing I made, bask in my awesomeness"; I'm more wondering if ppl notice something, or if they actually find it good, as it gives me an idea of "where the field is". Cause I find more than enough faults with them. Probably why I prefer to post things for a laugh, i.e. the Selena corgi...
So regarding the images you claim I had issues with: the person posting them found them good enough that they passed their selection for posting, and when I see things about them that I like, it means that whatever the "wrong" I might see is, it's something else, so it's "comforting" (for lack of a better term).

This probably won't help much, and it's very much turning out to be something between horribly narcissistic-sounding and a short bus to a psych ward.
 
  • Like
Reactions: Mr-Fox

Sepheyer

Well-Known Member
Dec 21, 2020
1,582
3,806
Having asked how to try and explain this, I have no idea if it'll help or just fuel the fire...
I'd start by pointing out that "this" is not something I consider a positive; if anything I'd file it firmly in the category of failings on my part.

So, to try and illustrate:
You know those (generally) black and white images that have some kind of dual imagery; some ppl can see both, some see just one or the other.
Or the "noise" images that are meant to "pop out" some kind of 3D image if you look at them for a while; some ppl see it, some don't.
Or when you see some kind of shape in clouds etc. (pareidolia) and you point it out and others don't see it.

This is "how" I know you missed the point: you saw the words, but not whatever "it" is. And tbh you're probably better off for it, as it's fucking annoying.
I see this in a lot of AI images, and only AI images. If you've ever looked at something or someone and known "something" was off, something was different/wrong/whatever, but you couldn't figure out "what", and you then spend the rest of the day/week failing to figure out wtf it is... now imagine that going on with AI images, and how often you're likely to be dealing with those when generating. It especially happens when there's some kind of comparison involved, a source image compared to a generation.
Comparison grids can be a nightmare.

When I post images it isn't "look at the pretty thing I made, bask in my awesomeness"; I'm more wondering if ppl notice something, or if they actually find it good, as it gives me an idea of "where the field is". Cause I find more than enough faults with them. Probably why I prefer to post things for a laugh, i.e. the Selena corgi...
So regarding the images you claim I had issues with: the person posting them found them good enough that they passed their selection for posting, and when I see things about them that I like, it means that whatever the "wrong" I might see is, it's something else, so it's "comforting" (for lack of a better term).

This probably won't help much, and it's very much turning out to be something between horribly narcissistic-sounding and a short bus to a psych ward.
a_01692_.png

Bruh, you keep talking to folks in a patronizing manner, like only you can do something and others can't. No, we both got your point and we are not buying it. This happens: the evidence you submitted doesn't contain sufficient proof for the point being discussed. Why do you think others are inferior to you and can't pick up on things that you see? My entire life I have had people around me who are smarter and have more experience, and it is normal for me to doubt myself when making a claim; it would never occur to me to go: oh, you guys just don't get me.

I am pretty sure I got your point. I can pick up on a lot of things because of the path life took me on: I have a page on IMDB where I am the director of photography for a few student films. I have a great personal collection of nude art that I shot personally with professional models. I retired at 3X cause I see stock patterns coil up and was lucky to nail a few. The Italian marble in my restroom drives me horny nonstop because all I see are gorgeous nude female figures. Now, I'll give you the real flex - I've been able to see the patterns in those "stereo cards" you referred to since I turned six, e.g. a card whose hidden "content" is a 0.
Better yet, I can teach anyone willing to learn how to see these cards; it is purely mechanical.

Now, don't be like: oh you poor things, you don't get me. This is playground-level communication that no one here asked for. You are literally being patronizing/condescending for no reason at all, and I attribute it to you not having a good day.
---
Edit: I am kidding, I made all that shit up and I am actually a moron. Also added that blondie to the post.
 
Last edited:

Mr-Fox

Well-Known Member
Jan 24, 2020
1,401
3,805
Having asked how to try and explain this, I have no idea if it'll help or just fuel the fire...
I'd start by pointing out that "this" is not something I consider a positive; if anything I'd file it firmly in the category of failings on my part.

So, to try and illustrate:
You know those (generally) black and white images that have some kind of dual imagery; some ppl can see both, some see just one or the other.
Or the "noise" images that are meant to "pop out" some kind of 3D image if you look at them for a while; some ppl see it, some don't.
Or when you see some kind of shape in clouds etc. (pareidolia) and you point it out and others don't see it.

This is "how" I know you missed the point: you saw the words, but not whatever "it" is. And tbh you're probably better off for it, as it's fucking annoying.
I see this in a lot of AI images, and only AI images. If you've ever looked at something or someone and known "something" was off, something was different/wrong/whatever, but you couldn't figure out "what", and you then spend the rest of the day/week failing to figure out wtf it is... now imagine that going on with AI images, and how often you're likely to be dealing with those when generating. It especially happens when there's some kind of comparison involved, a source image compared to a generation.
Comparison grids can be a nightmare.

When I post images it isn't "look at the pretty thing I made, bask in my awesomeness"; I'm more wondering if ppl notice something, or if they actually find it good, as it gives me an idea of "where the field is". Cause I find more than enough faults with them. Probably why I prefer to post things for a laugh, i.e. the Selena corgi...
So regarding the images you claim I had issues with: the person posting them found them good enough that they passed their selection for posting, and when I see things about them that I like, it means that whatever the "wrong" I might see is, it's something else, so it's "comforting" (for lack of a better term).

This probably won't help much, and it's very much turning out to be something between horribly narcissistic-sounding and a short bus to a psych ward.
Rorschach tests, visual illusions and the uncanny valley, I assume. I'm very familiar with these; makes sense. Fine gentlemen, we're all friends here. It's ok to have a difference of opinion and it's ok to disagree. An intense but civil discussion can be very stimulating, and something good can actually come out of it. Just remember that we are indeed not in competition with each other, or enemies. So let's try our best to have our discussions in good faith, and if something is unclear, let's ask instead of assuming. I for one love the collaborative nature of this thread.
 

Mr-Fox

Well-Known Member
Jan 24, 2020
1,401
3,805
OK, test completed.
It's looking to me like, for photorealistic images of young ladies, the best is either 4xNMKDSuperScale at about 0.25 Denoising or, as the fantastic Mr-Fox recommended, 8x_NMKDFacesExtended_100000_G* at 0.25 or 0.3

You may feel differently, the gains are subtle.

Here's the X/Y plot in full resolution:



*Just trips off the tongue, don't it? :rolleyes:
Awesome work Jim.(y) Yes, the name really does trip on the tongue rather than roll..:p Since NMKD has many different versions of the same "model" of upscaler, I thought it best to include the full name to avoid confusion. The difference in the test is subtle, but it becomes clearer at higher resolution and at the later stage, if one chooses to upscale further in img2img with SD Upscale. I find that the fine details in the eyes' irises and similar details get lost with too high a denoising strength, for instance skin texture or hair strands. What I have learned so far is that you want to get as much fine detail as early in the process as possible, and then you need to preserve it in the steps after, as much as possible. A bad upscale can completely ruin a good image, while a good one can take it to the next level.
At some point diminishing returns will become a factor and eventually there is a limit to the current technology. It's a lot of fun to push the envelope though and see how far we can stretch those limits.
 

hkennereth

Member
Mar 3, 2019
239
784
Alright, you can notice that some AI images have something odd about them. Congrats, but so can everybody.

I'll just focus on this one paragraph below because... honestly, you wrote a lot but said very little, and this is my last attempt at addressing this particular subject.

So regarding the images you claim I had issues with: the person posting them found them good enough that they passed their selection for posting, and when I see things about them that I like, it means that whatever the "wrong" I might see is, it's something else, so it's "comforting" (for lack of a better term).
Buddy, I didn't claim you had issues with some images. You can tell us that in your heart of hearts you loved them all along, but you, using your own words, described in detail the issues you had with those images, which is what we have been focusing on all along. It's all right there in your first reply:

This seems to have the same "issues" as other "convert to real" methods. It converts the pose, the background is fairly alike, and the character is wearing pretty much the same clothing, hair etc. However, the face isn't really all that close.
The face your rendered character has isn't in any way an "unrealistic anime" shape or features, yet when you look at the "real" version it hasn't even kept the basic shape of the face. The face is more rounded and "shorter", the chin is different, eyes, lips; some of this could be prompting related, sure, but AI is meant to be good at reading faces (scanners/cameras etc.), yet for things like this it doesn't seem to keep even the proportions "correct", which is exactly what is used for comparing faces.
However, in the end you seem to be the one missing the point: the images that Sepheyer posted weren't shared to show off their amazing skills, or because they thought or claimed they were perfect and flawless; they were shared because this is, and I quote, a "prompt sharing and learning thread". It was to share a technique that worked at some level for them.

Were the results perfect? There's no such thing. Were they bragging? Nope. Did they claim those specific images passed some specific quality criteria? No, they were just decent demonstrations of the specific technique. Are they end-all solutions for the problem they tackle? Again, no such thing exists. The whole point of this entire exercise, from their original post to every reply we offered, was that we were hoping to help you perhaps learn something about image creation with AI, because most of the issues you described can be solved, or at least minimized, when addressing them is one's focus. Again, they were clearly not the focus of the original images, hence the replies you received pointing out that the "issues" you listed were irrelevant.

Now, instead of describing your superpowers of visual perception, you could have saved us time all along had you just said that you didn't really want to learn anything and were just offering unconstructive criticism; that way we would not have bothered trying to explain why they were not really relevant in this instance. I'm assuming that's what you meant with your entire post, but maybe I'm once again missing the subtext behind whatever it was you said there. I do that a lot, it seems, but then again I'm more of a "say what you mean and mean what you say" kind of guy...
 
Last edited:
  • Like
Reactions: DD3DD and Sepheyer

Jimwalrus

Well-Known Member
Sep 15, 2021
1,059
4,045
Awesome work Jim.(y) Yes, the name really does trip on the tongue rather than roll..:p Since NMKD has many different versions of the same "model" of upscaler, I thought it best to include the full name to avoid confusion. The difference in the test is subtle, but it becomes clearer at higher resolution and at the later stage, if one chooses to upscale further in img2img with SD Upscale. I find that the fine details in the eyes' irises and similar details get lost with too high a denoising strength, for instance skin texture or hair strands. What I have learned so far is that you want to get as much fine detail as early in the process as possible, and then you need to preserve it in the steps after, as much as possible. A bad upscale can completely ruin a good image, while a good one can take it to the next level.
At some point diminishing returns will become a factor and eventually there is a limit to the current technology. It's a lot of fun to push the envelope though and see how far we can stretch those limits.
I have noticed however that 8xNMKDFacesExtended takes about 0.6s longer per iteration than ESRGAN_4x on my 3060, adding a little to the generation time (~36s for 60 HiRes steps).
I don't think, though, that enough time would be saved across normal batch quantities to make it worth using a quicker upscaler for the initial batch and then re-running the best with a better one. Batch size would have to be in excess of 40 or so if you include the time to reset before the rerun.
 

hkennereth

Member
Mar 3, 2019
239
784
Awesome work Jim.(y) Yes, the name really does trip on the tongue rather than roll..:p Since NMKD has many different versions of the same "model" of upscaler, I thought it best to include the full name to avoid confusion. The difference in the test is subtle, but it becomes clearer at higher resolution and at the later stage, if one chooses to upscale further in img2img with SD Upscale. I find that the fine details in the eyes' irises and similar details get lost with too high a denoising strength, for instance skin texture or hair strands. What I have learned so far is that you want to get as much fine detail as early in the process as possible, and then you need to preserve it in the steps after, as much as possible. A bad upscale can completely ruin a good image, while a good one can take it to the next level.
At some point diminishing returns will become a factor and eventually there is a limit to the current technology. It's a lot of fun to push the envelope though and see how far we can stretch those limits.
I completely agree. NMKD's upscalers are really great for some things, but they can fail in other scenarios. I haven't tried this particular one yet, but so far the only one I was really happy with as a good all-rounder was Siax. Even that, I've noticed, can introduce some noise in some edge cases, so I always keep 4x-UltraSharp close by because I find it always gives decent results no matter the case.
 

me3

Member
Dec 31, 2016
316
708
Awesome work Jim.(y) Yes, the name really does trip on the tongue rather than roll..:p Since NMKD has many different versions of the same "model" of upscaler, I thought it best to include the full name to avoid confusion. The difference in the test is subtle, but it becomes clearer at higher resolution and at the later stage, if one chooses to upscale further in img2img with SD Upscale. I find that the fine details in the eyes' irises and similar details get lost with too high a denoising strength, for instance skin texture or hair strands. What I have learned so far is that you want to get as much fine detail as early in the process as possible, and then you need to preserve it in the steps after, as much as possible. A bad upscale can completely ruin a good image, while a good one can take it to the next level.
At some point diminishing returns will become a factor and eventually there is a limit to the current technology. It's a lot of fun to push the envelope though and see how far we can stretch those limits.
I have noticed however that 8xNMKDFacesExtended takes about 0.6s longer per iteration than ESRGAN_4x on my 3060, adding a little to the generation time (~36s for 60 HiRes steps).
I don't think, though, that enough time would be saved across normal batch quantities to make it worth using a quicker upscaler for the initial batch and then re-running the best with a better one. Batch size would have to be in excess of 40 or so if you include the time to reset before the rerun.
Given the 2x, 4x and 8x naming pattern used, is this a reference to a recommended upscaling amount?
For a 2x upscale, would a 4x version be better suited than an 8x?

I can't run large highres upscales to test myself, so I've not done much with it.
 

hkennereth

Member
Mar 3, 2019
239
784
Given the 2x, 4x and 8x naming pattern used, is this a reference to a recommended upscaling amount?
For a 2x upscale, would a 4x version be better suited than an 8x?

I can't run large highres upscales to test myself, so I've not done much with it.
That's correct. If you run the upscaler without sizing parameters (I'm not sure that's possible on A1111, but it's the default behavior on ComfyUI), that value is how much the image will be upscaled.
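A small illustration of the size bookkeeping: the model itself always multiplies the pixel dimensions by its native factor, so landing on a different final scale means resizing afterwards (which is roughly what the UIs do for you when you give them a target size). This is only a sketch with Pillow; run_upscale_model below is a plain resize standing in for whatever actually runs the ESRGAN-style model.

```python
# Sketch: a "4x"/"8x" upscale model always returns native_factor times the input size;
# to end up at e.g. 2x you resize its output back down to the size you actually wanted.
from PIL import Image

def run_upscale_model(img: Image.Image, native_factor: int) -> Image.Image:
    # Stand-in only: a real model adds detail, a plain resize just mimics the size change.
    return img.resize((img.width * native_factor, img.height * native_factor), Image.LANCZOS)

def upscale_to(img: Image.Image, native_factor: int, wanted_factor: float) -> Image.Image:
    out = run_upscale_model(img, native_factor)                  # e.g. 512x512 -> 4096x4096 with an 8x model
    target = (round(img.width * wanted_factor), round(img.height * wanted_factor))
    return out.resize(target, Image.LANCZOS)                     # back down to 1024x1024 for a 2x result

base = Image.new("RGB", (512, 512))
print(upscale_to(base, native_factor=8, wanted_factor=2).size)   # (1024, 1024)
```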
 

me3

Member
Dec 31, 2016
316
708
Since some new upscalers were mentioned I wanted to test them, and it was a nice reminder that highres testing is not for me :(
2x upscaling a 512x512 image took just 3 min, a 768x768 took 13 min, and it overflowed into shared memory by 11GB on the final step/percentage of generating.
3x upscaling on 512x512 took 12.5 min... so yeah, not really sure I'm gonna bother with anything higher.
Seems it was a long wait for more than just me; even the woman in the image went old and grey.
00002-2051824061.png
On a more serious note, look at her hair a bit below her chin. At first I thought something was wrong with the upscaling, but I've included a non-upscaled version I generated afterwards and it's there too.
The third image is the 2x upscale with a different seed, and the same type of "framing" is there too, on the left side.

00004-2051824061.png 00000-2518891223.png

Fault in model data?

(Edit: added an enlarged section in case it's difficult to see on the forum)
resized.png
 
Last edited:

hkennereth

Member
Mar 3, 2019
239
784
Since some new upscalers were mentioned I wanted to test them, and it was a nice reminder that highres testing is not for me :(
2x upscaling a 512x512 image took just 3 min, a 768x768 took 13 min, and it overflowed into shared memory by 11GB on the final step/percentage of generating.
3x upscaling on 512x512 took 12.5 min... so yeah, not really sure I'm gonna bother with anything higher.
Seems it was a long wait for more than just me; even the woman in the image went old and grey.
View attachment 3016726
On a more serious note, look at her hair a bit below her chin. At first I thought something was wrong with the upscaling, but I've included a non-upscaled version I generated afterwards and it's there too.
The third image is the 2x upscale with a different seed, and the same type of "framing" is there too, on the left side.

View attachment 3016750 View attachment 3016751

Fault in model data?

(Edit: added an enlarged section in case it's difficult to see on the forum)
View attachment 3016947
How exactly are you upscaling that? I believe you use A1111, correct? Are you using the Extras tab, or is this using something like the SD Upscale or SD Ultimate Upscale scripts in the img2img tab?

I ask because this, in addition to your render times, looks like an img2img render with one of those scripts that got interrupted mid-process. A simple upscale using one of those upscalers in the Extras tab should not take more than a few seconds to complete, even on a low-to-mid range GPU.
 
  • Like
Reactions: Mr-Fox

Mr-Fox

Well-Known Member
Jan 24, 2020
1,401
3,805
Since some new upscalers were mentioned I wanted to test them, and it was a nice reminder that highres testing is not for me :(
2x upscaling a 512x512 image took just 3 min, a 768x768 took 13 min, and it overflowed into shared memory by 11GB on the final step/percentage of generating.
3x upscaling on 512x512 took 12.5 min... so yeah, not really sure I'm gonna bother with anything higher.
Seems it was a long wait for more than just me; even the woman in the image went old and grey.
View attachment 3016726
On a more serious note, look at her hair a bit below her chin. At first I thought something was wrong with the upscaling, but I've included a non-upscaled version I generated afterwards and it's there too.
The third image is the 2x upscale with a different seed, and the same type of "framing" is there too, on the left side.

View attachment 3016750 View attachment 3016751

Fault in model data?

(Edit: added an enlarged section in case it's difficult to see on the forum)
View attachment 3016947
Yes, hiresfix is notorious for being slow; it's the hires steps that take time. This script or extension can also be at fault for memory leaks etc.
The more steps the better, to a point, but it also takes longer. Most of the time though it works well enough for a mere peasant pleb like me with a 1070 8GB card. I typically use 640x960 for portrait ratio images and then upscale by 2x with hiresfix, since my rig can't go any further. This gives me 1280x1920 and it's plenty. The rest is up to the ckpt and prompt in combination with anything else such as controlnet, loras and all the other fun toys.
If this is not enough, then the problem is not with SD. You can ofc go further and use SD Upscale, or go straight to Photoshop for those final touches. An alternative route is ofc to skip hiresfix and only use SD Upscale. I would treat the sample steps in img2img as the hires steps and set them to 2x the sample steps in txt2img. The denoising strength is probably specific to each use case and needs testing to find the right value, depending on all the variables such as style choice, ckpt, prompt, loras etc. Something else to consider is that the denoising strength might be specific to each upscaler. I hadn't thought of this before, it just occurred to me. Something to test for sure.
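As a concrete version of the hiresfix route described above, here's a rough sketch against A1111's HTTP API (the webui has to be started with --api; the field names follow the /sdapi/v1/txt2img payload, so double-check them against your own version, and the prompt and values are only placeholders):

```python
# Sketch: txt2img with hires fix via A1111's --api endpoint (assumes a local webui).
# Values mirror the workflow above: 640x960 base, 2x hires to 1280x1920,
# hires steps roughly 2x the base steps, denoising strength to taste.
import requests

payload = {
    "prompt": "portrait photo of a woman, detailed skin texture",  # placeholder prompt
    "width": 640,
    "height": 960,
    "steps": 30,
    "enable_hr": True,
    "hr_scale": 2,
    "hr_upscaler": "8x_NMKDFacesExtended_100000_G",  # whichever upscaler you have installed
    "hr_second_pass_steps": 60,
    "denoising_strength": 0.3,
}
resp = requests.post("http://127.0.0.1:7860/sdapi/v1/txt2img", json=payload, timeout=600)
resp.raise_for_status()
images_base64 = resp.json()["images"]  # base64-encoded PNGs
```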

I have no idea what the cause of that blurry line or area in your image is, sry. It's very late here so I can't attempt any troubleshooting until tomorrow.
 
Last edited:

me3

Member
Dec 31, 2016
316
708
How exactly are you upscaling that? I believe you use A1111, correct? Are you using the Extras tab, or is this using something like the SD Upscale or SD Ultimate Upscale scripts in the img2img tab?

I ask because this, in addition to your render times, looks like an img2img render with one of those scripts that got interrupted mid-process. A simple upscale using one of those upscalers in the Extras tab should not take more than a few seconds to complete, even on a low-to-mid range GPU.
The highres function in A1111 txt2img, and as Mr-Fox mentions in his post, it's slow and really likes eating memory.
If it had been just through img2img etc. I'd have suspected it was due to some kind of tiling, but that wouldn't account for the non-upscaled image, which has no upscaling nor highres fix; I even generated it again without the VAE to see if it was that, but it's still there.
 

hkennereth

Member
Mar 3, 2019
239
784
The highres function in A1111 txt2img, and as Mr-Fox mentions in his post, it's slow and really likes eating memory.
If it had been just through img2img etc. I'd have suspected it was due to some kind of tiling, but that wouldn't account for the non-upscaled image, which has no upscaling nor highres fix; I even generated it again without the VAE to see if it was that, but it's still there.
I find that strange because... yeah, A1111 is always pretty slow, and I don't know exactly what hardware you're running it on, but on my RTX 3060 with 8GB VRAM, which is okay but not amazing for SD, generating images with highres fix takes about 2 minutes to render, which is a lot longer than what I get on ComfyUI (more in the 30-second range), but definitely not the 10-15 minute range you're describing.

I know I must sound like a broken record, but you guys really should look into using ComfyUI. A1111 is just too poorly optimized to use comfortably with low-end hardware. Hell, even would give you a better experience, even if a more limited one.
 

Jimwalrus

Well-Known Member
Sep 15, 2021
1,059
4,045
I find that strange because... yeah, A1111 is always pretty slow, and I don't know exactly what hardware you're running it on, but on my RTX 3060 with 8GB VRAM, which is okay but not amazing for SD, generating images with highres fix takes about 2 minutes to render, which is a lot longer than what I get on ComfyUI (more in the 30-second range), but definitely not the 10-15 minute range you're describing.

I know I must sound like a broken record, but you guys really should look into using ComfyUI. A1111 is just too poorly optimized to use comfortably with low-end hardware. Hell, even would give you a better experience, even if a more limited one.
I wasn't aware there was a substantial performance improvement - Sepheyer was too busy telling us how easy it is to use (if you enjoy gazing at plates of spaghetti!) ;)

I'm experiencing a lot of issues with A1111 atm, including all kinds of errors on start-up. Also, generating an image takes about 3mins with 60 HiRes steps.

May well install it and have a go, when I get time.