I see you haven't heard of controlnets.
I'm talking about multiple images from different angles in the same scene, if you read beyond the tldr. I'm aware that a character can remain generally consistent & that you can change to a preferred angle/pose via a reference image. I should've defined what I mean by character, as I'm not simply talking about the face. There will be slight changes in clothing, hair, etc., from different angles. It looks "similar" but not the same, which anyone scrutinizing for a few seconds can easily tell, and that makes it less than ideal for storytelling currently. If controlnets or another system now has a way to easily define each character in each outfit & hairstyle so they remain truly consistent at all angles (no extra holes in belts, slight texture changes, garment alterations, etc.), I'd honestly love to know about it.
Getting the right subtle facial expressions also seems like a tedious task in training & capturing, unless there's an easy img-to-img way to take a reference photo of the exact expression you want and apply it to the base image's character without losing the integrity of their facial structure or otherwise altering the base image. (Honestly, if there is, I'm all ears.) Most pre-trained expressions tend to be generic & lack character or subtlety.
I've no doubt much of this will be quickly improved & that there are current capabilities that are either not accessible to the general public, expensive, or require full-time dedication. But that's not useful currently to independent creators looking to pass off art to a machine so they can focus on other elements. At best it's a supplemental aid right now to save time on some portions of art creation, used selectively.