That's the standard for all girls. I think the issue is that the text is part of the image and as such a new image is needed for each reason. Combine this with various clothing options and it's suddenly a lot of images. This will not only make selecting the right one more complex, it will also add to the download size.
I suppose it's possible to have one image and then use the character and the text as two sprites. This would greatly cut down on the download size (greatly relative to this scene, not so much in total), but it still sounds like a pain to make, both in effort and in the time required. I would prefer it to stay the way it is and let the time be spent elsewhere. That is, unless this specific task can be outsourced to somebody who wouldn't have done anything else anyway.
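To put rough numbers on that trade-off, here is a minimal sketch. The function name and the counts (5 outfits, 6 text variants) are made up for illustration and are not taken from the game; it only compares "one full image per combination" with "one background plus separate character and text sprites".

```python
def image_counts(outfits, messages):
    """Return (baked, layered) image counts for a single scene.

    baked:   the text is part of the image, so every outfit/text
             combination needs its own full image.
    layered: one shared background, plus one character sprite per
             outfit and one text sprite per message.
    """
    baked = outfits * messages
    layered = 1 + outfits + messages
    return baked, layered


# Hypothetical numbers, purely for illustration.
baked, layered = image_counts(outfits=5, messages=6)
print(f"baked: {baked} images, layered: {layered} images")
```

The baked count grows with the product of the options while the layered count grows with their sum, so the gap widens as options are added. It says nothing about how long the sprites take to make, which is the other half of the argument.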
It should be mentioned that the added complexity will also increase the risk of bugs.
Some visual novels do have details like this working, but they are usually based on drawn, sprite-based graphics, which makes having multiple versions of one sprite trivial. HM is based on screenshots, and it takes a new screenshot to replace one single detail. For this reason those two approaches can't really be compared.
I agree with you regarding the different techniques that can be used.
However, not only does this lead to inconsistencies, but it also highlights that the technique of successive screenshots is obsolete.
Modern techniques for creating and modifying images not only allow faster work, and therefore save time, but also give much greater flexibility. Cinema and photography use green backgrounds. Video game creators use similar techniques, such as sprites that are superimposed on each other.
As for the coding, it is no more complex than superimposing the dialogues on the images. If we assume that this increases the risk of bugs, then every line of code carries the same potential risk.
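As a rough illustration of what "superimposing" means in code, here is a minimal pure-Python sketch of the standard "over" alpha blend applied to stacked layers. It is not how HM or any particular engine does it; the names `over` and `compose` and the tiny 1x2 "images" are assumptions made up for the example, and a real engine would do this step for you.

```python
def over(top, bottom):
    """Blend one RGBA pixel over another (channels are floats in 0..1)."""
    tr, tg, tb, ta = top
    br, bg, bb, ba = bottom
    out_a = ta + ba * (1 - ta)
    if out_a == 0:
        return (0.0, 0.0, 0.0, 0.0)  # both pixels fully transparent

    def blend(t, b):
        # Weight each colour by how much of it remains visible.
        return (t * ta + b * ba * (1 - ta)) / out_a

    return (blend(tr, br), blend(tg, bg), blend(tb, bb), out_a)


def compose(layers):
    """Stack equally sized layers; the first is the bottom, the last is on top."""
    result = layers[0]
    for layer in layers[1:]:
        result = [[over(t, b) for t, b in zip(top_row, bottom_row)]
                  for top_row, bottom_row in zip(layer, result)]
    return result


# Hypothetical 1x2 "images": a background, a character sprite, a text sprite.
BLUE, RED, WHITE, CLEAR = (0, 0, 1, 1), (1, 0, 0, 1), (1, 1, 1, 1), (0, 0, 0, 0)
background = [[BLUE, BLUE]]
character = [[RED, CLEAR]]   # covers only the first pixel
text = [[CLEAR, WHITE]]      # covers only the second pixel

scene = compose([background, character, text])
```

Swapping the outfit or the text then means replacing one layer in the list, not producing a new full image.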
Just out of curiosity, where would you prefer the devs to spend their time?
I know it may seem like an insignificant detail, but it's the details that make things perfect, if perfection exists.
So the mistake (imo) was made from the start. The technique used makes corrections of this kind difficult. It is the same with anything created in Photoshop, GIMP and the like: in the original format it is easy to modify or swap a layer, but once you convert it to JPG or PNG, everything has to be redone.