In theory (from the perspective of someone who knows very little about coding but a moderate amount about the other bit) there's a bit of a tradeoff here. Using full pictures makes it much, much simpler to set up the scenes in the game: you basically just make it cycle through a gallery alongside the script. The downsides are that this method makes the game's file size bigger, and you need someone with at least a basic knowledge of image editing programs to put the pictures together (which I assume is still a task nomo is handling).

On the other hand, having a bunch of modular images that are overlaid means you can keep your file size down, and new scenes can be made without any input from the artist once you have enough assets built up. The problem is that (depending on how many elements are in a scene) you're now juggling a lot more things. This is probably a non-issue for an actual programmer, but for an amateur who only knows enough to get Ren'Py working it can be pretty fiddly and time-consuming, which is why single images are what most VN-style games use.

They basically have two options for the modular scenes:
1. Just use separate full pictures to illustrate the scene, and switch to a whole different picture whenever something like a facial expression or hand placement changes. This is more or less what they have been doing all along in the h-scenes.
2. Have most of the scene as a "background" and add those small changes over it as a layer. Some of the groping pictures work like this, normal conversations and clothing options do this too, and there are a few hundred games on this site using this method for further reference.
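To put rough numbers on the difference between the two options, here is a small Python sketch. It is not how this game is actually built: the slot names, variant names and file names are all made up for illustration, and it only counts files, not bytes. Option 1 needs one finished picture per combination of changes, while option 2 needs one background plus one small overlay per change.

```python
from itertools import product

# Hypothetical scene: each "slot" is one changeable element and its variants.
# All names here are invented for the example.
SCENE_SLOTS = {
    "expression": ["neutral", "smile", "blush"],
    "hand": ["down", "raised"],
    "outfit": ["casual", "uniform"],
}

def full_picture_files(slots):
    """Option 1: one pre-rendered picture per combination of variants."""
    names = sorted(slots)
    return ["scene_" + "_".join(combo) + ".png"
            for combo in product(*(slots[n] for n in names))]

def layered_files(slots):
    """Option 2: one background plus one small overlay per variant."""
    files = ["scene_background.png"]
    for name in sorted(slots):
        files += [f"{name}_{variant}.png" for variant in slots[name]]
    return files

def compose(slots, **choices):
    """Return the draw order for one frame: background first, overlays on top."""
    stack = ["scene_background.png"]
    for name in sorted(slots):
        variant = choices[name]
        if variant not in slots[name]:
            raise ValueError(f"unknown {name} variant: {variant}")
        stack.append(f"{name}_{variant}.png")
    return stack

print(len(full_picture_files(SCENE_SLOTS)))  # 3 * 2 * 2 = 12 full pictures
print(len(layered_files(SCENE_SLOTS)))       # 1 + 3 + 2 + 2 = 8 smaller files
print(compose(SCENE_SLOTS, expression="smile", hand="raised", outfit="casual"))
```

The gap grows quickly: adding one more slot with three variants multiplies the full-picture count by three but only adds three overlay files. The `compose` step is the "juggling" part, since the coder has to keep track of which overlays are on screen at any moment. As far as I know Ren'Py has a built-in `layeredimage` statement meant for this kind of setup, so the bookkeeping would not have to be written by hand.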
In theory, once the art assets are done, the writer and the coder could make new scenes on their own and the artist could work on new stuff, so this should be a good method.