If you add just one costume for one animation, then I agree with you. I am talking about, say, 5 sets of clothes for all existing animations. That would mean about 5 times as many animations. It is not difficult, since the dev could reuse most of each animation, of course, but it is repetitive and tedious.
Now check your folder: Rae already has 370MB of animations (mostly scenes, of course), plus 500MB of pictures (probably mostly events). Multiplying 370MB by 5 gives you 1.85GB, whereas the whole game is currently about 5GB.
As for scripting, there is nothing built in that tracks which outfit the girl is wearing; the game only knows "show this picture", "play that animation", etc. So at the very least you need to define a variable, check it, and then pick which set of animations to play. Suppose your scene has 3 stages, each with 5 sets of clothes: at each stage you need something like "if the outfit is this, play this animation; if the outfit is that, play that one; else play another", or you create 5 separate scenes from the start. Either way, you have to tell the game, one by one, about all 15 animations to be played. It gets tedious quickly, considering all the existing animations.
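To make the branching concrete, here is a rough Python sketch. It is not the game's actual scripting, just an illustration: the scene name, the function names, and every outfit name other than "ordinary" and "apron" are made up. It only shows that 3 stages times 5 outfits means 15 animations that the script has to name explicitly.

```python
# Illustrative only: all names here are hypothetical placeholders,
# not the game's real variables, files or engine commands.
OUTFITS = ["ordinary", "apron", "outfit_3", "outfit_4", "outfit_5"]
STAGES = [1, 2, 3]


def animation_name(scene, stage, outfit):
    """Build the (made-up) file name for one stage/outfit combination."""
    return f"{scene}_stage{stage}_{outfit}"


def branch_table(scene):
    """Every (stage, outfit) pair the script has to handle for one scene."""
    return {
        (stage, outfit): animation_name(scene, stage, outfit)
        for stage in STAGES
        for outfit in OUTFITS
    }


def play_scene(scene, current_outfit):
    """Pick, stage by stage, the animation matching the outfit variable."""
    table = branch_table(scene)
    return [table[(stage, current_outfit)] for stage in STAGES]


if __name__ == "__main__":
    print(len(branch_table("some_scene")))  # 15 entries for a single scene
    print(play_scene("some_scene", "apron"))
```

In a real event script there is usually no loop like this to hide behind, so each of those 15 entries ends up as its own hand-written branch, and that repeats for every existing scene.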
This is not even the most annoying part. For a high-quality game like this, you naturally want immersion: you come across a girl in some random outfit and have fun with her in that outfit. Then you also want the animations, conversations and everything else to fit her clothes...
You can compare the Rae-in-apron scene with the same scene for Rae in ordinary clothes. The dev basically wrote two entirely different scenes, with only a few lines of conversation in common.