I think he renders the sex scenes in a 3D engine, and that takes him a significant amount of time to set up. I know, I know, prepared models don't take time... but placing the characters, the poses, and the background environment, I imagine, takes him a while. They also often change outfits: the girls' dresses, skirts, and mini-skirts/micro-skirts are all different, right?
Props to him for trying different camera angles instead of reusing the exact same poses and just swapping the actors. That, however, is what creates the "bloat", because you can't reuse the same "scene background" and just add a layer of actors on top of it, right?
Edit: I was thinking, if the actor renders were scaled to match the background, he could render them separately and just show them as an image layer on top of a background layer. That way he would need just one background per scene, which would greatly reduce the total size of the pictures. It's similar to how the H.264 codec avoids re-encoding the "static" parts of the image between frames; here the static part would be just the one scene background. Baking the same background into every picture, over and over, is the actual bloat data.
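
To make the layering idea concrete, here is a rough sketch in plain Python of how a transparent actor layer could be blended over a single shared background. This is only an illustration under my own assumptions, not how the game actually works: the function name composite_over and the tiny 2x2 "images" are made up, pixels are (r, g, b, a) tuples with 0-255 channels, and both layers are assumed to have the same size and camera angle.

```python
def composite_over(background, actor_layer):
    """Alpha-blend actor_layer over an opaque background of the same size.

    Both images are lists of rows; each pixel is an (r, g, b, a) tuple.
    """
    result = []
    for bg_row, fg_row in zip(background, actor_layer):
        out_row = []
        for (b_r, b_g, b_b, _), (f_r, f_g, f_b, f_a) in zip(bg_row, fg_row):
            # Integer "over" blend: actor weighted by its alpha,
            # background weighted by the remainder (rounded to nearest).
            out_row.append((
                (f_r * f_a + b_r * (255 - f_a) + 127) // 255,
                (f_g * f_a + b_g * (255 - f_a) + 127) // 255,
                (f_b * f_a + b_b * (255 - f_a) + 127) // 255,
                255,
            ))
        result.append(out_row)
    return result


# Hypothetical 2x2 example: a solid blue background stored once...
BLUE = (0, 0, 255, 255)
CLEAR = (0, 0, 0, 0)
RED = (255, 0, 0, 255)
background = [[BLUE, BLUE], [BLUE, BLUE]]

# ...and an actor layer that is transparent except for one red pixel.
actors = [[RED, CLEAR], [CLEAR, CLEAR]]

frame = composite_over(background, actors)
```

With this setup, each new picture only needs its own (mostly transparent) actor layer, while the background is stored a single time. The catch is the one already mentioned above: it only works when the camera angle stays the same for every picture that shares that background.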