Seems like anything other than a generic background causes massive issues? I like the room being a tent, but looking at the bench in the ice cream one makes my brain hurt trying to process it.
You apparently have a severe lack of knowledge of how AI images work, but that's fine. No need for people who don't actually use the technology to know.
In Stable Diffusion and ComfyUI, using image-to-image to generate an image is mostly just futzing with the denoising level. At 1.0 it basically replaces everything in the image; at 0.0 it just copies the image exactly, no AI anything. So you have to futz with the denoising, and the AI (which isn't real AI, mind you) makes "guesses" at certain things. The central focus of the image, depending on the model (checkpoint) used, becomes a more realistic version of the 3D render provided. Everything else is an approximation that the checkpoint makes its best guess at.
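If you want to see the effect outside of ComfyUI, here's a rough sketch of the same denoising tradeoff using the diffusers library in Python. The checkpoint name, input file, prompt, and strength values are just placeholders, not what was actually used for these pictures:

```python
# Minimal img2img denoising sweep with diffusers (placeholder names/values).
import torch
from diffusers import StableDiffusionImg2ImgPipeline
from PIL import Image

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # swap in whatever checkpoint you use
    torch_dtype=torch.float16,
).to("cuda")

init_image = Image.open("luna_render.png").convert("RGB").resize((512, 512))
prompt = "photorealistic girl in a tent, soft lighting"

# "strength" is the denoising level from the post:
# near 0.0 it returns the input almost untouched, near 1.0 it throws
# nearly all of the original render away and invents its own image.
for strength in (0.3, 0.6, 0.9):
    result = pipe(prompt=prompt, image=init_image, strength=strength).images[0]
    result.save(f"img2img_strength_{strength}.png")
```

Comparing the three outputs side by side makes it obvious why backgrounds like the tent or the bench drift: the higher the denoising, the more the checkpoint substitutes its own guess for what was in the render.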
Sometimes you can use the text prompt to get details correct; an example is the ice cream cone in one of the Luna pictures. It just wouldn't get that right until I added it to the prompt. Other times, no amount of messing about will get a door to match exactly or a bench to look the same, etc.
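Continuing with the pipe and init_image from the sketch above, nudging a stubborn detail just means spelling it out in the prompt. The exact wording and negative prompt here are made up for illustration, not the ones actually used:

```python
# Spell the missing detail out in the prompt (placeholder wording).
base_prompt = "photorealistic girl in a tent, soft lighting"
detail_prompt = base_prompt + ", holding an ice cream cone"

result = pipe(
    prompt=detail_prompt,
    negative_prompt="blurry, deformed hands",  # optional, also a placeholder
    image=init_image,
    strength=0.6,  # keep denoising moderate so the rest of the render survives
).images[0]
result.save("img2img_with_detail_prompt.png")
```

That trick only goes so far, though: the prompt can push the model toward a detail it already "knows", but it can't force a specific door or bench from the render to come through unchanged.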