I really like the renders you are currently producing; however, the style you are achieving takes more time to render. The reason is the light sources you are employing (different colours / similar intensity). You often have competing light sources, so each iteration of the render produces a different result. As a result, it takes more iterations for the image to converge sufficiently.
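As a rough illustration of why competing lights cost iterations, here is a toy Python sketch. It is not how any particular renderer works internally: it assumes each iteration samples one light at random for a pixel, and it uses the standard Monte Carlo rule that noise falls with the square root of the number of samples. The function names and the light values are made up for the example.

```python
import math


def channel_std(light_values):
    """Std dev of one colour channel if each sample picks one light uniformly at random."""
    mean = sum(light_values) / len(light_values)
    variance = sum((v - mean) ** 2 for v in light_values) / len(light_values)
    return math.sqrt(variance)


def iterations_needed(sample_std, target_noise):
    """Iterations until the standard error (std / sqrt(N)) drops to target_noise."""
    return math.ceil((sample_std / target_noise) ** 2)


# Red channel contribution at one pixel (hypothetical values).
agreeing = [0.6, 0.4]   # two lights of similar colour
competing = [1.0, 0.0]  # a red light and a blue light of similar intensity

for name, lights in (("agreeing", agreeing), ("competing", competing)):
    n = iterations_needed(channel_std(lights), target_noise=0.01)
    print(f"{name}: about {n} iterations")
```

In this toy model the competing pair needs many times more iterations than the agreeing pair to reach the same noise level, which matches the behaviour described above.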
From my perspective, this is not a fault in what you are doing, just a consequence of what you are trying to achieve. One of the best renders I have done to date is in the spoiler below. Whilst the lighting is simple, I used an environment box to scatter the light and simulate dust. From memory this was about a 6-hour render at 4K and it did not come remotely close to converging (maybe 1%); however, the lights from the police car have substance, because the scattering environment turns them into visible cones.
If you are doing a small number of renders, then focus on the lighting having meaning. If a light source has a purpose, leave it in the scene; if it does not, remove it. Many of the best artistic renders do not converge, because the light paths are complex, just like in the real world. If you are looking for speed, then use an HDRI or a lighting setup similar to 3-point, so that the light sources do not compete with each other. Personally I prefer to live somewhere in the middle on my VN, where I weigh up rendering speed against artistic merit. As such, many of my images have different rendering times (iterations), and unfortunately I have to re-render a significant portion (~30%) when this doesn't work out.
PS: People have different hardware at their disposal. A more meaningful comparison is the number of iterations needed to achieve the desired result, as this is independent of hardware / time. Also in relation to iterations, better results are often achieved with far fewer iterations on a higher-resolution render than with a large number of iterations at a lower resolution.
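One possible way to see the resolution point is to count samples. The sketch below is only arithmetic under two assumptions that are not stated above: that one iteration takes roughly one sample per rendered pixel, and that the high-resolution render ends up being viewed or downscaled at a lower resolution. The function name is hypothetical.

```python
def samples_per_output_pixel(render_w, render_h, iterations, out_w, out_h):
    """Average samples landing in each output pixel after downscaling.

    Assumes one sample per rendered pixel per iteration (an assumption,
    not a property of any specific renderer).
    """
    render_pixels = render_w * render_h
    out_pixels = out_w * out_h
    return iterations * render_pixels / out_pixels


# 4K render at 500 iterations, viewed at 1080p.
high_res = samples_per_output_pixel(3840, 2160, 500, 1920, 1080)

# Native 1080p render at 1500 iterations.
low_res = samples_per_output_pixel(1920, 1080, 1500, 1920, 1080)

print(high_res, low_res)
```

Under those assumptions the 4K render collects more samples per displayed pixel despite running a third of the iterations, because each displayed pixel averages four rendered ones.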