I tried to build a tool for that at some point, but the OCR in every model I tried failed so often that using it was pretty much the same amount of work as translating manually.
So now when I'm translating picture-based text, I tend to use Yandex or Google Lens to find out what it means (or, if they fail, I actually translate the text manually, one character at a time), and then I write it on the picture using Photoshop.
It's slow, tedious, and annoying, which is why I pretty much stopped doing that for requests...