To give a bit of an update on the SLR situation: I've basically restarted the entire project. Not to change the output, but to make it user-friendly and have it eat far fewer resources, because the old version had completely inflated resource demands and tons of unnecessary parts.
For example, I had the Sugoi neural network loaded while parsing the game files, which is completely pointless, since it doesn't even do anything at that stage yet.
There were also tons of old, redundant failsafes and optional features that just don't matter when you only work on RPGM.
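The fix for the "model loaded during parsing" problem is basically lazy initialization. A minimal sketch of the idea in Python, with a hypothetical stand-in loader (this is not the real Sugoi API, just the pattern):

```python
# Sketch: defer a heavy model load until it is actually needed,
# instead of paying for it up front while parsing game files.
# `loader` is a hypothetical placeholder for the expensive load step.

class LazyTranslator:
    def __init__(self, loader):
        self._loader = loader  # function that does the expensive load
        self._model = None     # nothing loaded yet

    @property
    def model(self):
        # Load on first access only; the parsing phase never pays this cost.
        if self._model is None:
            self._model = self._loader()
        return self._model

    def translate(self, text):
        return self.model(text)

# Parsing can create the translator for free; the loader only runs
# the first time translate() is actually called.
translator = LazyTranslator(lambda: (lambda s: s.upper()))  # stand-in loader
print(translator.translate("hello"))  # -> "HELLO"
```

With this shape, a run that only parses and tags cells never touches the neural network at all.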
I've also started to integrate it into an old, gutted version of Translator++. It doesn't have much in common with the actual Translator++ anymore (I stripped it completely and changed the parsers, etc.), but this way people are familiar with what they're looking at, instead of just staring at spooky command-line stuff and raw code. It also lets you use different translators if you don't like Sugoi for some reason. (But it will still only work for Japanese to English.)
Also the Translator++ cell color system should really help people understand what the heck they're looking at.
For example right now I've made it so that all normal dialogue is tagged white,
every successfully detected and formatted script is green,
everything that either doesn't show up in-game or is pointless to translate is blue,
every script that has been detected but may need additional formatting is yellow,
and everything that could not be assigned or is harmful is red.
That way, someone who has no clue what they're doing can simply process just the white and green cells. While that means there will be untranslated bits, at least for MV/MZ there's pretty much no room for error. (Much less risk than using the actual Translator++ workflow.)
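The tagging scheme above boils down to a classifier plus a safety filter. Here's a rough Python sketch of the idea; the field names and rules are made up for illustration, not the actual detection logic:

```python
# Hypothetical sketch of the cell-color system: classify each cell,
# then let a cautious user process only the white/green ones.

SAFE_TAGS = {"white", "green"}

def tag_cell(cell):
    # `cell` is a dict with illustrative fields, not the real data model.
    if cell.get("harmful") or cell.get("unassigned"):
        return "red"     # could not be assigned, or dangerous to touch
    if cell.get("invisible_in_game") or cell.get("pointless"):
        return "blue"    # never shown in-game / not worth translating
    if cell.get("is_script"):
        # detected script: green if already formatted, yellow if it
        # may still need additional formatting
        return "green" if cell.get("formatted") else "yellow"
    return "white"       # normal dialogue

def safe_cells(cells):
    # Only the cells a novice can process with near-zero risk.
    return [c for c in cells if tag_cell(c) in SAFE_TAGS]

cells = [
    {"text": "Hello!"},                      # white
    {"is_script": True, "formatted": True},  # green
    {"is_script": True},                     # yellow
    {"pointless": True},                     # blue
    {"unassigned": True},                    # red
]
print([tag_cell(c) for c in cells])
# -> ['white', 'green', 'yellow', 'blue', 'red']
print(len(safe_cells(cells)))  # -> 2
```

The point of ordering the checks red → blue → script → white is that the dangerous and pointless cases are ruled out before anything gets marked safe.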
It's obviously still very rough and error-prone compared to my original system, and there's a limit to optimization (it will never be a viable option for a weak laptop), but I just don't see a normal human being using my original bloated shit.
The OCRSMTL stuff is basically pointless to pursue further. It was fun making it, but it takes an insane rig to use and the result isn't better than Google Lens, so there's just no real benefit to using it. You can just bot that service instead.