Real-time Translation with AI and Python

Excedrinboy

Newbie
Oct 16, 2019
30
56
(originally posted this in a different section of the forum, but I think this is probably a better location for it)

I downloaded a game, not realizing it was in Japanese, and it got me thinking, "Can't I just write a program that can translate what's on my screen in real-time?"

So, I gave it a whirl. It uses an AI vision model to "see" the screen and separate translation services to translate the text, with a bit of crappy programming to display the result.
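For anyone curious what that pipeline looks like in outline, here's a minimal sketch. It is not the actual program: `capture`, `read_text`, `translate`, and `show` are hypothetical placeholders for whatever screenshot tool, vision model, translation service, and display window you plug in, and the refresh interval is just a parameter.

```python
import time
from typing import Callable, Optional


def run_translation_loop(
    capture: Callable[[], bytes],       # hypothetical: grabs a screenshot
    read_text: Callable[[bytes], str],  # hypothetical: vision model / OCR step
    translate: Callable[[str], str],    # hypothetical: translation service call
    show: Callable[[str], None],        # hypothetical: updates the display window
    interval_seconds: float = 2.0,
    max_cycles: Optional[int] = None,
) -> int:
    """Repeat capture -> read -> translate -> show; return the cycles run."""
    cycles = 0
    while max_cycles is None or cycles < max_cycles:
        image = capture()
        text = read_text(image).strip()
        if text:
            # Only call the translator when something was actually read.
            show(translate(text))
        cycles += 1
        # Sleep between cycles, but not after the final one.
        if max_cycles is None or cycles < max_cycles:
            time.sleep(interval_seconds)
    return cycles
```

Passing the four steps in as plain functions keeps the loop independent of any particular service, so each piece can be swapped or tested with a stub.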

The translations are ROUGH, but you'll be able to navigate through menus and translate text that's baked into pictures on the screen, which is helpful. There is noticeable latency, but once you get into kind of a "groove" between it "looking" at the screen and your clicking, you can reduce it.

This is more a proof of concept than anything else. It isn't pretty. I just wanted to see if it was possible, and I had a little time this morning to mess with it. Definitely doable. Someone with more skills should look into it, if it hasn't been done yet.

If someone HAS already done this, please let me know in the comments so I can use it instead of my janky solution.
Untitled.jpg

(very much not sfw, forgive the excessive boobage, we all have our thing)
 

Excedrinboy

Newbie
Oct 16, 2019
30
56
Improved the translation quality. Still has some weird glitches, but it's better.


Things I want to improve:

1. Formatting - Right now the text is simply displayed as one paragraph rather than in the layout it has on screen. This can make menus confusing to read. Ideally, I'd like it to mimic the formatting of the on-screen text.

2. Getting and sticking with a translation - The AI currently refreshes the image it's looking at every 2 seconds so it can catch text that advances without pausing, such as during a cutscene. However, it also retranslates the text on every refresh, so the wording sometimes changes slightly between cycles. I need to code in a check so that if it sees the same text on screen, it doesn't retranslate it. If text is slowly appearing on screen, it may also start translating midway through. I wonder if there's a way for it to check for punctuation and, if it doesn't see any, skip that cycle? Though what if the text deliberately drops punctuation for dramatic effect? Hm.

3. Better translations - I've improved this drastically already, but it still needs work. Sometimes translations improve with each refresh; other times they get worse or stay strange. I'll need to program in some sort of coherency check. I've already coded a rough version of this: before, it didn't know what to do when NO text was on screen and ended up reading things as random letters or symbols. I was able to fix that, but it needs all-around improvement.

4. Aesthetics - Really last on my priorities, but the window the translations are displayed in is super clunky and ugly, just a means to an end. Ideally, I'd like it to be resizable and to let me change the font, font size, etc. At the moment, my main concern is getting the translation and formatting working properly.

Untitled.png
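The checks from items 2 and 3 could be sketched roughly like this. Everything here is an assumption rather than the code I'm actually running: the `TranslationGate` name, the 0.5 Japanese-character threshold, the list of sentence endings, and the "translate anyway if unpunctuated text stays the same for two cycles" rule (one possible answer to the dramatic-effect problem). `translate` is a hypothetical stand-in for the real translation call.

```python
from typing import Callable, Dict, Optional

# Assumed set of characters that mark a finished line of dialogue.
SENTENCE_ENDINGS = ("。", "！", "？", "!", "?", "…", "」", "』")


def normalize(text: str) -> str:
    """Collapse whitespace so small spacing differences compare equal."""
    return " ".join(text.split())


def japanese_ratio(text: str) -> float:
    """Fraction of non-space characters that are kana or common kanji."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    hits = sum(
        1
        for c in chars
        if "\u3040" <= c <= "\u30ff"  # hiragana and katakana blocks
        or "\u4e00" <= c <= "\u9fff"  # CJK unified ideographs
    )
    return hits / len(chars)


class TranslationGate:
    """Decide per refresh whether to translate, reuse, or skip."""

    def __init__(
        self,
        translate: Callable[[str], str],  # hypothetical translation call
        min_japanese_ratio: float = 0.5,  # assumed threshold
        stable_cycles_needed: int = 2,    # assumed patience for no punctuation
    ) -> None:
        self._translate = translate
        self._min_ratio = min_japanese_ratio
        self._stable_needed = stable_cycles_needed
        self._cache: Dict[str, str] = {}  # unbounded; fine for a sketch
        self._pending: Optional[str] = None
        self._stable_cycles = 0

    def process(self, ocr_text: str) -> Optional[str]:
        """Return a translation, or None if this cycle should be skipped."""
        text = normalize(ocr_text)
        # Coherency check: empty or mostly-symbol output means no real text.
        if japanese_ratio(text) < self._min_ratio:
            return None
        # Same text as before: stick with the earlier translation.
        if text in self._cache:
            return self._cache[text]
        # No closing punctuation: the text may still be appearing, so wait
        # until it has stayed unchanged for enough cycles.
        if not text.endswith(SENTENCE_ENDINGS):
            if text == self._pending:
                self._stable_cycles += 1
            else:
                self._pending, self._stable_cycles = text, 1
            if self._stable_cycles < self._stable_needed:
                return None
        self._pending, self._stable_cycles = None, 0
        result = self._translate(text)
        self._cache[text] = result
        return result
```

Calling `gate.process(ocr_text)` once per refresh and only updating the window when it returns something other than `None` would keep a translation from flickering between refreshes. Exact-match caching won't help if the OCR output itself wobbles by a character or two, so some fuzzy matching would probably be needed on top.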