Tool A simple translator using a pre-trained AI model and PyQt.

wsnlndr · Jun 19, 2024

I have been playing around with ChatGPT to find out how to run a translation model at home, without having to resort to online translators.

What I have discovered, and it is not surprising, is that making an AI model work quickly takes powerful hardware.

In this case, the tool is an interface for translating text from English to Spanish.
You can always swap in a model for a different source and target language, or even use one large model that covers many language pairs.
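
As a rough illustration of the idea (this is not the code from the post), a translator of this kind could be sketched with the `transformers` pipeline. The model name `Helsinki-NLP/opus-mt-en-es` and the helper names `load_translator` and `translate_lines` are assumptions made for the example; any other language pair can be swapped in.

```python
from typing import Callable, Iterable, List


def load_translator(model_name: str = "Helsinki-NLP/opus-mt-en-es") -> Callable[[str], str]:
    """Return a function that translates a single string.

    The model name is an assumption for this sketch; use whichever
    language pair you have downloaded. transformers is imported lazily
    so the rest of the module can be used without it.
    """
    from transformers import pipeline  # heavy import, needs torch installed

    translator = pipeline("translation", model=model_name)

    def translate(text: str) -> str:
        # The pipeline returns a list with one dict per input string.
        return translator(text)[0]["translation_text"]

    return translate


def translate_lines(lines: Iterable[str], translate: Callable[[str], str]) -> List[str]:
    """Translate every non-empty line and leave blank lines untouched."""
    return [translate(line) if line.strip() else line for line in lines]
```

With the packages installed, `translate_lines(["Hello, how are you?"], load_translator())` would run the model on that one line. Passing the translate function in as a parameter keeps the line handling separate from the model, so a different model only means a different `load_translator` call.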

Ideally you use the GPU's CUDA cores to speed up the process, since the CPU is usually not the best fit for this task. If your graphics card is not powerful, however, even translating a single line of text will be slow. With more than one graphics card, the process can be accelerated significantly with the appropriate code adjustments.
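
A minimal sketch of how a script might choose between GPU and CPU, assuming PyTorch is the backend. The function names `pick_device` and `gpu_count` are made up for the example, and the code falls back to the CPU when torch is not installed.

```python
def pick_device() -> str:
    """Return "cuda" when PyTorch can see a CUDA GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # Without torch there is nothing to run on the GPU anyway.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


def gpu_count() -> int:
    """Return how many CUDA devices PyTorch reports (0 if none or no torch)."""
    try:
        import torch
    except ImportError:
        return 0
    return torch.cuda.device_count()


if __name__ == "__main__":
    print(f"Using device: {pick_device()} ({gpu_count()} CUDA GPU(s) found)")
```

The returned string can be passed to `model.to(device)` when loading a PyTorch model. Spreading the work over several cards needs extra code that is not shown here.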

The models can be downloaded for free.

The interface is as simple as possible, using PyQt6 on GNU/Linux.
Ideally the translation process would be automated so that it can be applied to batches of files.


wsnlndr · Jun 19, 2024

To use a translator like this for Ren'Py or other types of projects, it may be worth implementing rules to avoid unpleasant surprises when the translated text is processed, and also improving the interface with more configuration options.

Example of filtering lines of text by rules:

Python:

def apply_rules(words, rules):
    total_words = 0
    total_characters = 0

    # Count the total number of words and characters before applying the rules
    for rule in rules:
        if rule == "Count Words":
            total_words = len(" ".join(words).split())
        if rule == "Count Characters":
            total_characters = len("".join(words))

    if total_words > 0:
        print(f'Total number of words: {total_words}')
    
    if total_characters > 0:
        print(f'Total number of characters: {total_characters}')

    processed_words = []

    for word in words:
        for rule in rules:
            if rule.startswith('Remove Spaces'):
                word = word.replace(' ', '')
            elif rule.startswith('Convert to Lowercase'):
                word = word.lower()
            elif rule.startswith('Convert to Uppercase'):
                word = word.upper()
            elif rule.startswith('Remove Punctuation'):
                if '(' in rule and ')' in rule:
                    parameter = rule[rule.index('(') + 1:rule.index(')')]
                    word = remove_punctuation(word, parameter)
            elif rule.startswith('Remove Final Period'):
                word = word.rstrip('.')
            elif rule.startswith('Remove Empty Lines'):
                if not word.strip():
                    word = ''
            elif rule.startswith('Remove Numbers'):
                word = ''.join(filter(lambda x: not x.isdigit(), word))
            elif rule.startswith('Replace Characters'):
                if '(' in rule and ')' in rule:
                    parameters = rule[rule.index('(') + 1:rule.index(')')].split(',')
                    if len(parameters) == 2:
                        word = word.replace(parameters[0].strip(), parameters[1].strip())
            elif rule.startswith('Remove Short Words'):
                if '(' in rule and ')' in rule:
                    length = int(rule[rule.index('(') + 1:rule.index(')')])
                    if len(word) < length:
                        word = ''
            elif rule.startswith('Remove Long Words'):
                if '(' in rule and ')' in rule:
                    length = int(rule[rule.index('(') + 1:rule.index(')')])
                    if len(word) > length:
                        word = ''
            elif rule.startswith('Remove Specific Words'):
                if '(' in rule and ')' in rule:
                    specific_words = rule[rule.index('(') + 1:rule.index(')')].split(',')
                    if word in specific_words:
                        word = ''
            elif rule.startswith('Keep Only Letters'):
                word = ''.join(filter(lambda x: x.isalpha(), word))
            elif rule.startswith('Reverse Text'):
                word = word[::-1]
            elif rule.startswith('Sort Alphabetically'):
                word = ''.join(sorted(word))
            elif rule.startswith('Add Prefix'):
                if '(' in rule and ')' in rule:
                    prefix = rule[rule.index('(') + 1:rule.index(')')]
                    word = f'{prefix}{word}'
            elif rule.startswith('Add Suffix'):
                if '(' in rule and ')' in rule:
                    suffix = rule[rule.index('(') + 1:rule.index(')')]
                    word = f'{word}{suffix}'
            elif rule.startswith('Remove Words with Special Characters'):
                word = ''.join(filter(lambda x: x.isalnum(), word))
            elif rule.startswith('Capitalize Words'):
                word = word.capitalize()

        if word:
            processed_words.append(word)

    # Return the filtered list so the caller can use the result
    return processed_words
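
The function above calls `remove_punctuation`, which is not shown in the post. A minimal sketch of such a helper, assuming the rule parameter is simply the set of characters to strip (for example `Remove Punctuation(,!)`), could be:

```python
import string


def remove_punctuation(word: str, characters: str = string.punctuation) -> str:
    """Remove every character listed in `characters` from `word`.

    Assumption: the rule parameter is a plain string of characters to
    delete. With no parameter, all ASCII punctuation is removed.
    """
    # str.maketrans with a third argument maps those characters to None.
    table = str.maketrans("", "", characters)
    return word.translate(table)
```

Defined before `apply_rules`, this would let a call such as `apply_rules(["Hello, world!"], ["Remove Punctuation(,!)"])` run without a NameError.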

wsnlndr · Jun 23, 2024

Some parameters have been added here to fine-tune the model's behavior a bit. There are probably better ways to do this,
either by training the model on a larger amount of data or by configuring its parameters for heavier work,
but I'll leave that to whoever has better hardware.


wsnlndr · Jul 14, 2024

And here is an updated version of the simple translator, now called Translateador de la pradera to pofesioná, V 0.0003.

To use the program you need to install these packages with pip:
huggingface-hub
transformers
langdetect
pyqt6
torch

Installing the model also pulls in some extra modules that are needed to work with CUDA. Some of them are a little heavy, but they are necessary.
In this case an Nvidia card is being used. I don't know whether the same Python modules or different ones are used with AMD; I haven't investigated that, sorry!

Each model takes up more or less space on the hard drive. In my case, I seem to remember that the model I am using weighs about 315 MB. There are much heavier models, but those require a lot of GPU power and are usually run in data centers.

This program uses the GPU: the more powerful the graphics card, the faster the program will work. You can also use the CPU, but unless you have a Threadripper, I would use the GPU and its CUDA cores.

Monitor the temperature of your GPU!
This is a recommendation from people who regularly work with AI. If they say it, there must be a reason, right?
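
One way to keep an eye on the temperature from Python, sketched under the assumption that an Nvidia card and the `nvidia-smi` tool are installed. The function name `gpu_temperatures` is made up for the example, and it returns `None` when the tool is missing or fails.

```python
import shutil
import subprocess
from typing import List, Optional


def gpu_temperatures() -> Optional[List[int]]:
    """Return one temperature in °C per Nvidia GPU, or None if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return None  # no Nvidia driver tools on this machine
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=temperature.gpu",
             "--format=csv,noheader,nounits"],
            capture_output=True, text=True, timeout=10,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    if result.returncode != 0:
        return None
    # nvidia-smi prints one number per line, one line per GPU.
    return [int(line) for line in result.stdout.split() if line.isdigit()]


if __name__ == "__main__":
    print(gpu_temperatures())
```

Calling this between batches of lines would make it possible to pause the translation when the card gets too hot. The threshold is left to the reader.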

Once you have installed the model and all the necessary modules correctly, you can run the translation program. Keep in mind that it is not fast, at least not with my GTX 1070. With a 3080 or higher it may not be so slow, but this program still cannot be compared with the speed of translators like the one from Paloslios_official.

This program does not use cloud services or online translators, so you do not need any tokens or anything similar. Once everything is installed, the program works perfectly without an internet connection.
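
If the Hugging Face libraries are used to load the model, they can be told explicitly not to contact the network. A small sketch, assuming the model is already in the local cache; the function name `force_offline_mode` is made up for the example.

```python
import os


def force_offline_mode() -> None:
    """Ask huggingface_hub and transformers to use only local files.

    Call this before importing transformers, because the libraries
    read these environment variables when they are imported.
    """
    os.environ["HF_HUB_OFFLINE"] = "1"
    os.environ["TRANSFORMERS_OFFLINE"] = "1"


if __name__ == "__main__":
    force_offline_mode()
    print("Offline mode requested for Hugging Face libraries")
```

With these variables set, loading a model that is not in the local cache fails immediately instead of attempting a download.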

There are a series of buttons, which I will explain to clear up any doubts:

Select folder -> tries to translate all rpy files within a folder.
Load *.rpy -> translates only the rpy file that we select.
Translate -> starts the translation.
Stop translation -> stops the translation.
Close -> closes the program.
Detect -> detects the rpy, rpa and rpyc files in a folder and shows them in the list. This function is not very useful, I think.
Unzip -> uses the system decompressor (GNU/Linux) to decompress a game.
Delete-RpyC -> deletes the selected rpyc files.
UnRpa -> uses unrpa, which must be installed beforehand from pip.
UnRpyC -> uses unrpyc, which must be installed beforehand from its repository.
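
What the Detect button does could be sketched roughly like this. It is not the program's actual code: the function name `detect_renpy_files` and the choice to search subfolders recursively are assumptions.

```python
from pathlib import Path
from typing import Dict, List

# File types the Detect button is described as listing.
RENPY_EXTENSIONS = (".rpy", ".rpyc", ".rpa")


def detect_renpy_files(folder: str) -> Dict[str, List[Path]]:
    """Group the Ren'Py files found under `folder` by extension."""
    found: Dict[str, List[Path]] = {ext: [] for ext in RENPY_EXTENSIONS}
    # rglob walks the folder and all of its subfolders.
    for path in sorted(Path(folder).rglob("*")):
        suffix = path.suffix.lower()
        if path.is_file() and suffix in found:
            found[suffix].append(path)
    return found
```

The list under `".rpy"` is what Select folder would iterate over, and the list under `".rpyc"` is what Delete-RpyC would offer for deletion.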

Something important to keep in mind: this program is useful for translating the rpy files extracted from games made with Ren'Py (the extractions we make with the Ren'Py SDK), and also rpy files that are focused on text rather than on game mechanics. If text and game mechanics are heavily mixed, there will be too many translation errors, enough to make the translation not worth using, so some files, such as screens, are better left untranslated. The extraction of translations "with Renpy" in this case must be done manually with the SDK, the program does not have that functionality at the moment.
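
One common source of such errors is the model altering Ren'Py markup inside the dialogue, such as `[variable]` interpolation or `{b}` text tags. A possible precaution, sketched here as an idea and not as part of the posted program, is to swap those pieces for numbered tokens before translating and put them back afterwards. The names `protect` and `restore` and the `__0__` token format are assumptions, escaped brackets like `[[` are not handled, and it is also an assumption that the model leaves the tokens intact.

```python
import re
from typing import List, Tuple

# Ren'Py interpolation like [player_name] and text tags like {b} or {/b}.
PLACEHOLDER_RE = re.compile(r"\[[^\[\]]*\]|\{[^{}]*\}")
TOKEN_RE = re.compile(r"__(\d+)__")


def protect(text: str) -> Tuple[str, List[str]]:
    """Replace Ren'Py markup with numbered tokens; return text and markup."""
    saved: List[str] = []

    def stash(match: "re.Match[str]") -> str:
        saved.append(match.group(0))
        return f"__{len(saved) - 1}__"

    return PLACEHOLDER_RE.sub(stash, text), saved


def restore(text: str, saved: List[str]) -> str:
    """Put the original markup back in place of the numbered tokens."""

    def unstash(match: "re.Match[str]") -> str:
        index = int(match.group(1))
        # Leave unknown tokens alone instead of raising an error.
        return saved[index] if index < len(saved) else match.group(0)

    return TOKEN_RE.sub(unstash, text)
```

The intended flow is `masked, saved = protect(line)`, then translate `masked`, then `restore(translated, saved)`.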

Conclusion: 99 percent of the time I use the Paloslios translator, because it is fast as thunder and effective. It does fail on rare occasions, mainly because it is a Windows program running under Wine on Linux, and because some novel and game devs sometimes don't have time to edit several files and instead do their ["magic"] in a single line, forgetting the good programming practices of the Ren'Py SDK.
(Mind you, I'm not complaining, it just happens sometimes.)

So on those rare occasions this program has let me translate my favorite novel. The resulting translation is usually not free of errors: a line that is not translated, a poorly translated line, or a line that blocks the game or novel from starting. Those I try to correct by hand.

And finally, this program can surely be improved a lot. After all, I'm not a programmer, just an amateur. If the program helps you, or you make a better implementation of it, posting the code will have been worth the time! All the best.

A screenshot of a game in which I use this program.

Here is the source code of the model installer and the translator.

I use a few models, for En-Es, Ru-Es, Pt-Es, etc.

View attachment translateador.zip

Paloslios_Official · Jul 16, 2024

wsnlndr said:
The extraction of translations "with Renpy" in this case must be done manually with the SDK, the program does not have that functionality at the moment.

Shall I give you a clue?
I'm sure you can work it out.

Bash:

E:\MyLoyalPets-1.0-SE-pc\lib\py3-windows-x86_64/python.exe E:\MyLoyalPets-1.0-SE-pc\MyLoyalPets.py --empty E:\MyLoyalPets-1.0-SE-pc translate spanish

extraer_traducciones.zip

wsnlndr · Jul 17, 2024

And here we have an extractor that so far has worked excellently. I still have to test it more thoroughly, but for now it does not disappoint.
This is Paloslios' contribution so that GNU/Linux users can make native translations.

All you have to do is run the Python script and select the base folder of the game or novel. Shortly afterwards there will be a "/game/tl/spanish" folder (or one for any other language) with the rpy files and their lines, which will later be translated by the...
Translateador de la pradera to pofesioná V-0.0003.

Translating the extracted rpy files in the tl/idioma_any/ folder is the ideal workflow for the translator based on a pre-trained model, since it was implemented with this translation mode in mind,
even though part of the code follows other paths. So now, with the extractor, some parts of the translator are probably superfluous.
In any case, there should not be any big problems using the extractor as a basis for work.

For the game or novel to use the translation we have added, we must indicate it in the code. I think it is case sensitive, so be careful with uppercase and lowercase letters.

Python:

init python:

    config.default_language = "spanish"
    config.language = "spanish"

Mil gracias Paloslios por la ayuda. == Thank you very much Paloslios for the help.

Python:

for i in range(1000):
    print("Gracias Paloslios")

Source code of the extractor:

Python:

#!/usr/bin/python
import sys
import os
import subprocess
from PyQt6.QtWidgets import QApplication, QFileDialog

def select_game_folder():
    app = QApplication(sys.argv)
    folder = QFileDialog.getExistingDirectory(None, "Select the game folder")
    if folder:
        return folder
    else:
        print("No folder selected.")
        sys.exit(1)

def find_python_executable(base_path):
    potential_dirs = ["py3-linux-x86_64", "py2-linux-i686"]
    for dir_name in potential_dirs:
        full_path = os.path.join(base_path, "lib", dir_name, "python")
        if os.path.isfile(full_path):
            return full_path
    print("Python executable not found.")
    sys.exit(1)

def find_game_script(base_path):
    py_files = [f for f in os.listdir(base_path) if f.endswith(".py")]
    if len(py_files) == 1:
        return os.path.join(base_path, py_files[0])
    elif len(py_files) > 1:
        print("Multiple .py files found. Unable to determine the correct game script.")
        sys.exit(1)
    else:
        print("No .py file found in the game folder.")
        sys.exit(1)

def main():
    game_folder = select_game_folder()
    python_executable = find_python_executable(game_folder)
    game_script = find_game_script(game_folder)

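    # Ren'Py writes the generated translation files to game/tl/<language>
    translate_folder = os.path.join(game_folder, "game", "tl", "spanish")
    os.makedirs(translate_folder, exist_ok=True)

    # Run the game's bundled Python with Ren'Py's "translate" command
    command = [
        python_executable,
        game_script,
        game_folder,
        "translate",
        "spanish"
    ]

    result = subprocess.run(command)
    if result.returncode != 0:
        print("The translate command failed.")
        sys.exit(result.returncode)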

if __name__ == "__main__":
    main()

randysum · Oct 27, 2024

The linked page has some good comparisons of LLMs: their ability to translate and their ability to be explicit.

wsnlndr · Oct 27, 2024

@randysum Thanks for the link!

The Python scripts above, for having a translator at home that works offline, are good as an emergency tool for Ren'Py games. They help if the connection is unstable for some reason, and they can also be useful as a small project to experiment with and even learn from.

But the reality is that a lot of GPU power is needed for the translation speed to be attractive. Today we want everything done in 1 second, and this method of translation fails miserably on speed (on my PC and GPU).

On the other hand, the quality of the Helsinki-NLP models is high, and each one is trained specifically to translate from one source language to one target language. If the job is done well, I prefer a ~400 MB model to a ~4.7 GB (7B) one. In my case, my GPU cries at 7B.

Of course, everything depends on how powerful the GPU in question is for offline mode.
