Tool RPGM SLR - Offline JP to EN Translation for RPG Maker VX, VX Ace, MV, MZ, and Pictures

Shisaye

Engaged Member
Modder
Dec 29, 2017
3,510
6,216
676
Links for the models mentioned above.

static/unweighted



imatrix/weighted

I would guess that most people will get them through the LM Studio UI.
Also I mean this 8B model:

Not entirely sure what the difference from the v1 is, but I've only tested the v2.
 

Entai2965

Member
Jan 12, 2020
178
532
170
I fixed the links. There are quite a few different 8B models. v2 is apparently different from v2.0.
 

Shisaye

Engaged Member
Modder
Dec 29, 2017
3,510
6,216
676
I fixed the links. There are quite a few different 8B models. v2 is apparently different from v2.0.
No clue... :unsure: They are exactly the same size and were built from the same .

Maybe he published it as v2.0 accidentally and then just made a new post with just v2? :KEK:
 

Shisaye

Engaged Member
Modder
Dec 29, 2017
3,510
6,216
676
I've implemented a very simple fallback model option in v2.027 now.
Basically it's 3 new options to set another API URL, model name, and optional API key.
And whenever an LLM fails so badly that it would normally abort the batch translation, it will now retry the current request batch with the chosen fallback model, using the same prompts, settings, etc.
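In case it helps to picture the flow, here's a minimal sketch of that logic (the function and setting names here are made up for illustration; the actual tool code is different):
JavaScript:
// Minimal sketch of the fallback idea; names are hypothetical.
async function requestLLM(messages, apiUrl, model, apiKey) {
    const response = await fetch(apiUrl, {
        method: 'POST',
        headers: Object.assign(
            { 'Content-Type': 'application/json' },
            apiKey ? { 'Authorization': `Bearer ${apiKey}` } : {}
        ),
        body: JSON.stringify({ model: model, messages: messages })
    });
    if (!response.ok) throw new Error(`HTTP ${response.status}`);
    return response.json();
}

async function translateBatch(messages, settings) {
    try {
        // First attempt with the primary model.
        return await requestLLM(messages, settings.apiUrl, settings.model, settings.apiKey);
    } catch (err) {
        // Only when the primary fails badly enough that the batch would normally
        // abort, retry the same prompts with the fallback model.
        if (!settings.fallbackModel) throw err;
        return await requestLLM(messages, settings.fallbackUrl, settings.fallbackModel, settings.fallbackKey);
    }
}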
 

fantasmic

Member
Modder
Nov 3, 2021
499
1,254
267
And as if that wasn't enough yet, it's a VL model; it has vision. You can use it for SEP picture translations, too. (Although I wouldn't use the 8B variant for that; that probably wouldn't be a meaningful step up from just using paddle.)
I've been using Qwen2.5 7B for SEP and feel that it's a meaningful step up from just using paddle. Maybe not "you should buy a new GPU for this", but definitely "best use it if you have it." I'd expect Qwen3 8B is probably in the same boat.
 

Shisaye

Engaged Member
Modder
Dec 29, 2017
3,510
6,216
676
I've been using Qwen2.5 7B for SEP and feel that it's a meaningful step up from just using paddle. Maybe not "you should buy a new GPU for this", but definitely "best use it if you have it." I'd expect Qwen3 8B is probably in the same boat.
Since OCR stuff takes relatively few tokens and you usually only have a few pictures that need translation, I don't really see why someone shouldn't use the 32B version running on CPU, though. All that's needed for the IQ3_XS version, for example, would be 16GB of RAM. That's like $30 if you buy the Chinese stuff.
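For a rough sense of scale (just a back-of-the-envelope estimate): IQ3_XS is around 3.3 bits per weight, so 32B weights come out to roughly 32 × 3.3 / 8 ≈ 13GB, which still leaves a bit of headroom for context within 16GB of RAM.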
 

ArcanaLilith

Newbie
Jul 22, 2020
29
36
142
I'm using DSLR and trying to export the project, which results in only plugin.js being exported.
I tried exporting Actors.json by itself and this is the error I am getting.
I tried exporting both to a folder and as a zip.

 

Shisaye

Engaged Member
Modder
Dec 29, 2017
3,510
6,216
676
I'm using DSLR and trying to export the project, which results in only plugin.js being exported.
I tried exporting Actors.json by itself and this is the error I am getting.
I tried exporting both to a folder and as a zip.

On v2.027? Was the project created on the same version?
Which game are you working on?
Also send me your .trans file and the matching cache folder from www>php>cache.
 

Shisaye

Engaged Member
Modder
Dec 29, 2017
3,510
6,216
676
Having done a lot more testing and small adjustments (v2.028), I'm kinda disappointed with Qwen3 VL 32B Instruct abliterated v1 I1; there's really not a lot of difference in performance compared to Qwen3 VL 8B Instruct abliterated v2.0 I1.

The 8B model is still great for its size, but using the 32B as fallback really doesn't help much; it only manages the sections the 8B failed at around half the time, which really isn't a good ratio for a super slow last resort. It's surprisingly bad at keeping track of placeholders.

From my current results I would honestly recommend using gemma3-27b-abliterated-dpo-i1 as a fallback model instead. That's not a "great" model, but it's so different in what it fails at that the chance of it fixing qwen3 fuckups is actually pretty high.

Another, probably even better, option (if it doesn't have to be offline) would be to just use the free daily requests on OpenRouter as the fallback.
Even an account that never bought any credits gets 50 free requests a day, and since you would only need it after a complete failure, that should actually last for a translation that isn't overly large.
To use a free model you just put :free at the end, meaning the model name for DeepSeek would be:
deepseek/deepseek-chat-v3-0324:free
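If you want to sanity-check the OpenRouter side outside the tool, a request against their standard chat-completions endpoint looks roughly like this (a minimal sketch; the key and prompt are placeholders):
JavaScript:
// Minimal sketch of an OpenRouter request using a free model variant.
const apiUrl = 'https://openrouter.ai/api/v1/chat/completions';
const apiKey = 'YOUR_OPENROUTER_KEY'; // placeholder

const response = await fetch(apiUrl, {
    method: 'POST',
    headers: {
        'Content-Type': 'application/json',
        'Authorization': `Bearer ${apiKey}`
    },
    body: JSON.stringify({
        model: 'deepseek/deepseek-chat-v3-0324:free', // ":free" selects the free-tier variant
        messages: [{ role: 'user', content: 'Translate this line to English: ...' }]
    })
});
const data = await response.json();
console.log(data.choices[0].message.content);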

With 2.028 there are no "complete" failures anymore; I've changed it so that it will always accept a response, even if it's wrong, and put an error code in that cell.
That way a single error loop no longer cancels the entire batch translation, and you can just translate that broken cell manually afterwards, when you would be looking for error codes after pressing the Fix Cells button anyway.

Edit: I'm suddenly having really bad luck with Qwen3 VL while testing on a different game. I'm not sure what's going on, but it's constantly fucking up right now.
 

Shisaye

Engaged Member
Modder
Dec 29, 2017
3,510
6,216
676
Okay, finally finished a serious offline DSLR test with the new models.
This is only for someone with bad hardware. If you have a high-end rig, the tests and conclusion do not apply to you at all.

The test ran on a 9-year-old PC that was expensive back then but never received any hardware upgrades.
The test data was 250,000 characters of text containing a lot of @1 placeholders, \n[] commands, and \c[] commands.

Test 1
Primary model Qwen3 VL 8B Instruct abliterated v2 I1 (Q5_K_S)
Fallback model Gemma3 27B abliterated dpo I1 (Q4_K_S)

The translation took ~8.5 hours. Not a single complete failure.

Test 2
Primary model Qwen3 VL 4B Instruct abliterated v1 (Q8_0)
Fallback model Qwen3 VL 8B Instruct abliterated v2 (Q8_0)

The translation took ~3.5 hours. 7 cells failed completely and need manual fixing.


Observations:
Test 1 was a failure on this hardware.
The primary model actually screwed up a lot, constantly requiring the large fallback model to step in, which is why it took so long.
8.5 hours for a mediocre translation of a medium-sized game is not acceptable.
If you think about the monetary value of the power consumption, the wear, and it simply blocking the PC, you could pay for DeepSeek instead and get a faster, better translation.

Test 2 was also a failure, although a less bad one.
3.5 hours is still really long considering it had complete failures.
The primary model actually performed basically the same as the primary in the first test.
The fallback model was too big to offload onto the old GPU; as a result it was pretty slow and did not have a particularly good rate at fixing stuff, meaning on those 7 complete failures it just wasted a really long time failing.

Current Conclusion:
If you have outdated hardware, I would use the biggest abliterated Qwen3 model you can fully fit in your GPU as the primary model and then use DeepSeekV3-0324 with the free requests on OpenRouter as the fallback option.
That would be quite fast and free, you would not have a single complete failure, and if the game you are translating has a lot of \c[] commands and stuff like that, you will likely still get a significantly better translation than with SugoiV4-based SLR.

If you absolutely want to keep it offline, I would honestly just turn the fallback option off and, once it finishes, manually fix the cells with the TRANSLATIONFAILURE error code (worst case, just run normal SLR on them). That would be so much faster than trying to make a huge model do it for you.

If none of the models fit in your GPU, stick to SLR; it's not worth it.
 

ndj865

Newbie
Dec 22, 2017
58
34
243
@Shisaye
First, for Firefox: type 'about:config' into the address bar, then search for 'security.fileuri.strict_origin_policy' and set it to 'false'.
You can get it to run locally from Chrome if you start the browser from a terminal using the flag:
--allow-file-access-from-files

For Windows
start chrome --allow-file-access-from-files

For MacOS
/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome --allow-file-access-from-files

For Linux
google-chrome --allow-file-access-from-files

To start SLR Translator in a web browser, go into the 'www' folder and double-click 'trans.html'.
 

Shisaye

Engaged Member
Modder
Dec 29, 2017
3,510
6,216
676
@Shisaye
First, for Firefox: type 'about:config' into the address bar, then search for 'security.fileuri.strict_origin_policy' and set it to 'false'.
You can get it to run locally from Chrome if you start the browser from a terminal using the flag:
--allow-file-access-from-files

For Windows
start chrome --allow-file-access-from-files

For MacOS
/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome --allow-file-access-from-files

For Linux
google-chrome --allow-file-access-from-files

To start SLR Translator in a web browser, go into the 'www' folder and double-click 'trans.html'.
Huh... I had no idea it works like that. Interesting.
What is actually the benefit, though?
I mean, it's still only going to work on Windows because of all the Windows-specific packages.
 

ripno

Member
Jan 27, 2023
132
335
131
Hi Shisaye,

I’ve been experimenting with the SLR translator (version 2.029) using the DeepSeek 3.1 Terminus model from ’s build site. The usage limit is 40 requests per minute for free tier, but unlike other servers such as , I couldn’t find any daily token limit.

I tested it through and ended up consuming more than 6 million tokens in just a few hours, mainly to check which values of temperature and top_p would give the best results.

In the DSLR section of your translator, the only fields I modified were:

  • API URL =
  • Model = deepseek-ai/deepseek-v3.1-terminus
  • Temperature = 0.1
  • Request delay = 1600
  • Top K = 0
  • Top P = 0.7
  • Min P = 0
However, when I checked the console (F12), I saw these errors:

[SLRPersistentCacheHandler] No cache found for dslr.
[SLRPersistentCacheHandler] Attempting to load backup cache for dslr.
Failed to load resource: the server responded with a status of 400 ()
[DSLR] Response failed ok test. Response Error Status: 400
redGoogle is not a translator engine.

My initial guess is that the request code might be different from what the server expects.
Python:
from openai import OpenAI

client = OpenAI(
  base_url = "https://integrate.api.nvidia.com/v1",
  api_key = "$NVIDIA_API_KEY"
)

completion = client.chat.completions.create(
  model="deepseek-ai/deepseek-v3.1-terminus",
  messages=[{"role":"user","content":""}],
  temperature=0.1,
  top_p=0.7,
  max_tokens=16384,
  extra_body={"chat_template_kwargs": {"thinking":True}},
  stream=True
)

for chunk in completion:
  reasoning = getattr(chunk.choices[0].delta, "reasoning_content", None)
  if reasoning:
    print(reasoning, end="")
  if chunk.choices[0].delta.content is not None:
    print(chunk.choices[0].delta.content, end="")
I recommend signing up, verifying your account, and trying the AI directly on NVIDIA's server. That way, you should be able to reproduce the same error and confirm whether it's an issue with the request format or something else.
 

Shisaye

Engaged Member
Modder
Dec 29, 2017
3,510
6,216
676
Hi Shisaye,

I’ve been experimenting with the SLR translator (version 2.029) using the DeepSeek 3.1 Terminus model from ’s build site. The usage limit is 40 requests per minute for free tier, but unlike other servers such as , I couldn’t find any daily token limit.

I tested it through and ended up consuming more than 6 million tokens in just a few hours, mainly to check which values of temperature and top_p would give the best results.

In the DSLR section of your translator, the only fields I modified were:

  • API URL =
  • Model = deepseek-ai/deepseek-v3.1-terminus
  • Temperature = 0.1
  • Request delay = 1600
  • Top K = 0
  • Top P = 0.7
  • Min P = 0
However, when I checked the console (F12), I saw these errors:

[SLRPersistentCacheHandler] No cache found for dslr.
[SLRPersistentCacheHandler] Attempting to load backup cache for dslr.
Failed to load resource: the server responded with a status of 400 ()
[DSLR] Response failed ok test. Response Error Status: 400
redGoogle is not a translator engine.

My initial guess is that the request code might be different from what the server expects.
Code:
from openai import OpenAI

client = OpenAI(
  base_url = "https://integrate.api.nvidia.com/v1",
  api_key = "$NVIDIA_API_KEY"
)

completion = client.chat.completions.create(
  model="deepseek-ai/deepseek-v3.1-terminus",
  messages=[{"role":"user","content":""}],
  temperature=0.1,
  top_p=0.7,
  max_tokens=16384,
  extra_body={"chat_template_kwargs": {"thinking":True}},
  stream=True
)

for chunk in completion:
  reasoning = getattr(chunk.choices[0].delta, "reasoning_content", None)
  if reasoning:
    print(reasoning, end="")
  if chunk.choices[0].delta.content is not None:
    print(chunk.choices[0].delta.content, end="")
I recommend signing up, verifying your account, and trying the AI directly on NVIDIA's server. That way, you should be able to reproduce the same error and confirm whether it's an issue with the request format or something else.
I don't know what its problem could be, since it's supposedly using the same v1 format as the rest.
I tested DSLR with the official DeepSeek API, OpenRouter, and LM Studio; none of them have any issue with my request format.
Which is just a basic fetch:
JavaScript:
const response = await fetch(apiUrl, {
    method: 'POST',
    headers: Object.assign({
            'Content-Type': 'application/json'
        },
        apiKey ? {
            'Authorization': `Bearer ${apiKey}`
        } : {}
    ),
    body: JSON.stringify({
        model: activeLLM,
        messages: messages,
        temperature: llmTemp,
        top_k: llmTopK,
        top_p: llmTopP,
        min_p: llmMinP,
        repeat_penalty: llmRepeatPen,
        max_tokens: maxTokenValue
    })
});
Can't really test it since they ignore me so far.

Edit: Another question I would have concerns censorship. They state that requests are checked/filtered in some way.
Censorship is the reason I never made support for stuff like , because the chat on the official DeepSeek website is heavily censored and refuses certain requests, while the paid API version does not.
 

ripno

Member
Jan 27, 2023
132
335
131
I don't know what its problem could be, since it's supposedly using the same v1 format as the rest.
I tested DSLR with the official DeepSeek API, OpenRouter, and LM Studio; none of them have any issue with my request format.
Which is just a basic fetch:
JavaScript:
const response = await fetch(apiUrl, {
    method: 'POST',
    headers: Object.assign({
            'Content-Type': 'application/json'
        },
        apiKey ? {
            'Authorization': `Bearer ${apiKey}`
        } : {}
    ),
    body: JSON.stringify({
        model: activeLLM,
        messages: messages,
        temperature: llmTemp,
        top_k: llmTopK,
        top_p: llmTopP,
        min_p: llmMinP,
        repeat_penalty: llmRepeatPen,
        max_tokens: maxTokenValue
    })
});
Can't really test it since they ignore me so far.

Edit: Another question I would have concerns censorship. They state that requests are checked/filtered in some way.
Censorship is the reason I never made support for stuff like , because the chat on the official DeepSeek website is heavily censored and refuses certain requests, while the paid API version does not.
Hi Shisaye,

By the way, have you already verified your account on build.nvidia.com? When I signed up, I received a verification link sent to the email I registered with, and there was also a step to verify my phone number via OTP. Without completing both steps, the API key might not work properly.

Also, keep in mind that build.nvidia.com is an AI provider platform, not limited to DeepSeek. They host multiple models with different token limits — I’ll share the list of available AIs and their token limits below for reference.

I also found a possible clue from Translator++ (post #1443 by Topdod). He solved a similar 400 error by editing the openai.js addon: changing response_format from 'json_schema' to { "type": "json_object" }. After that, DeepSeek started returning translations instead of 400 errors.
Error Type : Error while fetching: Error: 400 This response_format type is unavailable now
DeepSeek isn't happy with the format? Maybe an issue with the addon then?

So, in one final effort I did some Googling to see what DeepSeek expects, closed Translator++, and edited the openai.js file (www/addons/openai/openai.js).
Ctrl+F, looked for "response_format"; it was on line 60 in Notepad++.
response_format: zodResponseFormat(z.object(getResponseSchema(texts.length)), 'json_schema'),
Edit it to
response_format: {'type': 'json_object'}

Tested a few single lines and now I get translations. What I don't get is that OpenRouter is able to use DeepSeek just fine without the edit. I gave it a quick test trying OpenRouter with another model to see if it worked: google/gemini-2.5-flash-preview-09-2025

It still translated the 2 lines I tested. But for all I know it could fuck with other models, so I'd keep 2 versions of the file just in case. This was all 100% dumb guesswork. If anybody knows more, please go ahead and tell us.

tl;dr:
open www/addons/openai/openai.js (back it up)
Ctrl+F, look for "response_format"; it was on line 60 in Notepad++
response_format: zodResponseFormat(z.object(getResponseSchema(texts.length)), 'json_schema'),
Edit it to
response_format: {'type': 'json_object'}
So maybe the NVIDIA endpoint is also rejecting the extra response_format or extra_body fields. Your fetch works fine because it doesn’t include those fields, while the OpenAI client adds them by default. That could explain why you don’t see the error on DeepSeek API, OpenRouter, or LM Studio, but I do on NVIDIA integrate API.
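To make the suspected difference concrete, here's a minimal sketch of the two body shapes (my assumption about what integrate.api.nvidia.com tolerates, not something I've verified):
JavaScript:
// Simplified body close to what the plain fetch sends
// (works against the DeepSeek API / OpenRouter / LM Studio):
const basicBody = {
    model: 'deepseek-ai/deepseek-v3.1-terminus',
    messages: [{ role: 'user', content: '...' }],
    temperature: 0.1,
    top_p: 0.7,
    max_tokens: 16384
};

// Extra fields an SDK-style request can add on top; these are the ones suspected
// of triggering the 400 on NVIDIA's endpoint:
const sdkExtras = {
    response_format: { type: 'json_schema' },   // Topdod's workaround: { type: 'json_object' }
    chat_template_kwargs: { thinking: true }    // sent via extra_body in the Python example above
};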
 

ripno

Member
Jan 27, 2023
132
335
131
Just to clarify my earlier message:

The 400 error I’m seeing with SLR Translator seems to come from the request payload format, not from JavaScript itself. Your `fetch` example works because it only sends the basic fields. The SDKs (and possibly SLR Translator) add extra fields like `chat_template_kwargs`, `extra_body`, or `response_format`, which NVIDIA’s endpoint appears to reject.

That matches what Topdod found in Translator++ — changing `response_format` from `'json_schema'` to `{ "type": "json_object" }` stopped the 400 errors. So the difference is really about which fields are included in the JSON body, rather than the language used.

If the user uses DeepSeek 3.1 Terminus:
JavaScript:
import OpenAI from 'openai';

const openai = new OpenAI({
  apiKey: '$NVIDIA_API_KEY',
  baseURL: 'https://integrate.api.nvidia.com/v1',
})

async function main() {
  const completion = await openai.chat.completions.create({
    model: "deepseek-ai/deepseek-v3.1-terminus",
    messages: [{"role":"user","content":""}],
    temperature: 0.2,
    top_p: 0.7,
    max_tokens: 16384,
    chat_template_kwargs: {"thinking":true},
    stream: true
  })
 
  for await (const chunk of completion) {
    const reasoning = chunk.choices[0]?.delta?.reasoning_content;
    if (reasoning) process.stdout.write(reasoning);
    process.stdout.write(chunk.choices[0]?.delta?.content || '');
  }

}

main();
If the user uses llama-3.3-nemotron-super-49b-v1.5:
JavaScript:
import OpenAI from 'openai';

const openai = new OpenAI({
  apiKey: '$NVIDIA_API_KEY',
  baseURL: 'https://integrate.api.nvidia.com/v1',
})

async function main() {
  const completion = await openai.chat.completions.create({
    model: "nvidia/llama-3.3-nemotron-super-49b-v1.5",
    messages: [{"role":"system","content":"/think"}],
    temperature: 0.6,
    top_p: 0.95,
    max_tokens: 65536,
    frequency_penalty: 0,
    presence_penalty: 0,
    stream: true,
  })
  
  for await (const chunk of completion) {
    process.stdout.write(chunk.choices[0]?.delta?.content || '')
  }
 
}

main();