Tool: Sugoi - a translation tool with an offline AI-powered model for translating from Japanese; DeepL competitor

Shisaye

Engaged Member
Modder
Dec 29, 2017
3,000
5,357
I think the releases starting from toolkit V2.5 went: V2.5 - Offline V3, Levi - Patreon only, Offline V4 - Patreon only, V3 (Patreon only) - Offline V4, V4 - Levi, and now V5 - Offline V4.
That's not confusing at all. :Kappa:

Well at least I know now. Thank you.
 

Zippix

Well-Known Member
Sep 7, 2017
1,682
1,122
Basically, he is still reserving the v4.0 offline MODEL for patrons (who have had early access since early March to the v5.0 Sugoi Toolkit, with the v4.0 model bundled in). What you can download from his Patreon without any hassle is the public v4.0 Sugoi TOOLKIT, which includes the Levi offline MODEL for offline translations.
 

GoldenBoy_69

New Member
May 23, 2019
4
1
Damn, all the models and versions of Sugoi are confusing as hell...

Btw, can I install only the offline translator without the other things in the toolkit? The full package is about 6 GB, which is quite big for my little SSD.
 

runza

New Member
Apr 15, 2017
2
4
Gonna shed my lurker status just this once to give something back for AMD Linux users.

Now that Arch has finally gained ROCm support and my 6700 XT is semi-supported, I will share my scripts to create a "small" fairseq AppImage with PyTorch ROCm. It's relatively simple and straightforward, so just read the README, then read and execute "recreate_fairseq" in the same directory as the folder "Sugoi-Translator-Toolkit-V4.0-Public".

Install instructions and requirements are in the README.



Some things to note:
Many consumer GPUs need to pretend to be another one to work, just like my 6700 XT.
That is done by exporting HSA_OVERRIDE_GFX_VERSION=10.3.0. "10.3.0" means I'm pretending to be gfx1030 (the RX 6800/6900 target; the 6700 XT itself is gfx1031), which is supported in ROCm.
"rocminfo | grep gfx" shows you what you have or are pretending to be, even if it isn't working. Researching this is a pain in the ass, so you are on your own.

This would also be a good starting point if you want to hack your own ctranslate2 native build or just the normal one.
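
A minimal Python sketch of the override and the gfx check described above, for anyone who would rather script it. The helper names `rocm_env` and `gfx_targets` are made up for illustration; the only things taken from the post are the HSA_OVERRIDE_GFX_VERSION variable, the "10.3.0" value, and the idea of grepping rocminfo for "gfx". It assumes rocminfo is on PATH and just prints a note if it isn't.

```python
import os
import re
import shutil
import subprocess


def rocm_env(gfx_version="10.3.0", base=None):
    """Return a copy of the environment with HSA_OVERRIDE_GFX_VERSION set."""
    env = dict(os.environ if base is None else base)
    env["HSA_OVERRIDE_GFX_VERSION"] = gfx_version
    return env


def gfx_targets(rocminfo_output):
    """Collect the distinct gfx names in rocminfo output (like `rocminfo | grep gfx`)."""
    return sorted(set(re.findall(r"gfx[0-9a-f]+", rocminfo_output)))


if __name__ == "__main__":
    if shutil.which("rocminfo"):
        # Run rocminfo with the override applied and list what it reports.
        result = subprocess.run(
            ["rocminfo"], env=rocm_env(), capture_output=True, text=True
        )
        print(gfx_targets(result.stdout))
    else:
        print("rocminfo not found on PATH")
```

The same `rocm_env()` dictionary could be passed as `env=` when launching the translation server from a script, so the override does not have to be exported globally.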
 

keiii13

New Member
Jan 9, 2020
2
0
How's the performance of this? Like, how long does ROCm take on average to translate a line?
 

runza

New Member
Apr 15, 2017
2
4
It's around 130 ms on average. I managed to make it slow by accidentally spamming it with unfiltered Textractor output, and it just chugged along slowly until it was finished. At least it hasn't crashed yet and only consumed more VRAM, so that's good. I wonder if it benefits from Resizable BAR/SAM?
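
If you want to get a per-line average like this yourself, here is a rough Python sketch. `average_latency_ms` is a hypothetical helper, and `translate` is whatever callable you use to send one line to the translator (not something from the toolkit); the demo below uses a stand-in that just sleeps.

```python
import time
from statistics import mean


def average_latency_ms(translate, lines):
    """Call translate(line) once per line and return the mean wall-clock time in ms."""
    if not lines:
        raise ValueError("need at least one line to time")
    timings = []
    for line in lines:
        start = time.perf_counter()
        translate(line)
        timings.append((time.perf_counter() - start) * 1000.0)
    return mean(timings)


if __name__ == "__main__":
    # Stand-in translator: replace with a real call to your backend.
    def fake_translate(line):
        time.sleep(0.01)
        return line

    print(f"{average_latency_ms(fake_translate, ['a', 'b', 'c']):.1f} ms per line")
```

Feeding it a fixed list of lines makes runs comparable, which is handy when checking a before/after change.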
 

nerdman83

Newbie
Mar 12, 2019
73
82
As for Offline 4.0, it's well worth installing. Basically, download the latest public Translator Toolkit from the creator's Patreon (on the About page) and then paste in the Offline 4.0 files from here.


There are a few things Offline 4.0 gets tripped up on, but it is vastly superior to DeepL for HGames.


I was translating a game in Translator++ with DeepL and Sugoi side-by-side and I'd say 8 out of 10 times Sugoi Offline 4.0 produces a more readable, succinct result. DeepL produces a result but it's usually more wordy. Sometimes it's superior in vocabulary, but more often Sugoi produces a more correct result in intention and pronouns. The things Sugoi messes up on DeepL messes up on as well.
 

nerdman83

Newbie
Mar 12, 2019
73
82
As for Public (Levi model) vs Offline-4.0:

Empirically I'd say in 70-80% of the situations Levi matches Offline-4.0. There's maybe 5% of the situations where Levi actually produces a little better result (mostly just short words / noises), and 20% of the time it produces a wordier result, or mangles commas such that the translated phrase, while being word-correct, loses its meaning.

Offline-4.0 does a very good job with punctuation.

The biggest negative about Offline-4.0 is that sometimes, with screaming phrases, it will barf and return 250+ repeating characters. But it's very obvious when it breaks and easy to fix. Levi doesn't seem to have this bug. DeepL will sometimes skip translating entire parts of sentences; I haven't seen that happen with Offline or Levi.
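
Since the breakage is this obvious, it can also be flagged automatically when batch-translating. A minimal Python sketch, assuming that "broken" means a short unit of 1-4 characters repeated 20 or more times in a row; both thresholds and the name `has_runaway_repetition` are my own assumptions, not anything from the toolkit.

```python
import re


def has_runaway_repetition(text, max_unit=4, min_repeats=20):
    """Return True if a unit of 1..max_unit characters repeats min_repeats+ times in a row."""
    # One captured unit followed by at least (min_repeats - 1) more copies of it.
    pattern = re.compile(
        r"(.{1,%d}?)\1{%d,}" % (max_unit, min_repeats - 1), re.DOTALL
    )
    return pattern.search(text) is not None


if __name__ == "__main__":
    print(has_runaway_repetition("Aaah" + "h" * 250))
    print(has_runaway_repetition("Her body began to tremble slightly."))
```

Lines that get flagged can then be re-translated or fixed by hand instead of having to spot them by eye.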

Example
Original: 股間を両手で押さえつつ、小刻みに身体を震わせ始めた
Offline-4.0: Holding her crotch with both hands, her body began to tremble slightly.
Levi: Clutching her crotch with both hands, her body began to tremble slightly.
DeepL: While holding his crotch with both hands, he began to shake his body in small increments.
Bing: Holding her crotch with both hands, she began to wiggle her body.
Google: While holding his crotch with both hands, he began to shake his body little by little.


The only real negative is that both sometimes struggle with ownership of nouns (e.g. "my penis" instead of "his penis"). Offline-4.0 seems to struggle with this less: it gets it right about 50% of the times Levi fails. DeepL / Bing / Google don't usually fail here since they go for a wordier literal interpretation (e.g. "the penis") that often results in a hard-to-read sentence.
 

nerdman83

Newbie
Mar 12, 2019
73
82
Sugoi v6 has been released, and it now includes the famed Offline v4 model as standard (for free).





CUDA installer for Toolkit v6 (NOTE: CT2 is broken in this Installer script)



Mirror CUDA installer:

 

-FibaG-

Member
Nov 9, 2018
272
434
Sugoi v6 was released which now contains v4 model as standard.

Changes/Additions:
V6.0:

  • Sugoi ASMR Translator Default translation model is now V4, the most advanced model
  • Added skeleton code for Sugoi Translator Premium
  • Organized the main menu window to make picking a program easier
  • Included important info in the Sugoi Translator cmd log (time benchmark, input text, output translation)
  • Added a sample input folder so users can conveniently test programs

Meh, there's nothing new about offline translation. I was hoping for more updates in that area.
 

nerdman83

Newbie
Mar 12, 2019
73
82

No, there's nothing really new, just that he's giving away v4 in the free public release instead of Levi.

Per his discord, in whatever metrics of translation accuracy he's using, v4 rates highest.


My personal experience is that v4 is waaaay better than DeepL for smut games. As for Levi vs v4, as I've commented earlier, 80-90% of the time v4 is better, but occasionally v4 barfs and you have to blend the two. The other 80% of the time v4 is significantly more grammatically correct than Levi (Levi is wordier).



1688149931814.png
 

FMID

Member
Jun 2, 2017
113
154
Currently using the Offline translator and it's very, very good with the right VNs, but I'm having issues with H-scenes. It trips up when moans/screams are being translated, and the rest of the sentence becomes just a repeated letter. If anyone knows of any way to fix this, it would be much appreciated.
 

nerdman83

Newbie
Mar 12, 2019
73
82
I noted that previously. There's no fix that I'm aware of (perhaps there is and I don't know about it).

You can get around it in the Textractor main window by highlighting the rest of the sentence; it will immediately translate whatever you highlighted.
 

nerdman83

Newbie
Mar 12, 2019
73
82
So there is a parameter that fixes the repeat, but it makes the translator 2x slower, so you really should combine it with either the CT2 patch or, better, the Nvidia CUDA patch.

As mentioned, CT2 speeds up translation ~4-5x and CUDA speeds it up 10x.

NOTE: If I understand right, the CT2/CUDA mod already has this installed, as long as you launch from the Sugoi-Translator-Offline-CT2 (click here).bat script.


Edit flaskserver.py in .\Code\backendServer\Program-Backend\Sugoi-Japanese-Translator\offlineTranslation\fairseq


1688567648646.png
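
The screenshot with the actual edit does not come through as text, so what follows is not the edit from the post. It is only a hedged illustration of one common way generation loops like this get suppressed: no-repeat n-gram blocking, which fairseq exposes as the `--no-repeat-ngram-size` generation option. Whether that is the parameter meant here is an assumption. The toy Python below (the helper names are hypothetical) shows the idea: before picking the next token, ban any token that would recreate an n-gram already present in the output.

```python
def banned_next_tokens(tokens, n):
    """Return the tokens that would complete an n-gram already present in `tokens`."""
    if n < 1 or len(tokens) < n:
        return set()
    # The last n-1 tokens are the prefix the next token would extend.
    prefix = tuple(tokens[len(tokens) - (n - 1):])
    banned = set()
    for i in range(len(tokens) - n + 1):
        if tuple(tokens[i:i + n - 1]) == prefix:
            banned.add(tokens[i + n - 1])
    return banned


def pick_next(tokens, ranked_candidates, n):
    """Pick the best-ranked candidate that does not repeat an existing n-gram."""
    banned = banned_next_tokens(tokens, n)
    for candidate in ranked_candidates:
        if candidate not in banned:
            return candidate
    return None


if __name__ == "__main__":
    output_so_far = list("ahah")
    # The model "wants" another 'a', but the bigram ('h', 'a') already exists.
    print(pick_next(output_so_far, ["a", "!"], n=2))
```

The extra bookkeeping at every decoding step is also a plausible reason for a slowdown of the kind mentioned above.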
 

FMID

Member
Jun 2, 2017
113
154
Thank you for your response. I'll try the edit and see how I fare. I'm shocked at how well this translator is coming along; it blows everything else out of the water from what I've tested.
 

nerdman83

Newbie
Mar 12, 2019
73
82
I can confirm it fixes the issue.

I've been having the issue too, so it was nice to finally get it solved.


Now I have to go re-translate my private stash of games (I eventually intend to post them). Your standards shift and there's still a delay waiting for Sugoi to translate games so eventually you go down the rabbit hole of decomposing game files and translating the files directly.
 

shiny-kuki

Member
May 6, 2020
298
230
how much slower is it with the repeat fix?
hmm, I tested with and without the fix and there was not much difference in speed
 

nerdman83

Newbie
Mar 12, 2019
73
82
It's probably not 2x the time, but there is more GPU usage, so I can tell it's using more power.

With the GPU version, the logs seem to indicate 0.3 seconds where I was getting 0.2 seconds before (so about 1.5x).


Without the GPU/CT2 mods, the translation time is more like 1.0 - 2.0 seconds, so the extra time may not be noticeable compared to that. The CMD window shows you the translation time.