Hello, I just had a few questions about how you got your SLR Translator to work and about things in the Repackage.
First, to run a working SugoiV4 server locally, you have to click and run the "activateOfflineTranslationServer.bat" file, correct?
Second, as an NVIDIA user with more than 8GB of RAM, do I extract "Sugoi-Repackage-CUDA.2024-05-21.7z" or "Sugoi-Repackage-CUDA-Full.2024-05-20.7z" with 7-Zip? Also, should I use Fairseq or CTranslate2?
Lastly, in the Sugoi Toolkit V8.0, there is an "Instructions" folder with a "Basic Guide" text file. In it, there is a line that reads "For offline mode, I replace the big model (2.5GB) with the base model (900MB) so the program is lighter." Is the Sugoi Repackage related to this, and should I replace the current basic model with the big model? Are there benefits to doing so?
I apologize for this long post! I am just a bit unfamiliar with these more technical setups and was just looking for a bit of clarification!
For the official toolkit, yes: that deeply buried activateOfflineTranslationServer.bat file is the way to run the official server offline. It is very slow, though, and has a number of other issues.
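Once the server is running, programs talk to it over plain HTTP. Here is a minimal Python sketch of such a request; the port (14366) and the JSON payload shape (`{"content": ..., "message": "translate sentences"}`) are assumptions based on the common Sugoi server defaults, so check your own .bat/config if yours differs:

```python
import json
import urllib.request

# Assumed default; verify the port in your server's .bat or config.
SUGOI_URL = "http://localhost:14366"

def build_payload(text: str) -> dict:
    """Build the JSON body the Sugoi offline server is assumed to expect."""
    return {"content": text, "message": "translate sentences"}

def translate(text: str) -> str:
    """POST a Japanese string to the local server and return the response."""
    data = json.dumps(build_payload(text)).encode("utf-8")
    req = urllib.request.Request(
        SUGOI_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

If the request hangs or is refused, the server is most likely not running or is listening on a different port.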
If you have an Nvidia GPU, then Sugoi-Repackage-CUDA is the preferred package. The regular and Full packages contain the same versions of everything; the difference is that setup is already complete in the CUDA-Full package and it includes both models, which is why it is larger. By default, PyTorch 2.2.1 and CUDA 11.8 are used.
Since setup is not complete in the "Sugoi-Repackage-CUDA" version, you can verify exactly what software will run, and which versions of what will be used, before clicking the installation .bat: read the source code and the .bat file, scan the included Python environment with your AV, or substitute your own Python environment. The included Python is unmodified from the standard distribution aside from added Win7 compatibility, and everything is open source. The Full version has more files, which makes it harder to verify its trustworthiness. I created both packages and run them on my own computer, so I can vouch that they are clean, but it is always better to verify it yourself.
If you want to save some download time and you already have the Sugoi Offline Translator v4 model from Sugoi Toolkit v8, copy it to the ~sugoi_v4model_fairseq folder instead of redownloading it. In the toolkit it is at SugoiToolkit\Sugoi Translator Toolkit\Code\backendServer\Program-Backend\Sugoi-Japanese-Translator\offlineTranslation\fairseq\japaneseModel\* . Be sure to copy the two *.txt files too, and also copy the contents of "spmModels" into a folder named "spm", so they end up as "sugoi_v4model_fairseq\spm\*.model" and "sugoi_v4model_fairseq\spm\*.vocab". There is also a script at apps\scripts\fairseqToCTranslate2_converter.Sugoi.bat that converts that model to the CTranslate2 format. See the linked instructions for how to use it, but basically you copy it to the toolkit and it converts the PyTorch/Fairseq model into the CTranslate2 format.
For CUDA users, the Fairseq model is preferred over CTranslate2. For CPU, use the CTranslate2 model.
When the instructions folder of the Sugoi Toolkit mentions the 900MB model, it is referring to the Whisper model for audio->text, not the Offline Translator v4 model that does Japanese->English translation. For ASMR, the larger Whisper model gives better results than the smaller one.
SLR Translator and the repackage only use the Japanese->English model. For Japanese->English translation, the Sugoi Offline Translator v4 NMT model included in Sugoi Toolkit v6-v8 is the latest and best model available, so do not change it. However, there are better software options for running that model, with features like caching, not using resources when idle, and optimizations for batch translations, which is why the repackage exists.
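The caching benefit mentioned above is simple in principle: identical lines (very common in games) are sent to the slow NMT model once and served from memory afterwards. A minimal sketch of the idea, with a stand-in `translate_uncached` function in place of a real server call:

```python
from functools import lru_cache

def translate_uncached(text: str) -> str:
    # Stand-in for a real (slow) call to the local translation model.
    return f"<translation of {text!r}>"

@lru_cache(maxsize=4096)
def translate(text: str) -> str:
    """Repeated lines hit the in-memory cache instead of the NMT model."""
    return translate_uncached(text)
```

A real implementation would also persist the cache to disk so repeated runs over the same game script stay fast.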