Hello. I'm facing a dilemma. I use the UnRen batch script or the Ren'Py SDK to extract translatable strings, but in some cases, depending on the quirks of the novel's developer, the strings are not extracted to the tl folder. Would it be possible to add a separate parser to this program, with an input field for the desired language, that performs a more in-depth analysis? I would be grateful.
I faced the same issue myself: sometimes relying on standard decompilers (like UnRen) or parsing only .rpy files isn't enough.
I discovered that many text strings actually remain trapped inside the compiled .rpyc files and never make it into the decompiled .rpy. Even if you successfully decompile the game, some critical strings (especially dynamic ones in screens or init blocks) are simply missing from the source code. That's why I spent so much time implementing a native RPYC Reader directly into RenLocalizer.
So, yes! Extending the parser is definitely possible and recommended. Here are a few areas you might find helpful based on my experience:
- RPYC & RPYMC Reading: Instead of relying on an external unrpyc tool, writing a direct AST reader (using Python's pickle mechanics) lets you extract the game's data straight from the compiled files, including those tricky hidden strings. My project does this in `src/core/rpyc_reader.py`. It reads both compiled scripts (.rpyc) and screen caches (.rpymc).
- Screen Language (SL2) Parsing: A lot of 'missing' text lives in complex screen definitions (buttons, tooltips, frames). A standard line-by-line parser misses these. You need an approach that understands Ren'Py's Screen Language structures.
- Technical Filtering: Parsing everything also means getting a lot of garbage code (like file paths or variable names). You'd need a robust filter (I use a symbol density heuristic) to distinguish between 'Dialogue' and 'Code'.
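To make the filtering idea concrete, here is a minimal sketch of a symbol density check. It is not the filter RenLocalizer ships; the symbol set, the 0.15 threshold, and the names `symbol_density` and `looks_like_dialogue` are all assumptions chosen for illustration. Ren'Py text tags (`{i}`) and interpolations (`[name]`) are stripped first so that real dialogue is not penalized for them.

```python
import re

# Characters that are common in code but rare in dialogue (assumed set).
CODE_SYMBOLS = set("_=/\\{}[]<>|#$%^&*@~`+")

# Ren'Py text tags like {i} and interpolations like [player_name].
TAG_RE = re.compile(r"\{[^{}]*\}|\[[^\[\]]*\]")


def symbol_density(text: str) -> float:
    """Fraction of characters that are code-like symbols (1.0 for empty text)."""
    stripped = text.strip()
    if not stripped:
        return 1.0
    hits = sum(1 for ch in stripped if ch in CODE_SYMBOLS)
    return hits / len(stripped)


def looks_like_dialogue(text: str, max_density: float = 0.15) -> bool:
    """Heuristic guess: True if the string reads like prose, False if like code."""
    plain = TAG_RE.sub("", text).strip()
    # Nothing translatable is left once tags and interpolations are removed.
    if not any(ch.isalpha() for ch in plain):
        return False
    # A single token that looks like a path, identifier, or file name.
    if " " not in plain and re.search(r"[/\\_]|\.\w{2,4}$", plain):
        return False
    return symbol_density(plain) <= max_density
```

With these settings, `"Hello, [player_name]!"` and `"Save Game"` pass, while `"images/bg/room_01.png"`, `"player_name"`, and `"x = y + 1"` are rejected. The threshold will need tuning per game, since some developers put unusual punctuation in dialogue.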
If you want to create a custom parser module, check out `src/core/parser.py` in the repo. It's designed to be modular, so you could easily plug 'Deep Analysis' logic in there. You can use, adapt, or extend the current RPYC reading logic (`src/core/rpyc_reader.py`) as you see fit. I recommend using it as the basis for your in-depth analysis, because building on top of something open source is much less of a headache than writing everything from scratch. Also, once you're done, I'd love to take a look at your code, if you don't mind, of course. Maybe I'll discover a few tips that I can add to my program.
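If you would rather see the general idea before reading `src/core/rpyc_reader.py`, here is a rough sketch of direct .rpyc reading. It is not the code from the repo. It assumes the common RPC2 container layout as I understand it: a `RENPY RPC2` header, then a table of `(slot, start, length)` little-endian integers ending with slot 0, with slot 1 holding a zlib-compressed pickle. The names `Stub`, `StubUnpickler`, `load_rpyc`, and `iter_strings` are made up for this example, and the allowlist is deliberately tiny, so real games will probably need more entries.

```python
import io
import pickle
import struct
import zlib

RPYC2_MAGIC = b"RENPY RPC2"

# Globals the unpickler may import for real; everything else becomes a Stub.
SAFE_GLOBALS = {
    ("builtins", "object"), ("__builtin__", "object"),
    ("builtins", "set"), ("__builtin__", "set"),
    ("builtins", "frozenset"), ("__builtin__", "frozenset"),
    ("copyreg", "_reconstructor"), ("copy_reg", "_reconstructor"),
    ("collections", "OrderedDict"),
}


class Stub:
    """Placeholder for Ren'Py and game classes, so no game code is imported."""

    def __new__(cls, *args, **kwargs):
        obj = object.__new__(cls)
        obj._new_args = args  # constructor arguments may carry text
        return obj

    def __setstate__(self, state):
        self._state = state  # keep the pickled state, whatever its shape


class StubUnpickler(pickle.Unpickler):
    def __init__(self, file):
        # latin1 lets byte strings written by Python 2 decode without errors.
        super().__init__(file, encoding="latin1")
        self._stubs = {}

    def find_class(self, module, name):
        if (module, name) in SAFE_GLOBALS:
            return super().find_class(module, name)
        key = (module, name)
        if key not in self._stubs:
            self._stubs[key] = type(name, (Stub,), {"__module__": module})
        return self._stubs[key]


def read_rpyc_slots(raw: bytes) -> dict:
    """Map slot number -> slot bytes (still compressed)."""
    if not raw.startswith(RPYC2_MAGIC):
        raise ValueError("no RPC2 header (older or obfuscated file?)")
    slots = {}
    pos = len(RPYC2_MAGIC)
    while True:
        slot, start, length = struct.unpack_from("<III", raw, pos)
        if slot == 0:
            break
        slots[slot] = raw[start:start + length]
        pos += 12
    return slots


def load_rpyc(raw: bytes):
    """Decompress slot 1 and unpickle it into a tree of Stub objects."""
    payload = zlib.decompress(read_rpyc_slots(raw)[1])
    return StubUnpickler(io.BytesIO(payload)).load()


def iter_strings(root):
    """Yield every str value reachable from root, using an explicit stack."""
    seen = set()
    stack = [root]
    while stack:
        obj = stack.pop()
        if isinstance(obj, str):
            yield obj
        elif isinstance(obj, (bytes, int, float, type(None))):
            continue
        elif id(obj) not in seen:
            seen.add(id(obj))
            if isinstance(obj, dict):
                stack.extend(obj.values())
            elif isinstance(obj, (list, tuple, set, frozenset)):
                stack.extend(obj)
            elif isinstance(obj, Stub):
                stack.extend(vars(obj).values())
```

Typical use would be `strings = list(iter_strings(load_rpyc(open(path, "rb").read())))`, followed by a dialogue/code filter. The walk uses a stack instead of recursion because the statement chains in a script can be very long. String subclasses that end up as stubs keep their text in `_new_args`, which the walker also visits.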
