Before I get on with why the banchrmark itself is incorrect and gives scewed results, I'll first point out this:
Your code is unreadable. As I mentioned, the way I write code tries to strike a balance between human and machine readability. While I will occasionally represent thing in the way you did there, I prefer to avoid it whenever possible. However, your code goes out of its way to obfuscate the relation between the input and output of the function, in regards to human readability.
I agree, however readability was never been goal of this example, just to show that it is more efficient, and because of that I did minimal example without any changes in logic.
Why it is so unreadable?
- There is unneccessary complex logic in 'determine-height-difference', even after creating truth table there is small bug in my version (but it doesn't affect performance). Code is maybe more readable, but relation between input and output is not.
- There is no multi-dimensional arrays in engine, and you probably don't want to add them.
- Probably it can be written in more readable way, but it is not worth it for benchmark.
But that's not the issue. The issue is that your benchmark is correct... but your implementation of the state it self is incorrect and it happens right here:
Code:
createsag "const-height-map" new{ ... }
To optimise the execution, you yeeted part of your solution into the game state, at global level, which is something we don't do. The game state is not there for bits and pieces of date to be cached there local data stays in the local memory space. If you instead did it the way it should be done and moved the table into your state, you'd find that in the best case scenario your way of doing it is, as I said, not faster than mine and in the worst state is indeed significantly slower.
You correct about my intention, but not about result. I left it that way, because I prefer to have constants on global level, it doesn't affect performance (maybe a bit slower than global variant, but still faster than original).
determine-height-difference2.state:
Code:
//cg-tt-npc-height - height of NPC, either short, average or tall;
//cg-tt-height-diff - height difference of PC, either shorter, equal or taller;
createl "has-highheels" no
if =s fromarray var "cg-inv-equipped" 7 new: jump skip-shoes
setl "has-highheels" containss strsplit decorate "[data\items\[cg-inv-equipped@7]$28]" "," "type-highheel"
marker skip-shoes;
sets "cg-tt-height-diff2" fromarray new{ "impossible" "impossible" "impossible" "equal" "shorter" "shorter" "taller" "taller" "taller" "equal" "taller" "taller" "impossible" "impossible" "impossible" "taller" "equal" "equal" "taller" "taller" "taller" "taller" "taller" "taller" } + + + * var has-highheels 12 * lengths intersect cvar "atashi" "build-adj" filestr "automata\tools\conditions\is-tall" 6 * lengths intersect cvar "atashi" "build-adj" filestr "automata\tools\conditions\is-short" 3 indexofs new{ "short" "average" "tall" } var "cg-tt-npc-height"
pop
Has same speed as with global array (faster than FSA). And it always will be (you can slowdown array access or add some optimizations for jump only code only, but I don't see reasons for that).
Baseline is always will be faster, at least with current computer architecture. Why? Random memory access is always significantly slower than linear memory access (which is result of complex interactions between cpu caches, branch predictor, prefetch, batched memory read and other stuff, however it is probably easier than Height Difference Determination Problem), any FSA with 'JUMP' inside 'JUMP' will create two random memory access, when single array read causes only one random read. You can easily slowdown performance of array access by using nested arrays for example (arr[1][2][3]), instead flat array. If done that way it will be same as FSA. And you can also optimize FSA by computing jump tables for 2-3 labels depth, which in the end will create same matrix.
I'm still fairly impressed with what you've shown here but, as you yourself said:
And that is undebatable, yes. However most of my objections to what you've been suggesting were always related to the lack of low level understanding of the engine.
Anyway, this was an interesting encounter. If you feel like talking more about it, though, I'd suggest maybe my Discord?
Low level things is implementation details, its not like there is algorithmic reasons which prevents them (I'm pretty sure that it is possible to implement them), however they may require some changes in engine which you don't want, which is ok.
I will try to connect to discord, but probably will take some time, I don't know anyhting about it.