My previously introduced string associative table, the L2 library Si map, was leaving much of the problem space unaddressed. In particular, it was not providing a suiting coding for the internal regions of the string trie (or digital tree if you much prefer) of low node fanout. Those regions were covered by nodes of a coding better suited for mid to high fanout. The algorithmic/modelling deficiency was not for lack of problem recognition, but due to the difficulty of designing a solution that aims for both completeness and optimality. I haven’t managed to make much progress with designing such a solution, but to improve on performance I have decided to nonetheless do something, as I figured anything would be better than nothing.
The new L2 Si features two new types of nodes, one that selects a continuation with two or three bytes (and is thus better suited for low fanout parts of the trie) and a bitmapped one, that bridges the characteristics of existing mid to high fanout nodes.
The insertion is sped up across the board on the selected data sets, with the multi bytes selection nodes accounting for most of it where the keys are long, and the bitmapped nodes having a contribution for short keys (and consequently dense) data sets.




The memory usage stays largely the same.




One thought on “String Associative Tables, III”