I’ve been working on a little WebSockets chat app, and wanted a way to separate multiple strings in one message. I wanted an alternative to separating them with commas ‘,’, which needs an escaped comma ‘\,’ whenever a string contains a comma and a double backslash ‘\\’ whenever it contains an actual backslash:
hello,world!
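For comparison, here is a minimal sketch of that comma-and-escape approach. The function names (escape, join_fields, split_fields) are made up for illustration and are not from the chat app.

```python
def escape(field):
    # Escape backslashes first, so the backslashes added for commas are not doubled.
    return field.replace("\\", "\\\\").replace(",", "\\,")


def join_fields(fields):
    return ",".join(escape(f) for f in fields)


def split_fields(message):
    fields, current, i = [], [], 0
    while i < len(message):
        ch = message[i]
        if ch == "\\" and i + 1 < len(message):
            # The character after a backslash is taken literally.
            current.append(message[i + 1])
            i += 2
        elif ch == ",":
            # An unescaped comma ends the current field.
            fields.append("".join(current))
            current = []
            i += 1
        else:
            current.append(ch)
            i += 1
    fields.append("".join(current))
    return fields
```

The cost is that every string has to be scanned character by character, both when writing and when reading.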
What we could do instead is include each string’s length before its value:
5hello6world!
But what if we have strings longer than 9 characters? We could use more digits, like this:
05hello11Pirate Ship
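A rough sketch of this fixed-width version might look like the following. The names encode_fixed and decode_fixed and the width parameter are my own, and lengths are counted in Python characters, which is an assumption about how the app counts them.

```python
def encode_fixed(strings, width=2):
    limit = 10 ** width - 1  # largest length that fits in `width` decimal digits
    parts = []
    for s in strings:
        if len(s) > limit:
            raise ValueError(f"string longer than {limit} characters")
        # Zero-pad the length so the prefix is always `width` digits.
        parts.append(str(len(s)).zfill(width) + s)
    return "".join(parts)


def decode_fixed(message, width=2):
    strings, i = [], 0
    while i < len(message):
        length = int(message[i:i + width])  # read the fixed-size prefix
        i += width
        strings.append(message[i:i + length])
        i += length
    return strings
```

With width=1 this produces the single-digit form, and with width=2 the two-digit form, but any fixed width still puts a hard cap on the string length.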
But how far do we go? Instead of having a fixed number of digits, how about we use the maximum digit (9) to show that more length information follows, and add the digits together? We can keep chaining 9s to reach any length.
1 = 1
8 = 8
90 = 9 + 0 = 9
91 = 9 + 1 = 10
990 = 9 + 9 + 0 = 18
995 = 9 + 9 + 5 = 23
999996 = 9 + 9 + 9 + 9 + 9 + 6 = 51
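The chaining rule in the examples above can be sketched as a pair of small functions. The names encode_length and decode_length are hypothetical, and this is just one way to write it.

```python
def encode_length(n):
    """Chained base-10 length: a 9 means "add 9 and keep reading"."""
    if n < 0:
        raise ValueError("length must be non-negative")
    digits = []
    while n >= 9:
        digits.append("9")
        n -= 9
    digits.append(str(n))  # a final digit below 9 ends the chain
    return "".join(digits)


def decode_length(text, start=0):
    """Return (length, index just past the length prefix)."""
    total, i = 0, start
    while True:
        digit = int(text[i])
        total += digit
        i += 1
        if digit < 9:
            return total, i
```

Note that a length of exactly 9 needs the trailing 0 ("90"), otherwise the reader would keep waiting for more digits.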
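This keeps the prefix to a single digit for strings shorter than 9 characters, and adds one more digit for every further 9 characters. My data is typically short, and even for longer strings the prefix is still relatively small as a percentage of the whole message.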
5hello92Pirate Ship
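To put rough numbers on that overhead, here is a small sketch. The helper prefix_digits is made up for illustration; it just counts how many digits the chained prefix needs.

```python
def prefix_digits(length, base=10):
    # Each maximum digit accounts for (base - 1) characters,
    # and one final digit below the maximum ends the chain.
    return length // (base - 1) + 1


for length in (5, 11, 43, 500):
    digits = prefix_digits(length)
    share = digits / (digits + length)
    print(f"{length:>3} chars: {digits:>2} prefix digits, {share:.0%} of the encoded size")
```

For long strings the base-10 prefix settles at roughly one tenth of the encoded size, since every 9 characters cost one extra digit.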
We could use something better than base-10 for these lengths. Hexadecimal (base-16) would look like this:
5helloBPirate Ship
FFDThe quick brown fox jumps over the lazy dog
Even better, we could use base-32:
5hellobPirate Ship
vcThe quick brown fox jumps over the lazy dog
This seems to work nicely. The majority of strings for my application are short; for example, a timestamp is encoded as “d1585917453947”.
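Putting it all together, a minimal sketch of the whole scheme with a configurable base could look like this. The names DIGITS, encode_message and decode_message are hypothetical, the digit alphabet 0-9 then a-v is an assumption, and lengths are counted in Python characters rather than bytes.

```python
DIGITS = "0123456789abcdefghijklmnopqrstuv"  # enough symbols for bases 2 to 32


def encode_message(strings, base=32):
    top = base - 1  # the largest digit doubles as the "more follows" marker
    parts = []
    for s in strings:
        n = len(s)
        while n >= top:
            parts.append(DIGITS[top])
            n -= top
        parts.append(DIGITS[n])
        parts.append(s)
    return "".join(parts)


def decode_message(message, base=32):
    top = base - 1
    strings, i = [], 0
    while i < len(message):
        length = 0
        while True:
            # Accept upper- or lower-case digits; index() raises ValueError otherwise.
            digit = DIGITS.index(message[i].lower())
            if digit > top:
                raise ValueError(f"digit {message[i]!r} is not valid in base {base}")
            length += digit
            i += 1
            if digit < top:
                break
        strings.append(message[i:i + length])
        i += length
    return strings
```

For example, encode_message(["1585917453947"]) gives the timestamp form shown above, and decode_message("5hello92Pirate Ship", base=10) gives back the two original strings. The sketch does no real error handling: a truncated message will raise an IndexError or silently return a short final string.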
Of course, this is purely academic; there are much better and more proven ways to do this. You could use commas and escaped characters, a non-typeable separator, JSON or even Protocol Buffers.