I’ve been working on a little WebSockets chat app, and wanted a way to separate multiple strings. I wanted an alternative to using commas ‘,’ as separators, with an escaped comma ‘\,’ if the string needs a comma and a double backslash ‘\\’ if I just need an actual backslash.
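What we could do is include each string’s length before its value, so “hello” becomes “5hello”. The reader takes the length, reads that many characters, and is then sitting at the start of the next string, so no separator or escaping is needed.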
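As a rough illustration, here is a minimal Python sketch of the single-digit version. The function names are made up for this example, and it assumes lengths are counted in characters as Python's len() reports them.

```python
def encode_single_digit(values):
    """Prefix each string with its length as one decimal digit (0-9)."""
    parts = []
    for value in values:
        if len(value) > 9:
            raise ValueError("a single digit can only describe lengths 0-9")
        parts.append(f"{len(value)}{value}")
    return "".join(parts)


def decode_single_digit(message):
    """Split a message made by encode_single_digit back into its strings."""
    values = []
    pos = 0
    while pos < len(message):
        length = int(message[pos])  # the one-digit length prefix
        pos += 1
        values.append(message[pos:pos + length])
        pos += length
    return values


# A comma inside a value needs no escaping.
print(encode_single_digit(["hello", "a,b"]))  # 5hello3a,b
```

Because the decoder never scans for a separator, the values can contain any characters at all, including digits.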
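But what if we have strings longer than 9 characters? We could use a fixed number of digits for every length, say two, so that “hello” becomes “05hello” and strings of up to 99 characters fit.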
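A fixed-width prefix might look something like the sketch below. The width parameter and the function name are assumptions for illustration only.

```python
def encode_fixed(values, width=2):
    """Prefix each string with its length, zero-padded to `width` decimal digits."""
    limit = 10 ** width - 1  # largest length that fits in `width` digits
    parts = []
    for value in values:
        if len(value) > limit:
            raise ValueError(f"length {len(value)} does not fit in {width} digits")
        parts.append(f"{len(value):0{width}d}{value}")
    return "".join(parts)


print(encode_fixed(["hello", "Pirate Ship"]))  # 05hello11Pirate Ship
```

Every string now pays for the full width, even the short ones, and there is still a hard upper limit on length.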
But how far do we go? How about, instead of having a fixed number of digits, we use the maximum digit to show that there is more length information to follow? A 9 means “add 9 and read another digit”, and any smaller digit ends the length. We can keep chaining this to add more.
1 = 1
8 = 8
90 = 9 + 0 = 9
91 = 9 + 1 = 10
990 = 9 + 9 + 0 = 18
995 = 9 + 9 + 5 = 23
999996 = 9 + 9 + 9 + 9 + 9 + 6 = 51
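The chained digits above can be produced and read back with a couple of small functions. This is only a sketch in Python with hypothetical names, and it assumes the final digit is always 0 to 8 so that a 9 always means “keep reading”.

```python
def encode_length(n):
    """Write n as a run of 9s followed by one closing digit in the range 0-8."""
    nines, rest = divmod(n, 9)
    return "9" * nines + str(rest)


def decode_length(text, pos=0):
    """Read a chained length starting at `pos`.

    Returns (length, position of the first character after the length).
    Raises IndexError if the text ends before a closing digit is found.
    """
    total = 0
    while text[pos] == "9":  # a 9 means: add 9 and keep reading
        total += 9
        pos += 1
    total += int(text[pos])  # any digit below 9 closes the length
    return total, pos + 1


print(encode_length(23))        # 995
print(decode_length("999996"))  # (51, 6)
```

Note that a length of exactly 9 is written as “90”, because a lone 9 would leave the reader waiting for another digit.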
The prefix is nice and short for small strings, but it grows by one digit for every 9 characters of length. My data is typically short, and even for long strings the prefix stays small as a percentage of the whole (roughly one digit per nine characters).
We could use something better than base-10 for these lengths, so that each digit carries more. Hexadecimal (base-16), where F is the “keep reading” digit, would look like this:
5helloBPirate ShipFFDThe quick brown fox jumps over the lazy dog
Even better, we could use base-32, where v (31) is the “keep reading” digit:
5helloBPirate ShipvcThe quick brown fox jumps over the lazy dog
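Putting the pieces together for any base up to 32 could look like the Python sketch below. The names DIGITS, encode and decode are invented for this example. It writes lowercase digits, but the decoder relies on Python's int(text, base), which accepts either case, so the uppercase prefixes in the examples above decode the same way.

```python
DIGITS = "0123456789abcdefghijklmnopqrstuv"  # enough digits for bases up to 32


def encode_length(n, base=32):
    """Write n as repeated top digits (base - 1) plus one smaller closing digit."""
    top = base - 1
    count, rest = divmod(n, top)
    return DIGITS[top] * count + DIGITS[rest]


def encode(values, base=32):
    """Join strings, each prefixed with its chained length."""
    return "".join(encode_length(len(v), base) + v for v in values)


def decode(message, base=32):
    """Split a message made by encode() back into its strings."""
    top = base - 1
    values = []
    pos = 0
    while pos < len(message):
        length = 0
        while True:
            digit = int(message[pos], base)  # ValueError if not a digit in this base
            pos += 1
            length += digit
            if digit != top:  # anything below the top digit ends the length
                break
        values.append(message[pos:pos + length])
        pos += length
    return values


words = ["hello", "Pirate Ship", "The quick brown fox jumps over the lazy dog"]
print(encode(words, base=16))  # 5hellobPirate ShipffdThe quick brown fox jumps over the lazy dog
print(encode(words, base=32))  # 5hellobPirate ShipvcThe quick brown fox jumps over the lazy dog
```

The sender and receiver have to agree on the base up front, since nothing in the message says which one was used.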
This seems to work nicely. The majority of strings in my application are short; for example, a timestamp is “d1585917453947”, where the 13-digit value needs only the single-character prefix “d”.
Of course, this is purely academic; there are much better and more proven ways to do this. You could use commas and escaped characters, a non-typeable separator, JSON or even protocol buffers.