Representing unsigned 64-bit integers in Kusto
I recently had an interesting situation at work. I use Kusto and write KQL often and we have a company-run Grafana instance with the Kusto datasource for users like me to visualization KQL queries in Grafana. It all works really well.
Recently, I had a situation where I had a event_id
field in a Kusto table. It was originally emitted to Kusto as an uint64
(unsigned 64-bit integer). However, Kusto can only support signed 64-bit numbers through the long
datatype. So this meant that when I queried this Kusto table for this column, it would represent the unsigned 64-bit integer as a signed 64-bit integer.
Unfortunately, not only could Kusto not properly represent unsigned 64-bit integers, Javascript (which is what Grafana uses to visualize numbers client-side) can only represent numbers up to 253 numbers safely. This meant that even if signed 64-bit integers were good enough for my use case, it wouldn't work rendering these numbers in Grafana because I could only safely represent 253.
Now I had the event_id
represented as a string
in another table, and I needed to join the two tables via event_id
. At this point, I came up with this crazy idea: What if I can turn the signed 64-bit integer representation of event_id
into a string representation of the unsigned version?
Hence began my journey.
I originally started off by trying all sorts of native Kusto functions to get this to work. Nothing ultimately worked, but I did find one thing: representing the event_id
column in its hex form was the best way to preserve the bit representation (Kusto also does not have a binary or byte array representation...).
So after trying a bunch of stuff, I ultimately was only able to get this:
EventsTable
| extend event_id_hex = tohex(event_id)
From here, the only thing I was able to get working was to bust out the Python plugin which allows me to run arbitrary Python code in my Kusto query:
EventsTable
| extend event_id_hex = tohex(event_id)
| evaluate hint.distribution = per_node python(
typeof(*, event_id_str:string),
\```
result = df
def convert_hex_to_str(hex_input):
try:
bytes_input = bytes.fromhex(hex_input.zfill(16))
except Exception:
return "ERROR: " + hex_input
return str(int.from_bytes(bytes_input, byteorder='big'))
result["event_id_str"] = df.event_id_hex.apply(convert_hex_to_str)
\```
)
| project event_id, event_id_str
This was functional!
However, it was pretty slow and I eventually decided on just adding a new column to the table that was the string representation of the event_id
so I didn't have to do this crazy KQL manipulation just so I can get the correct join key.
But it was a fun exercise. 👍🏻