Tickerplant Log Files

What is it?

The tickerplant writes a copy of every update it has received throughout the day to a file on disk. This file is the logfile (sometimes the 'journal').  It serves two purposes:

The file is initialised on startup by using the following command (where L/u.L is the filepath of the logfile):

.[L;();:;()] 

Which is the same as the following command:

L set ()

The tickerplant maintains an internal count of the number of records that should be replayed from the logfile in .u.i. 

Note: when in batch mode, this doesn't necessarily match the number of records actually in the logfile, because the tickerplant logs each and every update to the logfile but only publishes updates at intervals. A process may connect to the tickerplant between intervals, in which case it only needs to replay up to the last interval because the tickerplant will be publishing the latest interval shortly. The tickerplant uses an additional variable, .u.j to handle this when in batch mode. 

The tickerplant writes updates to the file by opening a filehandle to the filepath and sending updates over that handle. This update usually takes the following form:

l enlist (`upd;t;x)

As such, if you look at the content of the logfile, it is essentially a big list of (command;table;data) updates, with `upd more than likely always the command used. 

What does it look like?

If you tried to open a tickerplant logfile directly on disk, it would look something like this:

ÿ^A^@^@^@^@^@^@^@^@^C^@^@^@õupd^@õtrade^@^@^@^F^@^@^@ôH^Q!ý^^e<87>  õl^@õEX1^@õB^@÷<8f>Âõ(\I<93>@÷®^?×gÎúÕ?...

The file is a kdb+ binary file so can only be read by a kdb+/q process. We must use a 'get' command on the file to open it:

q)get `:sym2021.10.04

`upd `trade (2021.10.04D02:10:51.967517000;`l;`AAPL;`B;1234.34;0.343433)      ..

`upd `trade (2021.10.04D02:11:33.050349000;`l;`AAPL;`B;1234.34;0.343433)      ..

 This looks more like what we were expecting (although this reads the entire logfile into memory so can be risky).

Replaying data naïvely 

The first item could be taken from this logfile and when used with the 'value' command, would run the function 'upd' with the input of 'trade' and the trade data for `AAPL. If there was a 'upd' function defined in the q process, it would be ran in exactly the same way as if the update had came from a tickerplant over a handle. 

This is essentially how a logfile is replayed, take each update from the file, in order, and process them through the usual update function. 

So, if we wanted to replay the entire logfile, we might run something like:

value each get `:sym2021.10.04

This would not be a good idea. The command first runs a 'get' on the entire logfile, and then processes each item one by one. This will use a lot of memory. It would be more efficient to read each update into memory, process it, and then move on to the next update. Thankfully there is an internal function that does this.

Replaying tickerplant log files

-11! is the internal function for replaying log files message by message.  Each message is evaluated using '.z.ps' which defaults to 'value', similar to the example above. 

The most common usage of -11! is 

-11!(-1;x)

However this can be shortened to:

-11!x

 This usage will replay the logfile, processing all of the updates, and if successful will return the number of 'chunks' (or updates) executed. 

However, there may be some issues when replaying the logfile. This might particularly be an issue if the tickerplant failed at some point and was trying to replay the tplog file when restarting. The tickerplant may have been in the middle of writing to the file when it failed, resulting in a bad update and corrupted file

If this is the case, the function will return a 'badtail' error. 

Other possible errors include the function name if the function hasn't been defined in the process ('upd would most likely be the error thrown in this case, but note that it is also a licencing error)

Corrupt log file

If the tplog file is corrupt and replaying it has failed with a 'badtail' error, all is not lost. We can still recover 'up to' the bad message and may even be able to fix it.

To find out the first bad update in the file we can use the following command:

-11!(-2;x)

This will return two outputs: number of good updates in the file (i.e. the number of the last good update) and the number of replayable bytes (i.e. the last byte of data before the error)

Note: using the -2 usage on an uncorrupt file will just return the number of updates in the logfile.

Now we have this information, we can use the final usage of -11!:

-11!(n;x)

With this usage, we can provide a value as an input and the logfile will be replayed up to and including update n. So if the first 10,000 updates in the file were good updates, with update 10,001 being corrupt, we could provide 10,000 as input n and replay the first 10,000 updates. 

Fixing a corrupt log file example

The following log file returns a length error when replayed:

tplogCorrupt

Because it is a small logfile it is easy to get the entire file:

q)get `:tplogCorrupt

(`upd;`trade;`sym`price`size!(`AAPL`MSFT`GOOG;1.23 2.34 3.45;1000 2000 3000))

(`upd;`trade;`sym`price`size!(`GOOG`MSFT;3.45 2.45;2000 5000))

(`upd;`trade;`sym`price`size!(`AAPL`AAPL`GOOG;1.23 1.24 3.45;1000 1000 4000))

,`upd

,`trade

+`sym`price`size!(,`GOOG`GOOG;,3.4 3.3;,2000 2500)

(`upd;`trade;`sym`price`size!(`AAPL`GOOG;1.23 3.45;1400 4400))

As can be seen, one of the updates was sent as three individual updates rather than a list of three items. This is causing the length error. 

Rather than just replaying up to the bad update, we should be able to fix this and replay the entire file.

This has been done below in a simple step by step fashion:

q)// Get the logfile and assign to a variable

q)workingLog:get `:tplogCorrupt

q)// Make a copy (just in case)

q)origLog:workingLog

q)// Combine the bad updates into one good update

q)goodUpdate:raze workingLog[3 4 5]

q)// Overwrite one of the bad updates with the good update

q)workingLog[3]:goodUpdate

q)// Remove the other bad updates

q)workingLog except workingLog[4 5]

`upd `trade `sym`price`size!(`AAPL`MSFT`GOOG;1.23 2.34 3.45;1000 2000 3000)

`upd `trade `sym`price`size!(`GOOG`MSFT;3.45 2.45;2000 5000)

`upd `trade `sym`price`size!(`AAPL`AAPL`GOOG;1.23 1.24 3.45;1000 1000 4000)

`upd `trade `sym`price`size!(`GOOG`GOOG;3.4 3.3;2000 2500)

`upd `trade `sym`price`size!(`AAPL`GOOG;1.23 3.45;1400 4400)

Comments