wj

What is it?

The window join (wj) is much like the aj in that records are joined from one table to another 'as of' the time in one of the tables. However rather than matching one record from each table, the wj matches mulitple records from one of the tables using a time range rather than a specific time, and aggregates the results into one record when joining them.

For example, if you wanted to get the prevailing bid price at the time of a particular trade, you could use an aj. If you wanted to get the average bid price in the preceding 5 minutes of a particular trade, you could use a window join - this is an aggregation of the values over a particular time period.

The syntax is as follows:

wj [w; c; t; (q; (f0;c0); (f1;c1))]

where:

Returns for each record in t, a record with additional columns c0 and c1, which are the results of the aggregation functions applied to values over the matching intervals in w.

An example is:

wj[w; `sym`time; trade; (quote;(max;`ask);(min;`bid))]

This would join two columns to the trade table, ask and bid, which would represent the max ask and min bid values in each time window given in w for each trade.

Examples and Use Cases

Let's stick with the example and say we want to calculate the max ask and min bid in the 5 minutes prior to each trade.

So let's create a table of quotes that have been received throughout the day, and a table of trades that have been placed throughout the day.

n:900;

trade:`time xasc ([]time:("p"$2022.01.01) +1000*1000000*n?n;sym:raze 300#/:`MST`APPL`FB;side:n?`B`S;price:(n?1.0)+(300?1000.0+til 200),(300?100.0+til 20),(300?10.0+til 2));


n:900000;

quote:update askPrice:bidPrice*1.10 from `time xasc ([]time:("p"$2022.01.01) +1000*1000000*n?n;sym:raze 300000#/:`MST`APPL`FB;bidPrice:(n?1.0)+(300000?1000.0+til 200),(300000?100.0+til 20),(300000?10.0+til 2));

Calculating the time window

The first thing that will need to be calculated is the time intervals. The time intervals that we want for each trade are:

If the time of the trade was 2022.01.01D00:00:00.000000000 then we can get these values as follows:

Although it may seem redundant to add the last part, if we retain that then this can be expressed as:

q)2022.01.01D00:00:00.000000000 -/:(00:05:00;00:00:00)

2021.12.31D23:55:00.000000000 2022.01.01D00:00:00.000000000

Or alternatively:

q)2022.01.01D00:00:00.000000000 +/:(-00:05:00;00:00:00)

2021.12.31D23:55:00.000000000 2022.01.01D00:00:00.000000000

Now the single timestamp can be replaced by the list of all timestamps in the trade table:

q)w: trade.time +/:(-00:05:00;00:00:00)

q)w

2021.12.31D23:55:01.000000000 2021.12.31D23:55:01.000000000 2021.12.31D23:55:..

2022.01.01D00:00:01.000000000 2022.01.01D00:00:01.000000000 2022.01.01D00:00:..

Although this is usually expressed in the reverse:

q)(-00:05:00;00:00:00) +\: trade.time

For each record in the trade table, this gives us a pair of timestamps, start and end, (T-5m) to (T). These are the time windows we will use in the window join.

Using the aggregation functions

The aggregation function will be given the matching values in the selected column in list format. So the function you use must be able to operate on that list. 

Instead of providing a function, you can provide the identity function (::) which will instead return all of the values that would be passed into the function.

For example:

q)wj[w;`sym`time;trade; (quote;((::);`bidPrice))]

time                          sym side price    bidPrice

-----------------------------------------------------------------------------------

2022.01.01D00:00:01.000000000 MST S    1059.794 ,1103.515

2022.01.01D00:00:01.000000000 MST B    1012.967 ,1103.515

2022.01.01D00:00:02.000000000 MST S    1087.673 1103.515 11.75801 1172.165

2022.01.01D00:00:04.000000000 MST S    1143.472 1103.515 11.75801 1172.165 1143.734

This gives us a better picture of the values that will be passed into the function. As this is just a list, we can expect to use the 'max' and 'min' functions without issue:

q)wj[w;`sym`time;trade; (quote;(min;`bidPrice);(max;`askPrice))]

time                          sym side price    bidPrice askPrice

-----------------------------------------------------------------

2022.01.01D00:00:01.000000000 MST S    1059.794 1103.515 1213.867

2022.01.01D00:00:01.000000000 MST B    1012.967 1103.515 1213.867

2022.01.01D00:00:02.000000000 MST S    1087.673 11.75801 1289.381

2022.01.01D00:00:04.000000000 MST S    1143.472 11.75801 1289.381

Note that the column names returned are the same as the column names provided as inputs to the functions.

Let's take a look at the quote table to confirm that the values are correct:

q)5#quote

time                          sym  bidPrice askPrice side

---------------------------------------------------------

2022.01.01D00:00:00.000000000 MST  1103.515 1213.867 S

2022.01.01D00:00:01.000000000 FB   11.75801 12.93381 S

2022.01.01D00:00:02.000000000 MST  1172.165 1289.381 B

2022.01.01D00:00:04.000000000 MST  1143.734 1258.107 B

Yes - even at 00:00:04 the wj used the maximum ask price from the prior 5 minutes.

Performance

Window join is slow if the correct steps have not been taken to sort table q i.e. the quote table. The quote table should be sorted on `sym`time and have the parted attribute applied to the sym column. See the difference below:

q)\ts wj[w;`sym`time;trade; (quote;(max;`bidPrice))]

6524 5260512

q)\ts wj[w;`sym`time;trade; (`sym`time xasc quote;(max;`bidPrice))]

51 41944496

q)\ts wj[w;`sym`time;trade; (update `p#sym from `sym`time xasc quote;(max;`bidPrice))]

43 41944880

Alternative options

wj1 works in the same way as wj but with one difference: with wj, the last quote before the start of the window (the 'prevailing quote on entry to the window') is included, whereas in wj1 it is not included. So wj1 strictly only performs the window join with quotes inside the window. 

q)quote

time                          sym bidPrice askPrice

---------------------------------------------------

2022.01.01D00:09:59.999000000 MST 1198.12  1290.187

2022.01.01D00:10:00.000000000 MST 1197.273 1317

2022.01.01D00:10:10.000000000 MST 1113.781 1225.159


q)trade

time                          sym side price

-----------------------------------------------

2022.01.01D00:10:00.000000000 MST B    1153.621

2022.01.01D00:10:01.000000000 MST S    1076.986

2022.01.01D00:10:11.000000000 MST B    1157.908


q)w: trade.time +/:(-00:00:10;00:00:00)


q)wj[w;`sym`time;trade; (quote;(max;`bidPrice))]

time                          sym side price    bidPrice

--------------------------------------------------------

2022.01.01D00:10:00.000000000 MST B    1153.621 1198.12

2022.01.01D00:10:01.000000000 MST S    1076.986 1198.12

2022.01.01D00:10:11.000000000 MST B    1157.908 1197.273


q)wj1[w;`sym`time;trade; (quote;(max;`bidPrice))]

time                          sym side price    bidPrice

--------------------------------------------------------

2022.01.01D00:10:00.000000000 MST B    1153.621 1198.12

2022.01.01D00:10:01.000000000 MST S    1076.986 1198.12

2022.01.01D00:10:11.000000000 MST B    1157.908 1113.781

Performance considerations

q)\ts res1:wj[w;`sym`time;trade; (quote;(max;`bidPrice))]

4373 5260864


q)quote:update `p#sym from `sym`time xasc quote


q)\ts res2:wj[w;`sym`time;trade; (quote;(max;`bidPrice))]

2 75744


q)res1~res2

0b