Terminology, Syntax and Notation for Iterators
Why?
You can do many things in kdb without learning all of the terminology, syntax and various forms of notation. However, when it comes to iterators, you can quickly get out of your depth without a broad understanding of these areas.
Note: As always, the best and most thorough information on this topic can be found on the code.kx website, including the Q for Mortals section. The following is an attempt to summarise the most useful parts when it comes to iterators; it is mostly not in the author's own words but has been copied directly. Links to the original text have been supplied wherever possible.
Iterator Glossary
Functions: A mapping from input/s to result defined by an algorithm. Operators, keywords, compositions, projections and lambdas are all functions.
Values: Everything in q is a value, and almost all values can be applied:
A function (operator, keyword, or lambda) can be applied to its argument/s to get a result.
Application: To apply a value means:
to evaluate a function on its arguments
The syntax provides several ways to apply a value.
Applicable values: An object that can be applied to its argument/s or index/es. Functions of all kinds are all applicable values.
An applicable value is a mapping. A function maps its domains to its range.
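As a small illustration of 'argument/s or index/es', the sketch below applies a list, a dictionary, a matrix and a lambda using the same bracket and prefix forms. The names v, d and m are illustrative only and do not come from the original text.

```q
q)v:10 20 30              / a list maps indexes to items
q)v[1]
20
q)v 1
20
q)d:`a`b!1 2              / a dictionary maps keys to values
q)d[`b]
2
q)d `b
2
q)m:(1 2 3;4 5 6)         / a matrix takes two indexes: row, then column
q)m[1;2]
6
q){x*2}[3]                / a lambda maps an argument to a result
6
```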
Valence: The number of inputs or arguments to a function
Unary/Binary/Ternary: A function of rank 1, 2 or 3 respectively, i.e. a function that takes that many arguments, or a list of at least that depth. (The terms monadic, dyadic, and triadic are now deprecated.)
Variadic: A function of variable rank, for example the operator @.
Operator: A primitive binary function that may be applied infix as well as prefix, e.g. +, &.
Argument: The input(s) to a function.
In the expression 10%4 the operator % is evaluated on the arguments 10 and 4. 10 is the left argument and 4 is the right argument.
By extension, the first and second arguments of a binary function are called its left argument and right argument regardless of whether it is applied infix. In the expression %[10;4] 10 and 4 are still referred to as the left and right arguments.
By extension, where a function has rank >2, its left argument is its first argument, and its right arguments are the remaining arguments.
Atomic function: An atomic function recurses into its argument's structure until it reaches atoms, and applies there. It does this without explicit loops or other control flow constructs. For example, neg[1 2 3 4] will neg each item in the list; there is no need to use the Each iterator here, i.e. neg'[1 2 3 4]. A function of any rank can be an atomic function. Examples:
Unary: One function applied to either an atom or a list (and therefore each item of the list)
Example: neg as shown above
Binary: One function applied to two inputs.
As each input can be an atom or a list, this results in the following possible combinations:
atom with atom
atom with list
list with atom
list with list
In the two mixed cases (atom with list, list with atom), the atom is paired with each item of the list, so the function behaves like a unary atomic function of the list argument with the atom held fixed.
In the last case, the two list operands must have the same length.
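The combinations above can be sketched with the atomic operator + and the atomic keyword neg. This is a minimal illustration with arbitrary values; the error is shown in the short form used elsewhere in this text.

```q
q)1+2                     / atom with atom
3
q)1+10 20 30              / atom with list: the atom pairs with each item
11 21 31
q)10 20 30+1              / list with atom
11 21 31
q)1 2 3+10 20 30          / list with list: items pair up by position
11 22 33
q)1 2 3+10 20             / list with list of different lengths
'length
q)neg (1 2;3 4 5)         / unary atomic: recurses into nested lists
-1 -2
-3 -4 -5
```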
Higher order function: A function that operates on other functions is called a higher-order function. Higher-order functions are a powerful concept from functional programming.
Derived function: An iterator applied to an applicable value derives a function, e.g. +/
Application Syntax (for functions)
As defined above, in relation to a function, 'application' is evaluating a function on its arguments.
This is something that is quite intuitive even to those with no prior background in kdb/q e.g.:
q){1+x} 2
3
Here we have applied the function {1+x} on its argument 2.
But there are multiple ways to perform application:
Bracket application
The basic form of applying a function to its arguments. All functions can be applied with bracket notation.
Examples:
q){x+1}[1]
2
q)count[1 2 3]
3
Bracket notation can be used on any function, even those that would normally be considered as 'infix' functions (see 'Infix syntax' below), e.g.:
q)+[1;2] ~ 1+2
1b
This kind of notation is used quite extensively in kdb/q code.
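Bracket notation also covers functions of other ranks, with arguments separated by semicolons. A short sketch using arbitrary lambdas and the keyword mod:

```q
q){x+y+z}[1;2;3]          / ternary lambda
6
q)mod[10;3]               / binary keyword, same as 10 mod 3
1
q){42}[]                  / a lambda that ignores its argument, applied with empty brackets
42
```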
Prefix syntax
Unary keywords and lambdas can also be applied prefix. This means that the function is written to the left of the arguments. For example:
q){x+1}1
2
q)count 1 2 3
3
Apply/Apply At Application
As it turns out, both the bracket and prefix syntax do the same thing 'under the hood' - they use general application which can be used directly using the Apply ('.') or Apply At ('@') higher-order functions.
Apply At
Keeping the same examples:
q){x+1} @ 1
2
q)count @ 1 2 3
3
These both use the Apply At ('@') operator. The higher-order function @ is the true form of basic application in q. It applies a unary mapping to an argument and, like other operators, can be written infix or prefix. As described above in the 'Atomic function' definition, when an atomic unary function is applied to a list there is no need to use Each, as the function is applied to each item in order automatically.
For lists, dictionaries and atomic functions, @ application yields an output that conforms to the shape of the input, e.g.:
q)neg @ (1 2 3; 4 5 6; 7 8 9)
-1 -2 -3
-4 -5 -6
-7 -8 -9
An aggregate function collapses the top level of nesting from the input e.g.:
q)sum @ (1 2 3; 4 5 6; 7 8 9)
12 15 18
A uniform function returns a list of the same length as its input list:
q)deltas @ 1 2 3
1 1 1
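To compare the three kinds of function side by side, the sketch below applies an atomic (neg), a uniform (sums) and an aggregate (max) keyword to the same arbitrary list with Apply At:

```q
q)neg @ 3 1 2             / atomic: result has the same shape as the input
-3 -1 -2
q)sums @ 3 1 2            / uniform: same length, each item depends on earlier items
3 4 6
q)max @ 3 1 2             / aggregate: the list collapses to an atom
3
```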
As can be seen, prefix syntax is equivalent to simply omitting the Apply At operator, so in these cases the @ is mostly redundant.
Apply
Apply At only works on unary functions. Evaluating a function with multiple arguments is an instance of applying a multi-variate mapping. The higher-order function '.' is the true form of multi-variate application in q. It applies a multi-variate mapping to multiple arguments and can be written infix or prefix.
The right argument of '.' must be a list.
An example:
q){x+y} . 1 2
3
Side note: the following will throw an error:
q)+ . 1 2
In this case the operator must be wrapped in brackets i.e. (+). This is explained under 'Variadic Operators' here.
If the function being applied has rank 2, then the argument list must have 2 items and the function will be applied to the first argument and the second argument of that list.
This means that you can apply the function to lists of lists, e.g.:
q){x+y} . (1 1; 2 3)
3 4
This logic is extended to functions with rank 3, 4, 5 etc.
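A brief sketch of the rank-3 case, together with the parenthesised operator form mentioned in the side note above; the lambda is arbitrary:

```q
q){x+y+z} . 1 2 3         / three arguments taken from a three-item list
6
q){x+y+z} . (1 2;3 4;5 6) / each argument may itself be a list
9 12
q)(+) . 1 2               / an operator must be parenthesised
3
```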
Side note: the @ operator is just a special case of the '.' apply function with the argument enlisted:
q)count @ (1;2;3)
3
q)count . enlist (1;2;3)
3
Infix syntax
Operators, and some binary keywords and derived functions can be applied infix. Infix describes the method of applying a function to its arguments in which the function is in the middle of the arguments, e.g.:
q)1+2
3
q)10 mod 3
1
Your own functions cannot be applied infix directly, but a function derived from them with an iterator can:
q)1{x+y}2
'type
q)1{x+y}/2
3
Postfix syntax
Postfix describes the method of applying a function to its arguments in which the function is to the right of the arguments. Iterators are the only functions that can be applied postfix and for an iterator, the 'argument' is the function it is being applied to. For example, the '/' iterator can be applied to the '+' operator to create the derived function '+/'.
As a rule, when a derived function is made in this way, it must be used with the infix notation or it will throw an error, i.e.:
q)+\ 1 2 3
'
[0] +\ 1 2 3
^
q)1 +\ 1 2 3
2 4 7
You will see this referred to as 'postfix yields infix'.
However not all derived functions follow this rule, and in fact the syntax and valence of derived functions deserves its own explanation.
Derived Functions
A derived function is the result of an iterator applied to an applicable value such as a function or operator.
The syntax of a derived function is determined by the application that produced it.
Using the above example, the following is a derived function that requires infix syntax which forces a binary usage:
q)1 +\ 1 2 3
2 4 7
However the function can also be applied in a unary way. You are not always constrained by one form of syntax.
The function can be applied unary (or binary) using the bracket notation described earlier:
q)+\[1 2 3]
1 3 6
q)+\[1; 1 2 3]
2 4 7
It can also be isolated with parentheses. '+\ 1 2 3' fails because a function derived by postfix application has infix syntax, so q expects a left argument. Parentheses can be used so that the two symbols are treated together as a single value, which can then be applied prefix or with brackets:
q)(+\) 1 2 3
1 3 6
q)(+\)[1; 1 2 3]
2 4 7
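The same forms work for other derived functions. A sketch using +/ (the Over iterator applied to +), with arbitrary values:

```q
q)(+/) 1 2 3              / parenthesised, applied prefix as a unary
6
q)+/[1 2 3]               / bracket notation, unary
6
q)10 +/ 1 2 3             / infix, binary: 10 is the initial value
16
q)+/[10;1 2 3]            / bracket notation, binary
16
```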
Compositions
A composition is simply the combination of a unary function with one or more other functions. This can be achieved in multiple ways:
The compose operator '
Combines two functions f and ff where f is a unary function and where ff can have any rank e.g.:
q)f:{2*x}
q)ff:{x+y+z}
q)'[f;ff][1;2;3]
12
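As a check on what the composition does, the sketch below compares it with applying the two functions one after the other. The binary function g is a hypothetical addition for illustration.

```q
q)f:{2*x}
q)ff:{x+y+z}
q)'[f;ff][1;2;3] ~ f ff[1;2;3]   / the composition matches nested application
1b
q)g:{x*y}                        / hypothetical binary function
q)'[f;g][3;4]                    / f applied to g[3;4]
24
```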
Using Apply/Apply At
@ and . can be used to create compositions of functions.
A 'train' of unary functions can be composed using the Apply At (@) symbol, e.g.:
q)c: neg sum reverse@
q)c 1 2 3
-6
Similarly with Apply (.) and a binary function:
q)a: {x%y}
q)c: neg a .
q)c 1 2
-0.5
Note (see the Apply section above) that the argument must be a single list:
q)c[1;2]
'rank
q)c[(1;2)]
-0.5
Using ::
It is possible to use (::) to allow the composed function to take any number of inputs:
q)a: {x+y+z}
q)c: neg a ::
q)c[1;2;3]
-6
This is described here and discussed here. As you will see from the discussion, it is not recommended.