
9. Limits, a few minor ones

The following limits apply.

There must not be any ASCII TAB characters in the data. This is the primary limit, as the ASCII TAB character is the delimiter in tables.

The following names are reserved by the awk language and should not be used as column names:

BEGIN, END, break, continue, else, exit, exp, for, getline, if, in, index, int, length, log, next, print, printf, split, sprintf, sqrt, substr, while, and possibly others, depending on your awk implementation (e.g. mawk, gawk). Refer to the man page and the documentation of your awk interpreter.
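As a rough illustration, a table header could be checked against the names listed above before the table is used. This is only a sketch: the function name reserved_column_names is hypothetical, it assumes the header is a single line of TAB-separated column names, and the word list covers only the names given here, not whatever else your awk may reserve.

```python
# Names taken from the list above; a given awk may reserve more.
AWK_RESERVED = frozenset(
    """BEGIN END break continue else exit exp for getline if in index int
    length log next print printf split sprintf sqrt substr while""".split()
)


def reserved_column_names(header_line):
    """Return the column names in a TAB-separated header that clash with awk.

    Assumption: the header is one line of column names separated by TABs.
    """
    names = header_line.rstrip("\n").split("\t")
    return [name for name in names if name in AWK_RESERVED]
```

An empty result means none of the listed names is used; any name returned should be renamed before the table is processed.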

Horizontal TABs and newlines, although forbidden as such in table data, can be conveniently represented by means of the ASCII strings '\t' and '\n' respectively. This rule applies to tables only. Files in 'list' format can contain any characters literally, including physical TABs and newlines.
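The substitution described above can be sketched as a pair of helper functions. The names encode_field and decode_field are hypothetical, and the sketch assumes the data never contains a literal backslash followed by 't' or 'n', since such a sequence could not be told apart from an encoded TAB or newline.

```python
def encode_field(value):
    """Replace physical TABs and newlines with the strings '\\t' and '\\n'."""
    return value.replace("\t", "\\t").replace("\n", "\\n")


def decode_field(value):
    """Turn the strings '\\t' and '\\n' back into physical TABs and newlines.

    Assumption: the original data held no literal backslash-t or
    backslash-n sequences, otherwise decoding is ambiguous.
    """
    return value.replace("\\t", "\t").replace("\\n", "\n")
```

A value would be encoded before it is stored in a table and decoded when it is read back, for example when converting to or from 'list' format.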

The number of columns in a table may be limited to 32,768 by some AWK implementations (I think mawk is one of those). This should not be a problem in practice, as it is a very high number. In spite of this limit, mawk is very fast and I recommend it over other AWK implementations.
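If a table is generated by a program, the column limit mentioned above can be checked up front. The sketch below is illustrative only: column_count_ok is a hypothetical name, the header is assumed to be one line of TAB-separated column names, and the default limit is the figure given in this section, which may not match your awk.

```python
MAX_COLUMNS = 32768  # figure given in this section for some awk implementations


def column_count_ok(header_line, limit=MAX_COLUMNS):
    """Return True if the TAB-separated header has at most `limit` columns."""
    return len(header_line.rstrip("\n").split("\t")) <= limit
```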

A more serious drawback of the operator-stream paradigm is that it is process-based. This means that an average pipeline will run several processes at once, one or more for each operator. Complex queries can therefore exceed the maximum number of child processes allowed by your operating system. This limit is O.S. specific, and it can usually be overcome by asking the system administrator to raise it as needed.
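On systems that expose a per-user process limit, the value can be read before a large pipeline is launched. The following is a minimal sketch, assuming a Unix-like system where Python's resource module provides RLIMIT_NPROC (Linux and the BSDs do); the names max_user_processes and pipeline_fits are hypothetical, and the number of processes already running has to be supplied by the caller.

```python
import resource


def max_user_processes():
    """Return the soft per-user process limit, or None if it is unlimited."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NPROC)
    return None if soft == resource.RLIM_INFINITY else soft


def pipeline_fits(stage_count, already_running=0):
    """Return True if `stage_count` more processes stay within the limit.

    Assumption: each pipeline stage is counted as one process; operators
    that spawn more than one should be counted accordingly.
    """
    limit = max_user_processes()
    if limit is None:
        return True
    return stage_count + already_running <= limit
```

The soft limit reported here is typically the same value that the shell command "ulimit -u" prints.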

