Here you'll find a short description and examples of how to use the FASTX-toolkit from the command line.
$ crosstab -h
Cross-Tabulator version 0.0.2
Copyright 2009 (C) by A. Gordon (gordon at cshl.edu)
usage: crosstab [-hl] [-r EXPR] [-c EXPR] [-d EXPR] [-o OUTPUT] [INPUT]
-h = This helpful help screen.
-l = skip first line in input file.
-r EXPR = ROW expression (see expression details below).
-c EXPR = COLUMN expression.
-d EXPR = DATA expression.
-o OUTPUT = Output file name (optional). Default is STDOUT.
INPUT = Input file name (must be last argument).
Default is STDIN.
ROW and COLUMN Expressions:
The Row/Column/Data expressions control what information is aggregated and
displayed in the cross-tabulation output matrix.
The simplest expression is a specific column: dollar-sign followed by a
number (e.g. '$5' = fifth column, just like AWK). Be sure to single-quote
the EXPR argument, to prevent shell processing of the dollar sign.
EXPR can be any valid arithmetic expression, with the following operators:
+ - * / % ^ ( ) pow round ceil floor
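The single-quoting advice above can be checked without running crosstab at all. The following is a minimal sketch of plain shell behaviour; demo() is a hypothetical helper whose fifth argument stands in for the shell's own $5.

```shell
# Illustration of shell quoting only; crosstab itself is not invoked.
# demo() is a hypothetical helper, not part of the FASTX-toolkit.
demo() {
  quoted='$5'      # single quotes: the literal text $5 survives
  unquoted="$5"    # double quotes: the shell substitutes its fifth argument
}
demo a b c d e
printf '%s\n' "$quoted"     # prints: $5
printf '%s\n' "$unquoted"   # prints: e
```

Only the single-quoted form hands the text $5 to the program unchanged, which is what an EXPR argument needs.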
Data Expression:
The DATA expression must include an aggregation function, one of:
sum, count, average, min, max
in addition to a valid expression (just like the ROW/COLUMN expressions).
Examples:
Cross-tabulate column 4 (as rows) vs. column 7 (as columns),
and sum the values from column 9:
$ crosstab -r '$4' -c '$7' -d 'sum($9)' INPUT > OUTPUT
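To see what kind of aggregation this example asks for, here is a rough awk sketch of the same grouping, assuming tab-separated input. The sample rows are made up for illustration, and the output is a plain "row, column, sum" list rather than crosstab's matrix layout, which is not reproduced here.

```shell
# Hypothetical 9-column, tab-separated input; only fields 4, 7 and 9 matter.
sample=$(printf 'x\tx\tx\tgeneA\tx\tx\tliver\tx\t3\nx\tx\tx\tgeneA\tx\tx\tliver\tx\t4\nx\tx\tx\tgeneB\tx\tx\tbrain\tx\t5\n')

# Sum field 9 for every (field 4, field 7) pair, then sort for stable output.
sums=$(printf '%s\n' "$sample" |
  awk -F'\t' '{ total[$4 "\t" $7] += $9 }
       END { for (k in total) print k "\t" total[k] }' | sort)

printf '%s\n' "$sums"
```

Each output line holds one row value, one column value and the summed data value for that pair.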
Cross-tabulate column 5 (grouped into 'windows' of 1000) vs. column 7,
and show the average value from column 9:
$ crosstab -r 'floor($5/1000)*1000' -c '$7' -d 'average($9)' INPUT > OUTPUT
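The windowing arithmetic in the ROW expression above can be tried on its own. This sketch uses awk, where int() truncates toward zero and so matches floor() only for non-negative values; the input numbers are made up for illustration.

```shell
# Hypothetical column-5 values; each is mapped to the start of its 1000-wide window.
windows=$(printf '%s\n' 250 999 1000 1999 2500 |
  awk '{ printf "%s%s", sep, int($1 / 1000) * 1000; sep = " " }
       END { print "" }')

printf '%s\n' "$windows"    # prints: 0 0 1000 1000 2000
```

Values from 0 to 999 land in window 0, values from 1000 to 1999 in window 1000, and so on, so all rows in the same window share one row label.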