Cross-Tabulation tool

for Galaxy

Here you'll find a short description and examples of how to use the Cross-Tabulation tool (crosstab) from the command line.

Command Line Arguments

	$ crosstab -h
	Cross-Tabulator version 0.0.2
	Copyright 2009 (C) by A. Gordon (gordon at cshl.edu)

	usage: crosstab [-hl] [-r EXPR] [-c EXPR] [-d EXPR] [-o OUTPUT] [INPUT]

	  -h      = This helpful help screen.
	  -l      = skip first line in input file.
	  -r EXPR = ROW expression (see expression details below).
	  -c EXPR = COLUMN expression.
	  -d EXPR = DATA expression.
	  -o OUT  = Output file name (optional). default is STDOUT.
	  INPUT   = Input file name (must be last argument).
		    Default is STDIN.

	ROW and COLUMN Expressions:
	 The Row/Column/Data expressions control what information is aggregated and
	 displayed in the cross-tabulation output matrix.

	 The simplest expression is a specific column: a dollar sign followed by a
	 number (e.g. '$5' = fifth column, just like AWK). Be sure to single-quote
	 the EXPR argument, to prevent shell processing of the dollar sign.

	 EXPR can be any valid arithmetic expression, with the following operators:
	    + - * / % ^ ( ) pow round ceil floor

	Data Expression:
	 The DATA expression must include an aggregation function, one of:
	    sum, count, average, min, max
	 in addition to a valid expression (just like the ROW/COLUMN expressions).

	Examples:

	 Cross-tabulate column 4 (as rows) vs. column 7 (as columns), 
	 and sum the values from column 9:
	    $ crosstab -r '$4' -c '$7' -d 'sum($9)' INPUT > OUTPUT

	 Cross-tabulate column 5 (grouped into 'windows' of 1000) vs. column 7,
	 and show the average value from column 9:
	    $ crosstab -r 'floor($5/1000)*1000' -c '$7' -d 'average($9)' INPUT > OUTPUT
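
To make the two examples above concrete, here is a minimal Python sketch of the grouping and aggregation they describe. It is not the crosstab implementation. The sample data, the helper names (col, crosstab, AGGREGATES) and the choice to split fields on whitespace are all assumptions made for illustration. The sketch returns a plain mapping of (row, column) to value and does not try to reproduce the tool's output layout or its handling of empty cells.

```python
import math
from collections import defaultdict

# Hypothetical sample input: whitespace-separated, nine columns per line.
SAMPLE = """\
a b c chr1 100 f plus h 10
a b c chr1 1500 f plus h 20
a b c chr1 1900 f minus h 30
a b c chr2 200 f plus h 5
a b c chr1 300 f plus h 3
"""

# The aggregation function names listed in the help text above.
AGGREGATES = {
    "sum": sum,
    "count": len,
    "average": lambda values: sum(values) / len(values),
    "min": min,
    "max": max,
}


def col(fields, n):
    """Return the n-th field, 1-based, mirroring the '$n' notation."""
    return fields[n - 1]


def crosstab(lines, row_expr, col_expr, data_expr, aggregate):
    """Group lines by (row, column) key and aggregate the data values per cell."""
    cells = defaultdict(list)
    for line in lines:
        fields = line.split()  # assumption: fields are whitespace-separated
        if not fields:
            continue  # skip blank lines
        cells[(row_expr(fields), col_expr(fields))].append(data_expr(fields))
    reducer = AGGREGATES[aggregate]
    return {key: reducer(values) for key, values in cells.items()}


lines = SAMPLE.splitlines()

# Analogue of: crosstab -r '$4' -c '$7' -d 'sum($9)'
sums = crosstab(
    lines,
    row_expr=lambda f: col(f, 4),
    col_expr=lambda f: col(f, 7),
    data_expr=lambda f: float(col(f, 9)),
    aggregate="sum",
)

# Analogue of: crosstab -r 'floor($5/1000)*1000' -c '$7' -d 'average($9)'
windowed = crosstab(
    lines,
    row_expr=lambda f: math.floor(float(col(f, 5)) / 1000) * 1000,
    col_expr=lambda f: col(f, 7),
    data_expr=lambda f: float(col(f, 9)),
    aggregate="average",
)

for (row, column), value in sorted(sums.items()):
    print(row, column, value)
```

In the sample data, the column 5 values 100, 200 and 300 fall into the window starting at 0, while 1500 and 1900 fall into the window starting at 1000, which is what the floor($5/1000)*1000 row expression is meant to achieve.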