nxCode - DNA Barcode Designer and Decoder

for Next-Gen Sequencing

Usage
Work flow with nxCode (recommended)
1. Decide on the size of the barcode set needed for your experiment, and on any special biochemical constraints the set must meet (such as GC content, restriction sites that must not appear in the barcodes, etc.).
2. Search the online sets available on this website for a barcode set suitable for your experiment. If you find one, download it. If not, use the nxCode barcode design tool to build your own barcode set according to your requirements and constraints.
3. Incorporate the barcode set (or part of it) in your experiment, and run the experiment.
4. Download the nxCode Decoder and run it as specified below on the data received from your sequencer.
nxDecoder
General description
nxCode Decoder is a tool designed to decode sequenced data that contains a given set of DNA barcodes. nxCode is currently optimized for decoding data that was obtained using Illumina's Genome Analyzer - Solexa.
Input:
FASTA/FASTQ file containing the data to be decoded (normally the output from your sequencer).
FASTA file containing the barcode set that was used in the experiment (can be either downloaded from this website or produced using the nxCode barcode design tool). Useful tip: the barcode file can also contain a description for each barcode (written in the header). This description will then appear in the output file each time a given sequence is decoded as that barcode.
Probability table in .xls format, such as the one included in the package (for further discussion, see Algorithmic details).
Examples: input file, barcode-set file, probability table.
Output:
A FASTA file (or multiple files, if the optional --split parameter is used) containing the input sequenced data, with headers noting the specimen each sequence most likely originated from, along with two numbers: Event Likelihood and Decode Quality (for more details about these numbers and the methods used for decoding, see Algorithmic details).
Examples: output file, output directory (if --split is used).
Synopsis
nxDecoder.pl --file_to_decode=s --bc_file=s --prob_table=s [--split] [--small_header] [--graph] [--start_pos_in_oligo=i] [--ligated_with_t] [--clip] [--exact_match] [--min_likelihood=f] [--min_quality=f] [--min_prob_to_consider=f] [--debug] [--man]
Mandatory input parameters
--file_to_decode=s	Path to a FASTA or FASTQ file containing the sequenced material for decoding.
--prob_table=s	Path and filename of probability table, in .xls format.
--bc_file=s	Path to a FASTA file containing the DNA barcodes used, with optional descriptions as headers in the fasta file.
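The barcode file layout described above (one FASTA record per barcode, with an optional description in the header) can be illustrated with a small reader. This is a minimal Python sketch, not part of nxCode: the function name read_barcodes and the sample records are hypothetical, and it assumes that everything after ">" on a header line is the description.

```python
def read_barcodes(lines):
    """Map each barcode sequence to the description found in its FASTA header.

    Assumption: the whole header line after '>' is treated as the description.
    """
    records = []  # [description, sequence] pairs, in file order
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            records.append([line[1:].strip(), ""])
        elif records:
            # Sequence lines may be wrapped, so append to the current record.
            records[-1][1] += line.upper()
    return {sequence: description for description, sequence in records}


# Hypothetical barcode file contents, for illustration only.
example_file = """>sample_A liver, day 3
ACGTAC
>sample_B control
TGCATG
""".splitlines()

barcodes = read_barcodes(example_file)
```

Here barcodes maps "ACGTAC" to "sample_A liver, day 3" and "TGCATG" to "sample_B control", which is the kind of description the decoder is documented to copy into its output headers.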
Optional input parameters
--start_pos_in_oligo=i	The starting position of the barcode in the oligo. Default value: 1
--ligated_with_t	Indicates that the first nucleotide immediately after the barcode in the oligo is T. Use if T nucleotide was used for sticky-end ligation of the barcodes - improves accuracy. Off by default.
--split	Generates a folder containing a FASTA file for each barcode word. Each file will contain ONLY the sequenced data associated with THAT barcode by the decoder.
--small_header	All headers written to output file will contain ONLY the description from the barcode FASTA header.
--exact_match	Decodes only perfect matches of the barcodes.
--min_likelihood=f	Only events at or above event_likelihood f will be printed to the output file. Legal values: 0 <= f <= 10 (for more details about event_likelihood, see Algorithmic details).
--min_quality=f	Only events at or above decode_quality f will be printed to the output file. Legal values: 0 <= f <= 1 (for more details about decode_quality, see Algorithmic details).
--graph	Generates a graph of the abundance of each barcode in the sequenced data.
--clip	Clips the barcode out of the sequence in the output fasta file. Sequences that were not identified (decoded to unknown) will not be clipped.
--min_prob_to_consider=f	Minimum probability of event to take into account during preprocessing. Higher values speed up preprocessing, at the expense of accuracy (for discussion about min_prob_to_consider, see Algorithmic details). Default values: 1e-09 for barcodes shorter than 10 nucleotides; 1e-10 for barcodes of exactly 10 nucleotides; 1e-11 for barcodes longer than 10 nucleotides.
--debug	Prints debug messages to screen.
--man	Prints this manual.
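To make the interplay of --exact_match, --start_pos_in_oligo and --clip concrete, here is a minimal Python sketch. It is not the decoder's implementation (the real tool uses the probability table described under Algorithmic details). The function decode_exact is hypothetical, and it assumes that the start position is 1-based, that all barcodes have the same length, and that clipping removes only the barcode bases from the read.

```python
def decode_exact(read, barcodes, start_pos_in_oligo=1, clip=False):
    """Return (label, sequence) for one read, using perfect matches only.

    barcodes maps barcode sequence -> description.
    Assumptions: 1-based start position, equal-length barcodes, and clipping
    that cuts just the barcode out of the read.
    """
    bc_len = len(next(iter(barcodes)))
    start = start_pos_in_oligo - 1          # convert to a 0-based index
    word = read[start:start + bc_len]
    label = barcodes.get(word, "unknown")
    # Reads decoded to "unknown" are left unclipped, as the manual states.
    if clip and label != "unknown":
        read = read[:start] + read[start + bc_len:]
    return label, read


# Hypothetical barcode set, for illustration only.
barcode_set = {"ACGTAC": "sample_A", "TGCATG": "sample_B"}

hit = decode_exact("ACGTACTTGGCC", barcode_set)
clipped = decode_exact("GGTGCATGAAAC", barcode_set, start_pos_in_oligo=3, clip=True)
miss = decode_exact("ACGTAATTGGCC", barcode_set, clip=True)
```

The first read matches sample_A at position 1, the second matches sample_B at position 3 and is returned without its barcode, and the third has a single mismatch, so exact matching reports it as unknown and leaves it unclipped.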
Examples
./nxDecoder.pl --file_to_decode=solexa_output.fa --prob_table=prob_table.hash --bc_file=my_barcodes.fa --clip --split
./nxDecoder.pl --file_to_decode=solexa_output.fa --prob_table=prob_table.hash --bc_file=my_barcodes.fa --start_pos_in_oligo=5 --ligated_with_t --graph --min_likelihood=5 --min_quality=0.8
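The effect of --min_likelihood, --min_quality and --split on the output can be pictured as a filter followed by a grouping step. The Python sketch below is only an illustration under stated assumptions: the function filter_and_group is hypothetical, and the records are assumed to be already-parsed (barcode, event_likelihood, decode_quality, sequence) tuples, because the exact header layout of the decoder's output is not specified here.

```python
from collections import defaultdict


def filter_and_group(records, min_likelihood=0.0, min_quality=0.0):
    """Keep events at or above both thresholds and group them by barcode.

    records: iterable of (barcode, event_likelihood, decode_quality, sequence).
    The tuple layout is an assumption made for this sketch.
    """
    groups = defaultdict(list)
    for barcode, likelihood, quality, sequence in records:
        if likelihood >= min_likelihood and quality >= min_quality:
            groups[barcode].append(sequence)
    return dict(groups)


# Hypothetical decoded events, for illustration only.
decoded = [
    ("sample_A", 7.2, 0.95, "TTGGCC"),
    ("sample_A", 4.1, 0.99, "TTGGAA"),   # below min_likelihood=5
    ("sample_B", 6.0, 0.80, "GGAAAC"),   # exactly at min_quality=0.8, kept
    ("sample_B", 9.3, 0.50, "GGTTAC"),   # below min_quality=0.8
]

kept = filter_and_group(decoded, min_likelihood=5, min_quality=0.8)
```

With the thresholds from the second example above, one sequence per barcode survives. Each key of kept corresponds to one per-barcode FASTA file that --split would produce.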
nxCodeBuilder
General description
nxCodeBuilder is a tool for designing custom error-resistant DNA barcode sets for next-gen sequencing. nxCodeBuilder is currently optimized for making barcode sets to be used in Illumina's Genome Analyzer - Solexa.
Input:
Barcode length (number of nucleotides).
Expected accuracy in decoding (for more details, see Algorithmic details).
Probability table in .xls format, which comes standard with the package (for further discussion, see Algorithmic details).
Examples: barcode length 6, 7, 8, etc.; expected accuracy 0.99, 0.995, etc.; probability table.
Output:
FASTA file containing the barcode set made with the given input parameters. This file should be used as input to the decoder when decoding sequenced data obtained using the barcode set (for more details about the methods used for making a barcode set, see Algorithmic details).
Examples: output file
Synopsis
nxCodeBuilder.pl --bc_len=i --exp_acc=f --prob_table=s [--start_pos_in_oligo=i] [--too_long_homopoly=i] [--gc_min=f] [--gc_max=f] [--tm_hist] [--seed_index=i] [--ligated_with_t] [--bc_num_limit=i] [--forbidden_seqs=s] [--min_prob_to_consider=f] [--shuffle] [--debug] [--man]
Mandatory input parameters
--bc_len=i	Number of nucleotides allotted to the barcode.
--exp_acc=f	Expected minimum probability of correct decoding. Legal values: 0 < f < 1. High values produce smaller barcode sets, and vice versa (for more details about exp_acc, see Algorithmic details).
--prob_table=s	Path and filename of probability table in .xls format (for more details about how prob_table is used in the algorithm, see Algorithmic details).
Optional input parameters
--start_pos_in_oligo=i	Position of the barcode in the oligo. Later positions are generally more error-prone, producing smaller barcode sets. Default: 1
--too_long_homopoly=i	Do not consider barcodes with homopolymers of length i or more (e.g., CGAAAATT will not be considered for i=4 or less). Default: 0
--gc_min=f	Minimum GC content of barcodes. Legal values: 0 <= f <= 1. Default: 0
--gc_max=f	Maximum GC content of barcodes. Legal values: 0 <= f <= 1. Default: 1
--tm_hist	Draw a melting temperature histogram of the produced barcode set, saved to a file called tm_hist_hour_min_sec.png. Note: requires the 'dan' program from the EMBOSS package and the GD::Graph::bars module. Off by default.
--seed_index=i	Index of the first candidate to be considered in the lexicographic search. Default: 0
--ligated_with_t	Indicates that the first nucleotide immediately after the barcode in the oligo is T. Use if T nucleotide was used for sticky-end ligation of the barcodes - improves accuracy. Off by default.
--bc_num_limit=i	Indicates a limit to the number of needed barcodes. The program will stop considering candidates upon reaching that limit (if it is reached).
--forbidden_seqs=s	Path and filename to a file containing sequences that must not appear in any barcode (e.g., restriction sites). Each forbidden sequence should appear on a line of its own.
--min_prob_to_consider=f	Minimum probability of barcode distortions that will be taken into account during code production. Lower values increase runtime and code size. (for more details about min_prob_to_consider see Algorithmic details).
--shuffle	Considers candidate barcodes in random order, not lexicographic order.
--debug	Prints debug messages during code construction.
--man	Prints this manual.
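The biochemical constraints above (--gc_min, --gc_max, --too_long_homopoly, --forbidden_seqs) and the lexicographic candidate order with --seed_index can be sketched as follows. This Python sketch covers only the candidate filtering; it does not perform the error-resistance selection driven by --exp_acc and the probability table. All function names are hypothetical, the alphabet order A, C, G, T is an assumption, and a --too_long_homopoly value of 0 is assumed to mean that the check is disabled.

```python
import itertools
import re


def gc_fraction(barcode):
    """Fraction of G and C bases in the barcode."""
    return sum(base in "GC" for base in barcode) / len(barcode)


def has_long_homopolymer(barcode, too_long):
    """True if the barcode has a run of `too_long` or more identical bases.

    Assumption: too_long <= 0 disables the check.
    """
    if too_long <= 0:
        return False
    # One base followed by at least (too_long - 1) repeats of itself.
    return re.search(r"(.)\1{%d,}" % (too_long - 1), barcode) is not None


def passes_filters(barcode, gc_min=0.0, gc_max=1.0,
                   too_long_homopoly=0, forbidden=()):
    """Apply the GC, homopolymer and forbidden-sequence constraints."""
    if not gc_min <= gc_fraction(barcode) <= gc_max:
        return False
    if has_long_homopolymer(barcode, too_long_homopoly):
        return False
    return not any(site in barcode for site in forbidden)


def candidates(bc_len, seed_index=0):
    """Yield all words of length bc_len in lexicographic order (A < C < G < T),
    starting from the candidate at index seed_index."""
    words = ("".join(p) for p in itertools.product("ACGT", repeat=bc_len))
    return itertools.islice(words, seed_index, None)


# Candidates of length 3 that satisfy a GC window and a homopolymer limit.
allowed = [w for w in candidates(3)
           if passes_filters(w, gc_min=0.3, gc_max=0.7, too_long_homopoly=2)]
```

For the manual's own example, has_long_homopolymer("CGAAAATT", 4) is true and has_long_homopolymer("CGAAAATT", 5) is false, matching the statement that CGAAAATT is excluded for i=4 or less.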
Examples
./nxCodeBuilder.pl --bc_len=6 --exp_acc=0.999 --prob_table=my_table.xls
./nxCodeBuilder.pl --bc_len=8 --exp_acc=0.995 --prob_table=my_table.xls --gc_min=0.25 --gc_max=0.65 --too_long_homopoly=4
./nxCodeBuilder.pl --bc_len=9 --exp_acc=0.999 --prob_table=my_table.xls --start_pos_in_oligo=4 --ligated_with_t --forbidden_seqs=./restriction_sites.txt
