nxCode - DNA Barcode Designer and Decoder for Next-Gen Sequencing |
Usage | |
nxDecoder |
|
General description | |
nxCode Decoder is a tool designed to decode sequenced data that contains a given set of DNA barcodes. nxCode is currently optimized for decoding data obtained using Illumina's Genome Analyzer (Solexa).
Input: a FASTA or FASTQ file of sequenced data, a FASTA file of the barcodes used, and a probability table (see Mandatory input parameters).
Output: A FASTA file (or multiple files, if the optional --split parameter is used) containing the input sequenced data, with headers noting the specimen each sequence most likely originated from, along with two numbers: Event Likelihood and Decode Quality (for more details about these numbers and the methods used for decoding, see Algorithmic details).
Examples: see Examples below.
|
|
Synopsis | |
nxDecoder.pl --file_to_decode=s --bc_file=s --prob_table=s [--split] [--small_header] [--graph] [--start_pos_in_oligo=i] [--ligated_with_t] [--clip] [--exact_match] [--min_likelihood=f] [--min_quality=f] [--min_prob_to_consider=f] [--debug] [--man] |
|
Mandatory input parameters | |
--file_to_decode=s | Path to a FASTA or FASTQ file containing the sequenced material for decoding. |
--prob_table=s | Path and filename of probability table, in .xls format. |
--bc_file=s | Path to a FASTA file containing the DNA barcodes used, with optional descriptions as headers in the fasta file. |
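The barcode file described above is an ordinary FASTA file, so its contents can be read with a few lines of code. The following is a minimal Python sketch, not nxDecoder's own parser: the function name `read_barcodes` and the specimen names in the sample data are hypothetical, and upper-casing the sequences is an assumption made here for convenience.

```python
import io


def read_barcodes(handle):
    """Return a list of (description, sequence) pairs from a FASTA handle.

    Hypothetical helper for illustration; not part of nxCode.
    """
    barcodes = []
    description, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if description is not None:
                barcodes.append((description, "".join(chunks).upper()))
            description, chunks = line[1:].strip(), []
        elif description is not None:
            chunks.append(line)  # sequence may span several lines
    if description is not None:
        barcodes.append((description, "".join(chunks).upper()))
    return barcodes


# Made-up sample data standing in for a barcode file such as my_barcodes.fa.
sample = io.StringIO(">specimen_A\nACGTAC\n>specimen_B\nTTGCAA\n")
for description, sequence in read_barcodes(sample):
    print(description, sequence)
```

The header text after ">" is kept as the description, matching the "optional descriptions as headers" wording above; an empty header simply yields an empty description.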
Optional input parameters | |
--start_pos_in_oligo=i | The starting position of the barcode in the oligo. Default value: 1 |
--ligated_with_t | Indicates that the first nucleotide immediately after the barcode in the oligo is T. Use if T nucleotide was used for sticky-end ligation of the barcodes - improves accuracy. Off by default. |
--split | Generates a folder containing a FASTA file for each barcode word. Each file will contain ONLY the sequenced data associated with THAT barcode by the decoder. |
--small_header | All headers written to output file will contain ONLY the description from the barcode FASTA header. |
--exact_match | Decodes only perfect matches of the barcodes. |
--min_likelihood=f | Only events at or above event_likelihood f will be printed to the output file. Legal values: 0 <= f <= 10. (For more details about event_likelihood see Algorithmic details.) |
--min_quality=f | Only events at or above decode_quality f will be printed to the output file. Legal values: 0 <= f <= 1. (For more details about decode_quality see Algorithmic details.) |
--graph | Generates a graph of the abundance of each barcode in the sequenced data. |
--clip | Clips the barcode out of the sequence in the output FASTA file. Sequences that were not identified (decoded to *unknown*) will not be clipped. |
--min_prob_to_consider=f | Minimum probability of an event to take into account during preprocessing. Higher values speed up preprocessing at the expense of accuracy (for discussion of min_prob_to_consider see Algorithmic details). Default values: 1e-09 for barcodes shorter than 10 nucleotides, 1e-10 for barcodes of 10 nucleotides, 1e-11 for barcodes longer than 10 nucleotides. |
--debug | Prints debug messages to the screen. |
--man | Prints this manual. |
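To make the interaction of --exact_match, --start_pos_in_oligo and --clip concrete, here is a minimal Python sketch of perfect-match decoding. It is an illustration only, not nxDecoder's implementation. The function name `decode_exact` is hypothetical, and the sketch assumes that the start position is 1-based, that all barcodes have the same length, and that clipping removes only the barcode bases and keeps everything around them.

```python
UNKNOWN = "*unknown*"


def decode_exact(read, barcodes, start_pos=1, clip=False):
    """Return (label, sequence) for one read using perfect matches only.

    barcodes maps barcode sequence -> specimen label.
    Assumptions (not confirmed by the manual): start_pos is 1-based,
    all barcodes share one length, clipping removes only the barcode.
    """
    bc_len = len(next(iter(barcodes)))
    begin = start_pos - 1  # convert the 1-based position to a Python index
    label = barcodes.get(read[begin:begin + bc_len])
    if label is None:
        # Unidentified reads are reported as *unknown* and never clipped.
        return UNKNOWN, read
    if clip:
        read = read[:begin] + read[begin + bc_len:]
    return label, read


# Made-up barcodes and reads for illustration.
barcodes = {"ACGTAC": "specimen_A", "TTGCAA": "specimen_B"}
print(decode_exact("ACGTACGGGTTT", barcodes, clip=True))
print(decode_exact("NNTTGCAAGG", barcodes, start_pos=3, clip=True))
print(decode_exact("ACGTAAGGG", barcodes, clip=True))
```

A dictionary lookup is enough here because only perfect matches count; the default, error-tolerant decoding relies on the probability table and is described in Algorithmic details.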
Examples | |
./nxDecoder.pl --file_to_decode=solexa_output.fa --prob_table=prob_table.hash --bc_file=my_barcodes.fa --clip --split
./nxDecoder.pl --file_to_decode=solexa_output.fa --prob_table=prob_table.hash --bc_file=my_barcodes.fa --start_pos_in_oligo=5 --ligated_with_t --graph --min_likelihood=5 --min_quality=0.8 |
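The second example combines --min_likelihood=5 with --min_quality=0.8. The "at or above" rule for both thresholds can be sketched in Python as follows. The function name `passes_thresholds` is hypothetical, and requiring both conditions to hold at once is an assumption about how the two options combine; the legal ranges are the ones listed under Optional input parameters.

```python
def passes_thresholds(event_likelihood, decode_quality,
                      min_likelihood=0.0, min_quality=0.0):
    """Return True if an event would be kept under both thresholds.

    Hypothetical helper; assumes both conditions must hold together.
    """
    if not 0 <= min_likelihood <= 10:
        raise ValueError("min_likelihood must satisfy 0 <= f <= 10")
    if not 0 <= min_quality <= 1:
        raise ValueError("min_quality must satisfy 0 <= f <= 1")
    # "At or above": an event exactly on a threshold is kept.
    return (event_likelihood >= min_likelihood
            and decode_quality >= min_quality)


# Made-up (event_likelihood, decode_quality) pairs for illustration.
for likelihood, quality in [(5.0, 0.8), (4.9, 0.95), (7.2, 0.79)]:
    print(likelihood, quality, passes_thresholds(likelihood, quality, 5, 0.8))
```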
|
nxCodeBuilder |
|
General description | |
nxCodeBuilder is a tool for designing custom error-resistant DNA barcode sets for next-gen sequencing. nxCodeBuilder is currently optimized for making barcode sets to be used with Illumina's Genome Analyzer (Solexa).
Input: the barcode length, the expected decoding accuracy, and a probability table (see Mandatory input parameters).
Output: A FASTA file containing the barcode set made with the given input parameters. This file should be used as input to the decoder when decoding sequenced data obtained using the barcode set (for more details about the methods used for making a barcode set, see Algorithmic details).
Examples: see Examples below. | |
Synopsis | |
nxCodeBuilder.pl --bc_len=i --exp_acc=f --prob_table=s [--start_pos_in_oligo=i] [--too_long_homopoly=i] [--gc_min=f] [--gc_max=f] [--tm_hist] [--seed_index=i] [--ligated_with_t] [--bc_num_limit=i] [--forbidden_seqs=s] [--min_prob_to_consider=f] [--shuffle] [--debug] [--man] |
|
Mandatory input parameters | |
--bc_len=i | Number of nucleotides allotted to the barcode. |
--exp_acc=f | Expected minimum probability of correct decoding. Legal values: 0 < f < 1. Higher values produce smaller barcode sets, and vice versa (for more details about exp_acc see Algorithmic details). |
--prob_table=s | Path and filename of probability table in .xls format (for more details about how prob_table is used in the algorithm see Algorithmic details). |
Optional input parameters | |
--start_pos_in_oligo=i | Position of the barcode in the oligo. Later positions are generally more error-prone, producing smaller barcode sets. Default: 1 |
--too_long_homopoly=i | Do not consider barcodes with homopolymers of length i or more (e.g., CGAAAATT will not be considered for i=4 or less). Default: 0 |
--gc_min=f | Minimum GC content of barcodes. Legal values: 0 <= f <= 1. Default: 0 |
--gc_max=f | Maximum GC content of barcodes. Legal values: 0 <= f <= 1. Default: 1 |
--tm_hist | Draws a melting temperature histogram of the produced barcode set, saved to a file named tm_hist_hour_min_sec.png. Note: requires the 'dan' program from the EMBOSS package and the GD::Graph::bars module. Off by default. |
--seed_index=i | Index of the first candidate to be considered in the lexicographic search. Default: 0 |
--ligated_with_t | Indicates that the first nucleotide immediately after the barcode in the oligo is T. Use if T nucleotide was used for sticky-end ligation of the barcodes - improves accuracy. Off by default. |
--bc_num_limit=i | Sets a limit on the number of barcodes needed. The program stops considering candidates once that limit is reached (if it is reached). |
--forbidden_seqs=s | Path and filename to a file containing sequences that must not appear in any barcode (e.g., restriction sites). Each forbidden sequence should appear on a line of its own. |
--min_prob_to_consider=f | Minimum probability of barcode distortions that will be taken into account during code production. Lower values increase runtime and code size (for more details about min_prob_to_consider see Algorithmic details). |
--shuffle | Considers candidate barcodes in random order, not lexicographic order. |
--debug | Prints debug messages during code construction. |
--man | Prints this manual. |
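The sequence-level filters above (--too_long_homopoly, --gc_min, --gc_max and --forbidden_seqs) can be sketched in Python as follows. This is an illustration under stated assumptions, not nxCodeBuilder's code: all function names are hypothetical, a homopolymer limit of 0 (the default) is assumed to disable that check, and GC content is taken to be the fraction of G and C nucleotides in the barcode.

```python
import re


def has_long_homopolymer(seq, too_long):
    """True if seq has a run of one nucleotide of length too_long or more."""
    if too_long <= 0:  # assumption: the default of 0 disables the check
        return False
    return re.search(r"(.)\1{%d,}" % (too_long - 1), seq) is not None


def gc_content(seq):
    """Fraction of G and C nucleotides in seq (0.0 for an empty string)."""
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def load_forbidden(lines):
    """Read forbidden sequences, one per line, skipping blank lines."""
    return [line.strip().upper() for line in lines if line.strip()]


def passes_filters(seq, too_long_homopoly=0, gc_min=0.0, gc_max=1.0,
                   forbidden=()):
    """True if a candidate barcode survives all sequence-level filters."""
    if has_long_homopolymer(seq, too_long_homopoly):
        return False
    if not gc_min <= gc_content(seq) <= gc_max:
        return False
    return not any(site in seq for site in forbidden)


# CGAAAATT is the example from the --too_long_homopoly description.
print(passes_filters("CGAAAATT", too_long_homopoly=4))
print(passes_filters("CGAAAATT", too_long_homopoly=5))
# GAATTC (the EcoRI site) stands in for a forbidden restriction site.
print(passes_filters("ACGAATTCA", forbidden=load_forbidden(["GAATTC\n"])))
```

These filters only reject candidates on sequence composition; whether a surviving candidate joins the barcode set is decided by the accuracy criterion described in Algorithmic details.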
Examples | |
./nxCodeBuilder.pl --bc_len=6 --exp_acc=0.999 --prob_table=my_table.xls
./nxCodeBuilder.pl --bc_len=8 --exp_acc=0.995 --prob_table=my_table.xls --gc_min=0.25 --gc_max=0.65 --too_long_homopoly=4
./nxCodeBuilder.pl --bc_len=9 --exp_acc=0.999 --prob_table=my_table.xls --start_pos_in_oligo=4 --ligated_with_t --forbidden_seqs=./restriction_sites.txt |
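The options --seed_index, --shuffle and --bc_num_limit all control how candidate barcodes are visited during the search. The Python sketch below shows one way such a search loop could look. It is not nxCodeBuilder's algorithm: the names `candidate_barcodes`, `build_set` and `toy_accept` are hypothetical, the alphabet order A, C, G, T and a 0-based seed index are assumptions, the seed index is ignored when shuffling, and the acceptance test is a toy Hamming-distance rule standing in for the real criterion based on exp_acc and the probability table (see Algorithmic details).

```python
import itertools
import random


def candidate_barcodes(bc_len, seed_index=0, shuffle=False, rng=None):
    """Yield candidates in lexicographic order, or in random order."""
    candidates = ("".join(p) for p in itertools.product("ACGT", repeat=bc_len))
    if shuffle:
        pool = list(candidates)
        (rng or random).shuffle(pool)  # assumption: seed_index is ignored
        return iter(pool)
    # Assumption: seed_index is the 0-based index of the first candidate.
    return itertools.islice(candidates, seed_index, None)


def toy_accept(candidate, chosen):
    """Stand-in rule: differ from every chosen barcode in >= 2 positions."""
    return all(sum(a != b for a, b in zip(candidate, other)) >= 2
               for other in chosen)


def build_set(bc_len, accept, bc_num_limit=None, **order):
    """Greedily collect accepted candidates, stopping at bc_num_limit."""
    chosen = []
    for candidate in candidate_barcodes(bc_len, **order):
        if bc_num_limit is not None and len(chosen) >= bc_num_limit:
            break  # enough barcodes: stop considering candidates
        if accept(candidate, chosen):
            chosen.append(candidate)
    return chosen


print(build_set(2, toy_accept, bc_num_limit=3))
print(build_set(2, toy_accept, seed_index=5))
```

Because the search is greedy, the order in which candidates are visited changes which barcodes end up in the set, which is why a different seed index or a shuffled order can produce a different result from the same parameters.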