- Generation of an EvoPrint
EvoPrinter is a comparative genomics tool for discovering multispecies conserved sequences (MCSs) in orthologous DNA from 3 or more related species. Its algorithm takes multiple BLAT readouts of individual pairwise reference-DNA vs test-genome alignments and identifies sequences in the reference-DNA that are invariant in all species tested.
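The invariance test can be sketched as follows. This is a minimal illustration, not EvoPrinter's implementation: it assumes each test-species sequence has already been projected onto reference coordinates (same length as the reference, with `-` where nothing aligned), and it assumes conserved bases are written in uppercase and all others in lowercase. The function name `evoprint` and the example sequences are hypothetical.

```python
def evoprint(reference: str, aligned_tests: list[str]) -> str:
    """Uppercase reference bases that are identical in every test species.

    Assumption: each string in aligned_tests is already aligned to the
    reference, position by position, with '-' marking unaligned bases.
    """
    out = []
    for i, base in enumerate(reference.upper()):
        conserved = all(len(t) > i and t[i].upper() == base for t in aligned_tests)
        out.append(base if conserved else base.lower())
    return "".join(out)


# Hypothetical reference and two test-species alignments.
print(evoprint("ACGTACGT", ["ACGTTCGT", "ACGAACGT"]))
```

A base that differs in even one test species is written in lowercase, so adding species can only shrink the conserved blocks.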
Obtaining a good cis-DECODER alignment begins with generating good EvoPrints. Methods for generating EvoPrints are described in detail at the EvoPrint web site. Keep in mind that species selection is critical: reaching too far evolutionarily from the reference species may lose essential regulatory sequence, whereas not reaching far enough yields sequences that have not been subjected to sufficient evolutionary tests of functionality. Examine the EvoDifference print of each test species for blocks of CSBs lost to sequencing gaps, inversions, or species-specific changes.
- EvoPrint-parser
EvoPrint-parser takes the output of EvoPrinter and converts it to a form that can be used by cis-DECODER alignment tools. EvoPrint-parser extracts and annotates Conserved Sequence Blocks (CSBs) from the EvoPrint and outputs them in forward and reverse-complemented directions. CSBs from functionally related enhancers are grouped together in CSB-libraries.
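The extraction step can be sketched as follows, assuming the EvoPrint marks conserved bases in uppercase and non-conserved bases in lowercase. The names `parse_csbs` and `reverse_complement` and the `min_len` cutoff are hypothetical and chosen for illustration; they are not the parser's actual interface.

```python
import re

# Complement table for uppercase DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def parse_csbs(evoprint: str, min_len: int = 6) -> list[tuple[str, str]]:
    """Extract runs of uppercase bases as (forward, reverse-complement) pairs.

    Assumption: uppercase marks conserved bases; min_len is an
    illustrative cutoff, not a documented parser setting.
    """
    blocks = re.findall(r"[ACGT]+", evoprint)
    return [(b, reverse_complement(b)) for b in blocks if len(b) >= min_len]


# Hypothetical EvoPrint fragment with three conserved runs, one too short.
for forward, reverse in parse_csbs("ttACGTTGcaGGtAAACCCGtt"):
    print(forward, reverse)
```

Emitting both orientations lets later steps use plain string matching to find elements on either strand.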
- CSB-aligner
CSB-aligner discovers short sequence elements that are shared by different enhancers. The first output of CSB-aligner consists of cis-Decoder tags (cDTs), sequences of 6 bases or longer that align with elements shared between two or more enhancers. A second output consists of a results table of cDTs that can be sorted by selecting the title bars. The alignment program requires two inputs: (1) the upper window accepts CSBs generated by the parser program, and (2) the lower window accepts an EvoPrint or CSBs of a promoter element of interest that is to be aligned with the parsed CSBs.
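The idea of finding elements shared between enhancers can be illustrated with a fixed-length simplification. The sketch below collects every 6-base substring of each enhancer's CSBs and keeps those seen in two or more enhancers. CSB-aligner itself reports elements of 6 bases or longer, so this is not its algorithm; the function name `shared_kmers` and the enhancer names are hypothetical.

```python
from collections import defaultdict


def shared_kmers(csb_libraries: dict[str, list[str]], k: int = 6) -> dict[str, set[str]]:
    """Map each k-mer found in two or more enhancers to those enhancers.

    Assumption: fixed length k stands in for variable-length cDTs.
    """
    seen: dict[str, set[str]] = defaultdict(set)
    for enhancer, csbs in csb_libraries.items():
        for csb in csbs:
            for i in range(len(csb) - k + 1):
                seen[csb[i:i + k]].add(enhancer)
    return {kmer: names for kmer, names in seen.items() if len(names) >= 2}


# Hypothetical CSBs from three enhancers.
libraries = {
    "enhA": ["TTCAGCTGAA"],
    "enhB": ["GGCAGCTGCC"],
    "enhC": ["AAAAAAAA"],
}
for kmer, enhancers in shared_kmers(libraries).items():
    print(kmer, sorted(enhancers))
```

Because the sketch matches one strand only, the input would need to include each CSB in both forward and reverse-complemented directions to find elements in either orientation.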
- Generation of cDT-libraries
The steps for identifying cDTs and generating cDT-libraries are described at the CSB-alignment site. A cDT-library consists of groups of cDTs that are shared between functionally related enhancers (specific libraries) or between enhancers of divergent functions (common libraries).
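One way to picture the distinction between specific and common libraries is the sketch below. It assumes each cDT is already associated with the set of enhancers that contain it, and that each enhancer carries a functional label. The function `classify_cdts`, the labels, and the sequences are hypothetical, and the actual library-building steps are those described at the CSB-alignment site.

```python
def classify_cdts(
    cdt_hits: dict[str, set[str]],
    enhancer_group: dict[str, str],
) -> dict:
    """Split cDTs into per-group 'specific' lists and one 'common' list.

    A cDT is treated as specific when every enhancer sharing it belongs to
    the same functional group, and as common otherwise (an assumption made
    for this illustration).
    """
    libraries: dict = {"specific": {}, "common": []}
    for cdt, enhancers in sorted(cdt_hits.items()):
        groups = {enhancer_group[e] for e in enhancers}
        if len(groups) == 1:
            libraries["specific"].setdefault(next(iter(groups)), []).append(cdt)
        else:
            libraries["common"].append(cdt)
    return libraries


# Hypothetical cDT hits and functional labels.
hits = {"CAGCTG": {"enhA", "enhB"}, "TTTATG": {"enhA", "enhC"}}
groups = {"enhA": "neural", "enhB": "neural", "enhC": "mesodermal"}
print(classify_cdts(hits, groups))
```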
- cDT-scanner
cDT-scanner annotates the CSBs of given enhancers with sequences shared with other enhancers (available at the cDT library site), including cDTs that are present in selected groups of enhancers directing expression to similar tissues or cDTs that are shared by enhancers directing expression to different tissues. cDT-scanner requires two inputs: (1) a list of tissue-specific or common cDTs loaded into the upper window, and (2) an EvoPrint or CSBs to be scanned, copied and pasted into the lower window. The first output of cDT-scanner is an alignment of the EvoPrint or CSBs with the input cDTs at positions of perfect matches. A second output consists of a results table of aligning cDTs that can be sorted by selecting the title bars.
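Perfect-match scanning can be sketched as a search for each cDT at every position of the input. The example assumes conserved bases are uppercase, so a case-sensitive search reports matches in conserved sequence only. The function name `scan_cdts` and the sequences are hypothetical, and the sketch does not reproduce the tool's alignment display or results table.

```python
def scan_cdts(sequence: str, cdts: list[str]) -> list[tuple[int, str]]:
    """Return (position, cDT) for every perfect match, sorted by position.

    Matching is case-sensitive, so uppercase cDTs only match uppercase
    (assumed conserved) bases.
    """
    hits = []
    for cdt in cdts:
        start = sequence.find(cdt)
        while start != -1:
            hits.append((start, cdt))
            # Step by one base so overlapping matches are also reported.
            start = sequence.find(cdt, start + 1)
    return sorted(hits)


# Hypothetical EvoPrint fragment and cDT list.
for position, cdt in scan_cdts("ttCAGCTGaaTTTATGccCAGCTG", ["CAGCTG", "TTTATG"]):
    print(position, cdt)
```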
- Full-enhancer scanner
Full-enhancer scanner searches the entire EvoPrinted enhancer sequence (both conserved and non-conserved regions) for sequences that have previously been identified in the cDT scans of the conserved regions. Certain enhancers, particularly those controlling the dynamic expression of developmental genes, contain multiple DNA-binding sites for one or more specific transcription factors. Comparative studies of orthologous enhancers have also revealed that, within a binding-site cluster, individual DNA-binding sites can undergo turnover and thus would not be identified by EvoPrinting as conserved. Scanning the non-conserved regions as well allows such sites to be detected.
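A full-sequence scan differs from a conserved-only scan in that case is ignored. The sketch below assumes the EvoPrint marks conserved bases in uppercase, matches each cDT case-insensitively, and flags whether each matching site is entirely conserved. The function name `full_enhancer_scan` and the example sequence are hypothetical.

```python
def full_enhancer_scan(evoprint: str, cdts: list[str]) -> list[tuple[int, str, bool]]:
    """Return (position, cDT, fully_conserved) for matches anywhere in the sequence.

    Assumption: uppercase marks conserved bases, so a site is fully
    conserved only when every matched base is uppercase.
    """
    upper = evoprint.upper()
    hits = []
    for cdt in cdts:
        query = cdt.upper()
        start = upper.find(query)
        while start != -1:
            site = evoprint[start:start + len(query)]
            hits.append((start, cdt, site.isupper()))
            start = upper.find(query, start + 1)
    return sorted(hits)


# Hypothetical enhancer with one conserved and one non-conserved copy of a site.
for hit in full_enhancer_scan("ttCAGCTGaacagctgcc", ["CAGCTG"]):
    print(hit)
```

Sites flagged as not fully conserved are candidates for binding sites that have turned over between species.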
- cDT-cataloger
cDT-cataloger prepares a list of enhancer CSBs that contain elements identical to cDTs in a given list, usually derived from a cDT library. To catalog, paste the CSBs, in both forward and reverse directions, into the upper window of cDT-cataloger, and paste the selected cDTs of the same size into the lower window. The first output consists of lists of CSBs that align with each input cDT. A second output consists of a results table that can be sorted by selecting the title bars.
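The cataloguing step amounts to grouping CSBs under each cDT they contain. A minimal sketch, assuming plain substring matching on uppercase sequences; the function name `catalog_csbs` and the example sequences are hypothetical, and the sortable results table is not reproduced.

```python
def catalog_csbs(csbs: list[str], cdts: list[str]) -> dict[str, list[str]]:
    """Map each cDT to the CSBs that contain it as an exact substring."""
    return {cdt: [csb for csb in csbs if cdt in csb] for cdt in cdts}


# Hypothetical CSBs (one orientation shown) and two same-size cDTs.
csbs = ["TTCAGCTGAA", "GGTTTATGCC", "AACAGCTGTTTATG"]
for cdt, matches in catalog_csbs(csbs, ["CAGCTG", "TTTATG"]).items():
    print(cdt, matches)
```

A cDT with no matching CSB still appears in the result with an empty list, which keeps the catalog aligned with the input cDT list.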