A cDT is a short sequence element of 6 bp or greater that is a perfect match to sequences within CSBs that are present in two or more enhancers. A cDT-library is a collection of cDTs that are shared by the various enhancers examined. Two types of cDT-libraries were generated in this study. First, a 'tissue-specific library' contains cDTs that are shared by a group of enhancers that regulate similar expression patterns but are absent from a second set of enhancers that direct expression in tissues outside of the first group. Second, a 'common cDT-library' contains cDTs that are shared between sets of enhancers of divergently regulated genes. A subset of the common libraries consists of 'enriched' libraries, which have a 3-fold greater representation from one enhancer type (e.g. neural) than from a second type (e.g. mesodermal).
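The three library categories defined above can be expressed as a simple rule on hit counts. The following is a minimal sketch, not the procedure used in this study: the function name `classify_cdt`, the category labels, and the reading of '3-fold greater' as 'at least 3 times as many hits' are all assumptions made for illustration.

```python
def classify_cdt(hits_a: int, hits_b: int, fold: float = 3.0) -> str:
    """Classify a cDT from its hit counts in two enhancer types (A and B).

    Assumption: 'specific' means zero hits in the other type, and
    'enriched' means at least `fold` times as many hits as the other type.
    """
    if hits_a == 0 and hits_b == 0:
        return "absent"
    if hits_b == 0:
        return "specific-A"
    if hits_a == 0:
        return "specific-B"
    if hits_a >= fold * hits_b:
        return "enriched-A"   # still a member of the common library
    if hits_b >= fold * hits_a:
        return "enriched-B"   # still a member of the common library
    return "common"
```

With A as neural and B as mesodermal, a cDT with 6 neural hits and 2 mesodermal hits would fall in the neural-enriched subset of the common library under these assumptions.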
All libraries were generated from readouts of the CSB-aligner. Making enhancer-type specific libraries requires two different CSB-libraries generated from functionally different enhancers: a library from the tissue of interest (e.g. neural) and a second library that serves as an 'out-group' (e.g. mesodermal). For the generation of a neural cDT-library, neural CSBs in both forward and reverse directions were copied and pasted into both the upper and lower windows of CSB-aligner. The resulting cDTs from this alignment were listed in the 'Result of CSB alignment table' of the CSB-aligner output, in the column titled 'Motif.' Since this cDT list contained multiple copies of different cDTs, the extra copies were removed using the Java applet Puzzamatic 1.0, freeware created by Ron Surratt. The list of unique cDTs was then alphabetized and sorted by size, also using Puzzamatic 1.0. These cDTs, constituting a raw neural cDT-library, were then copied and pasted into a Microsoft Word document. A second CSB-alignment was then performed with the neural CSBs in the top window of CSB-aligner and mesodermal CSBs in the lower window. The cDTs from this alignment were freed of extra copies as above and constituted an unedited common neural/mesodermal cDT-library. The unedited neural and common cDT-libraries were combined, and cDTs common to the two libraries (present in both the first and second alignments) were removed using the JavaScript program cDT-cleaner, thus leaving only the neural-specific sequences. Neural enriched and common cDTs were curated from the unedited shared cDT-library.
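The de-duplication, sorting, and cleaning steps described above amount to basic set operations. The sketch below illustrates them in Python; it is not the code of Puzzamatic or cDT-cleaner. The function names, the sort order (by length, then alphabetically), and the treatment of the cleaning step as a set difference are assumptions, and the sequences in the usage note are made-up toy values.

```python
def unique_sorted(cdts):
    """Remove duplicate cDTs, then order by length and alphabetically."""
    return sorted({s.upper() for s in cdts}, key=lambda s: (len(s), s))


def specific_library(raw_tissue, raw_common):
    """Keep cDTs from the tissue-vs-tissue alignment that do not also
    appear in the tissue-vs-out-group alignment (assumed set difference)."""
    common = set(unique_sorted(raw_common))
    return [c for c in unique_sorted(raw_tissue) if c not in common]


# Toy input, not data from this study.
raw_neural = ["AACGTT", "CATTAG", "AACGTT", "GGATTA"]   # neural vs neural
raw_shared = ["CATTAG", "TTTGCA"]                        # neural vs mesodermal
neural_specific = specific_library(raw_neural, raw_shared)
```

Here `neural_specific` holds the two toy cDTs that occur only in the neural-vs-neural list, while `CATTAG` is dropped because it also appears in the neural/mesodermal list.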
The frequency of sequence matches for each cDT was assessed using cDT-cataloger. For example, the cDT AACGTT is annotated m2;n0;s0-AACGTT to indicate that there were 2 hits on a Drosophila mesodermal CSB library, no hits on a neural CSB library, and no hits on a segmental CSB library. The cDT-cataloger program also ensures that the full list of cDTs and their reverse complements is present: its output table, 'Results of cDT cataloger', allows lists to be sorted and reverse complements of cDTs to be generated.
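The annotation format and the reverse-complement step can be illustrated with a short sketch. This is not the cDT-cataloger implementation: the function names are hypothetical, a 'hit' is assumed to mean one CSB that contains the cDT on either strand, and the CSB sequences in the usage note are made-up toy values.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def count_hits(cdt, csbs):
    """Count CSBs containing the cDT or its reverse complement
    (assumed definition of a 'hit')."""
    fwd = cdt.upper()
    rev = reverse_complement(fwd)
    return sum(1 for csb in csbs if fwd in csb.upper() or rev in csb.upper())


def annotate(cdt, libraries):
    """Build an annotation such as 'm2;n0;s0-AACGTT'.

    `libraries` maps a one-letter prefix to a list of CSB sequences;
    prefixes appear in the annotation in dictionary insertion order.
    """
    counts = ";".join(f"{tag}{count_hits(cdt, csbs)}"
                      for tag, csbs in libraries.items())
    return f"{counts}-{cdt.upper()}"


# Toy CSB libraries, not data from this study.
toy_libraries = {
    "m": ["GGAACGTTCC", "TTAACGTTAA"],  # mesodermal
    "n": ["CATTAGCATT"],                # neural
    "s": ["GGGCCCAAAT"],                # segmental
}
label = annotate("AACGTT", toy_libraries)
```

With these toy libraries, `label` reproduces the format of the example in the text, `m2;n0;s0-AACGTT`. Counting CSBs rather than individual occurrences avoids double-counting palindromic cDTs such as AACGTT, which are identical to their own reverse complement.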