EvoPrinterHD for bacteria
Introduction

EvoPrinter is a comparative genomics tool for discovering conserved DNA sequences that are shared among three or more orthologous DNAs (Odenwald et al., 2005). Only a single curated DNA sequence is required to initiate the rapid comparative analysis. Generated from multiple pairwise BLAT alignments (Kent, 2002), an EvoPrint presents an ordered, uninterrupted representation of evolutionarily resilient sequences within the user's DNA of interest. EvoPrinterHD is a second-generation comparative tool that automatically superimposes higher-resolution alignments, obtained using an enhanced-BLAT (eBLAT) protocol, to give an enhanced view of sequence conservation between evolutionarily distant species (Yavatkar et al., 2008). EvoPrinter is currently available for 17 Staphylococcus, 22 enteric bacteria, and 17 Streptococcus genomes.

The following algorithms were developed to help identify DNA sequences that show evidence of horizontal gene transfer:
(1) an EvoUnique profile highlights sequences that are unique to the reference genome, or shared only among a subset of genomes, and absent from the other genomes included in the analysis;
(2) a repeat finder detects putative mobile genetic element sequences based on the repeated occurrence of their sequences within bacterial chromosomes;
(3) an EvoDifferences profile portrays, in a single view, those sequences that are detected in all but one of the genomes included in the analysis; and
(4) input reference DNA exchange allows the comparative analysis to be re-initiated using the aligning region of another genome, thus facilitating the search for unique differences among the genomes included in the analysis.

EvoPrinterHD also includes algorithms that identify sequence rearrangements in the aligning regions of the test genomes. Described below is a list of steps that should be followed for the EvoPrinter analysis of bacterial DNA:
A region of the E. coli EDL933 genome from bases 376466-386461, which includes the choline transport protein BetT gene, was subjected to EvoPrint analysis. The scorecard reveals the following: 1) The only two genomes with complete homology to the ~10,000 base input sequence, as revealed by the first scores in each column, were the test genome E. coli EDL933 and the closely related genome E. coli Sakai. 2) Second and third scores were low, indicating a low level of sequence rearrangement. 4) A second level of homology was indicated by most of the other E. coli species in the analysis. 5) A third level of homology was indicated by the lower score against Shigella flexneri 5str8401. 6) Boxes were checked to include only seven species in the final EvoPrint, since no other genomes contained sequences homologous to the reference sequence.

A sample choline transport protein BetT region EvoPrint readout, which includes an EvoPrint, an EvoUnique print and an EvoDifferences profile, is given for the E. coli EDL933 genomic region. The EvoPrint reveals, in uppercase black letters, bases in the E. coli EDL933 reference sequence that are conserved in the E. coli Sakai, E. coli K12MG 1655, E. coli CFT073, E. coli 536, E. coli UTI89, E. coli APEC O1 and Shigella flexneri 5str8401 orthologous DNAs. In the EvoUnique print, uppercase red letters represent bases that are present only in the reference species, uppercase green letters represent bases that are shared with only one of the test species, and uppercase blue letters represent bases that are shared with two of the test species. Lowercase gray bases are common to the aligning regions of three or more of the test species. The EvoDifferences Profile - Relaxed EvoPrint is color coded to reveal bases that are conserved in all but one of the test species. In this example, the red color coding marks bases that are uniquely absent from Shigella flexneri 5str8401.
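The three views described above can all be derived from one question asked at each reference base: which test genomes have it? The following is a minimal Python sketch of that idea, not EvoPrinterHD's actual implementation. It assumes each test genome has already been reduced to a per-base boolean mask (true where the reference base is matched in that genome's alignment); how those masks are built from eBLAT output is outside its scope. The function names and the genome names in the usage example are hypothetical.

```python
from typing import Dict, List, Optional

# Assumption: masks[name][i] is True when reference base i is matched in
# the alignment to test genome `name`. These masks are hypothetical input;
# the real tool derives conservation from pairwise eBLAT alignments.
Masks = Dict[str, List[bool]]


def supporters(masks: Masks, i: int) -> List[str]:
    """Names of the test genomes that share reference base i."""
    return [name for name, mask in masks.items() if mask[i]]


def evoprint(ref: str, masks: Masks) -> str:
    """Uppercase the bases shared by every test genome, lowercase the rest."""
    total = len(masks)
    return "".join(
        base.upper() if len(supporters(masks, i)) == total else base.lower()
        for i, base in enumerate(ref)
    )


def evounique_colors(ref: str, masks: Masks) -> List[str]:
    """Color per base: red = reference only, green = one test genome,
    blue = two test genomes, gray = three or more."""
    palette = {0: "red", 1: "green", 2: "blue"}
    return [
        palette.get(len(supporters(masks, i)), "gray") for i in range(len(ref))
    ]


def evodifferences(ref: str, masks: Masks) -> List[Optional[str]]:
    """For each base, name the single test genome that lacks it, or None
    when the base is missing from zero genomes or from more than one."""
    result: List[Optional[str]] = []
    for i in range(len(ref)):
        missing = [name for name, mask in masks.items() if not mask[i]]
        result.append(missing[0] if len(missing) == 1 else None)
    return result


# Usage with three made-up test genomes and a six-base reference.
example_ref = "acgtac"
example_masks = {
    "genomeA": [True, True, True, False, False, False],
    "genomeB": [True, True, False, False, True, False],
    "genomeC": [True, False, False, False, True, False],
}
print(evoprint(example_ref, example_masks))
print(evounique_colors(example_ref, example_masks))
print(evodifferences(example_ref, example_masks))
```

Keeping a single `supporters` helper behind all three functions reflects how the readouts relate: the EvoPrint, EvoUnique print and EvoDifferences profile are different summaries of the same per-base presence data.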
Further BLAST analysis (not shown) of the three regions revealed by the EvoPrint analysis shows that the upper region, conserved in all species used for the final EvoPrint, encodes a choline transport protein BetT that is conserved in all E. coli species used in the analysis except E. coli K12W 3110. Of all the Shigella species used in the analysis, the sequence is found only in Shigella flexneri 5str8401. The central portion of the EvoPrint, indicated by green capital letters in the EvoUnique print, encodes a putative outer membrane autotransporter (AidA-I adhesin-like protein) that is present only in E. coli EDL933, E. coli Sakai, and, as revealed by BLAST, a few other E. coli genomes that were not used in the EvoPrint analysis. The blue bases of the EvoUnique print are bases conserved in the AidA gene of E. coli 53638, which is divergent with respect to the EDL933/Sakai sequence. The lower portion of the sequence, shown in capital letters as uniquely absent from Shigella flexneri 5str8401 while present in the other genomes used in the analysis, encodes a LuxR-family transcriptional regulator/cyclic diguanylate phosphodiesterase (EAL) domain protein that is present only in the E. coli genomes and absent from Shigella clones. In summary, EvoPrint analysis reveals three regions in the reference sequence that are distinguished by their presence in or absence from the test species.

References
Odenwald WF, Rasband W, Kuzin A and Brody T. (2005). EvoPrinter, a multigenomic comparative tool for rapid identification of functionally important DNA. Proc. Natl. Acad. Sci. 102: 14700-5.
Kent WJ. (2002). BLAT--the BLAST-like alignment tool. Genome Res. 12: 656-64.
Yavatkar AS, Lin Y, Ross J, Fann Y, Brody T and Odenwald WF. (2008). Rapid detection and curation of conserved DNA via enhanced-BLAT and EvoPrinterHD analysis. BMC Genomics.