A genomic island (GI) is a cluster of genes in a prokaryotic genome that has a probable horizontal origin. These genetic elements have been associated with rapid adaptations in prokaryotes that are of medical, economic, or environmental importance, such as pathogen virulence, antibiotic resistance, symbiotic interactions, and notable secondary metabolic capabilities. The recent development of GI detection methods has led to significant advances in our understanding of microbial evolution and function. Despite these advances, challenges remain in the use of multiscale features, the choice of detection algorithm, and the identification of island boundaries.

We propose MTGIpick, a multiscale statistical algorithm for identifying genomic islands. It combines two steps: a small-scale test with large-scale features, which scores small regions that deviate from the host genome, and a large-scale statistical test with small-scale features, which identifies multi-window segments. MTGIpick can identify genomic islands from a single genome, without genome annotation or prior knowledge from other datasets. In simulations with alien fragments from artificial and real genomes, MTGIpick gave robust results across different experiments. On real biological data, MTGIpick performed better than existing methods and identified genomic islands with more accurate sizes.
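To make the general idea of window-based compositional scoring concrete, here is a minimal Python sketch. It is not MTGIpick's statistic or implementation: the function names, the dinucleotide (k = 2) profile, the L1 distance, and the z-score cutoff are all assumptions chosen for illustration. It scores fixed-size windows by how far their k-mer composition deviates from the whole genome and merges adjacent high-scoring windows into candidate segments.

```python
from collections import Counter
from itertools import product
from statistics import mean, pstdev


def kmer_profile(seq, k):
    """Relative frequencies of all 4**k k-mers over A, C, G, T (others ignored)."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[m] for m in kmers) or 1
    return [counts[m] / total for m in kmers]


def window_scores(genome, window, step, k):
    """L1 distance between each window's k-mer profile and the genome's."""
    host = kmer_profile(genome, k)
    scores = []
    for start in range(0, len(genome) - window + 1, step):
        prof = kmer_profile(genome[start:start + window], k)
        scores.append((start, sum(abs(a - b) for a, b in zip(prof, host))))
    return scores


def candidate_segments(genome, window=500, step=500, k=2, z_cut=2.0):
    """Merge touching windows whose score z-value reaches z_cut (hypothetical cutoff)."""
    scores = window_scores(genome, window, step, k)
    if not scores:
        return []
    values = [s for _, s in scores]
    mu = mean(values)
    sigma = pstdev(values) or 1.0  # avoid division by zero on uniform genomes
    segments = []
    for start, s in scores:
        if (s - mu) / sigma < z_cut:
            continue
        end = start + window
        if segments and start <= segments[-1][1]:
            segments[-1] = (segments[-1][0], end)  # extend the previous segment
        else:
            segments.append((start, end))
    return segments


if __name__ == "__main__":
    # Toy genome: AT-rich host with a GC-rich insert of 1,500 bp at position 9,000.
    toy = "ATTAAT" * 1500 + "GCCGGC" * 250 + "ATTAAT" * 1500
    print(candidate_segments(toy))
```

The sketch expects an uppercase sequence and reports half-open `(start, end)` coordinates. Because it uses a single window size and a single feature scale, its segment boundaries can only fall on window edges, which illustrates why boundary identification is one of the challenges mentioned above.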

For instructions on installing and using MTGIpick on different platforms, see the Documentation.

Please cite: Qi Dai, Chaohui Bao, Yabing Hai, Sheng Ma, Tao Zhou, Cong Wang, Yunfei Wang, Xiaoqing Liu, Yuhua Yao, Wenwen Huo, Zhenyu Xuan, Min Chen and Michael Q Zhang. MTGIpick allows the robust identification of genomic islands from a single genome. To be submitted to Briefings in Bioinformatics.