Document Type


Publication Date


Publication Title








Publisher Name



High-throughput sequencing of microbial communities has uncovered a large, diverse population of phages. Frequently, phages found are integrated into their bacterial host genome. Distinguishing between phages in their integrated (lysogenic) and unintegrated (lytic) stage can provide insight into how phages shape bacterial communities. Here we present the Prophage Induction Estimator (PIE) to identify induced phages in genomic and metagenomic sequences. PIE takes raw sequencing reads and phage sequence predictions, performs read quality control, read assembly, and calculation of phage and non-phage sequence abundance and completeness. The distribution of abundances for non-phage sequences is used to predict induced phages with statistical confidence. In silico tests were conducted to benchmark this tool finding that PIE can detect induction events as well as phages with a relatively small burst size (10×). We then examined isolate genome sequencing data as well as a mock community and urinary metagenome data sets and found instances of induced phages in all three data sets. The flexibility of this software enables users to easily include phage predictions from their preferred tool of choice or phage sequences of interest. Thus, genomic and metagenomic sequencing now not only provides a means for discovering and identifying phage sequences but also the detection of induced prophages.


Author Posting © The Authors, 2023. This article is posted here by permission of MDPI for personal use, not for redistribution. The article was published in Viruses, Volume 15, Issue 2, February 2023,

Creative Commons License

Creative Commons Attribution 4.0 International License
This work is licensed under a Creative Commons Attribution 4.0 International License.

Included in

Mathematics Commons