The most reliable reference set for evaluating prophage prediction comes from a collection of annotated prophages provided by S. Casjens (an extension of (Casjens, 2003)). The original, manually annotated set contains 329 prophages. Prophages with fewer than 7 proteins were filtered out of the set because they were considered too short or ambiguous.
We included another prophage detection method, Phage_Finder (Fouts, 2006), in this evaluation. To compare prediction results between Prophinder and Phage_Finder, we took the list of screened genomes available on the Phage_Finder web site (http://phage-finder.sourceforge.net/302_bact_gen.pdf) and removed those not annotated by S. Casjens. The resulting reference set, used for this evaluation, is composed of 54 genomes with 287 annotated prophages. We want to stress that the predictions extracted from the Phage_Finder web site were produced using conservative parameters, so the evaluation result presented here is not representative of the real predictive capabilities of the method. Running Phage_Finder on all bacterial genomes with optimized parameters goes beyond the scope of our work and would require too much CPU time for our computational infrastructure. Therefore, the evaluation result for Phage_Finder is only indicative and should be interpreted with caution.
Across the 54 GenBank entries used for the evaluation, Phage_Finder predicted 179 prophages and Prophinder predicted 238.
Predicted prophages were compared to the reference set using the compare-classes program (http://rsat.scmbb.ulb.ac.be/rsat/compare-classes_form.cgi) from the RSA-Tools software package (van Helden, 2003), which assesses the statistical significance of the members shared between reference and predicted prophages. The raw outputs are available for Phage_Finder and Prophinder. The table below presents, for each evaluated bacterial genome, the list of prophages detected by Prophinder ('prophinder:' IDs) and Phage_Finder ('phage_finder:' IDs), with the compare-classes E-value and the number of coding sequences in common with the reference prophages ('sherw:' IDs). False positives and false negatives for both methods are listed at the end of each GenBank table.
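To illustrate what such a comparison measures, the sketch below scores the overlap between one reference prophage and one predicted prophage, each represented as a set of coding-sequence identifiers. It assumes a hypergeometric null model and an E-value obtained by multiplying the P-value by the number of comparisons; this is an assumption about how the significance can be computed, not a description of the compare-classes implementation. The function name, the identifiers, and the genome size are hypothetical.

```python
from math import comb


def overlap_significance(reference, predicted, genome_cds_count, n_comparisons=1):
    """Score the overlap between two sets of coding sequences (CDS).

    Assumption: under the null model, the predicted CDS are drawn at random
    from the genome, so the overlap follows a hypergeometric distribution.
    Returns (common, p_value, e_value).
    """
    ref, pred = set(reference), set(predicted)
    common = len(ref & pred)
    r, q, n = len(ref), len(pred), genome_cds_count

    # P(X >= common): sum the upper tail of the hypergeometric distribution.
    # Exact integer arithmetic is used until the final division.
    tail = sum(
        comb(r, k) * comb(n - r, q - k)
        for k in range(common, min(r, q) + 1)
    )
    p_value = tail / comb(n, q)

    # Assumed multiple-testing correction: E-value = P-value x number of tests.
    e_value = p_value * n_comparisons
    return common, p_value, e_value


# Hypothetical example: CDS are numbered along a genome of 4000 CDS.
reference_prophage = range(10, 30)   # 20 annotated CDS
predicted_prophage = range(15, 35)   # 20 predicted CDS
common, p_value, e_value = overlap_significance(
    reference_prophage, predicted_prophage, genome_cds_count=4000
)
print(common, p_value, e_value)
```

A small E-value indicates that the number of shared coding sequences is much larger than expected by chance, which is how a predicted prophage can be matched to a reference prophage.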
Two standard measures have been used to evaluate the predictive capabilities of the methods:
1) The sensitivity, calculated as the fraction of annotated prophages that are predicted: