Prophinder: File formats

Prophinder uses several file formats. All files are plain text, occasionally compressed with gzip.
GenBank files submitted to Prophinder

Prophinder requires properly formatted GenBank files to perform the predictions. The mandatory fields (features) are:
- The LOCUS line. If the genome has no accession number yet, put a placeholder tag (e.g. NO_ID) in place of the accession number.
Examples:
A GenBank file from NCBI

LOCUS       NC_003030            3940880 bp    DNA     circular BCT 18-JAN-2006
A GenBank file without accession
LOCUS       NO_ID                3940880 bp    DNA     circular BCT 18-JAN-2006
- The SOURCE line. It provides the minimal information about the analysed genome, both on the web interface and through the web service.
Example:
SOURCE      Clostridium acetobutylicum ATCC 824
- The list of coding sequences (CDS). Prophinder bases its predictions on the information in the CDS features. The only mandatory key is translation, which holds the polypeptide chain encoded by the corresponding gene. Keys such as locus_tag, product and protein_id make the results easier to analyse on the web interface.
Example:
     CDS             complement(27017..27622)
                     /locus_tag="CAC0019"
                     /codon_start=1
                     /transl_table=11
                     /product="Transcriptional regulator, AcrR family"
                     /protein_id="NP_346666.1"
                     /db_xref="GI:15893317"
                     /db_xref="GeneID:1116202"
                     /translation="MDKEVRKPQQKRSIEKKKRILDAANTLLLKNGYYDITTADIAKA
                     AGLSTGTVYAYFKDKKDILLSSLYESSKSFREQTLNELDKISQNDNPVNTIKNVLQIF
                     IKFHTSYPKKYHDELMSLSYIDEDVRGFFENIKNTMMDAVVKHLKKCGINLKHEKEQS
                     FLIYSLIENIEDELVFDIYPDLNKNILIDECARVIVNMIMD"
- The genomic sequence. It is required for additional searches, such as the search for direct repeats.
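
The requirements listed above can be checked with a short script before a file is submitted. The following is a minimal sketch, not Prophinder's own validation: the function name check_genbank_text and the exact checks are assumptions made for illustration. It relies only on the usual GenBank flat-file layout (feature keys starting in column 6, qualifiers indented by 21 spaces, sequence lines after an ORIGIN line).

```python
import re

FEATURE_KEY = re.compile(r"^ {5}(\S+)\s+\S")      # feature keys start in column 6
SEQ_LINE = re.compile(r"^\s*\d+\s+[A-Za-z ]+$")   # e.g. "        1 atgcatgcat gcatgc"


def check_genbank_text(text):
    """Return a list of problems; an empty list means the basic checks passed."""
    problems = []
    lines = text.splitlines()

    # LOCUS and SOURCE are top-level keywords that start in column 1.
    for keyword in ("LOCUS", "SOURCE"):
        if not any(line.startswith(keyword) for line in lines):
            problems.append(f"missing {keyword} line")

    # Collect each CDS feature together with its qualifier lines.
    cds_blocks, current = [], None
    for line in lines:
        match = FEATURE_KEY.match(line)
        if match:
            current = [line] if match.group(1) == "CDS" else None
            if current is not None:
                cds_blocks.append(current)
        elif current is not None and line.startswith(" " * 21):
            current.append(line)
        else:
            current = None

    if not cds_blocks:
        problems.append("no CDS features found")
    for block in cds_blocks:
        if not any("/translation=" in line for line in block):
            location = block[0].split()[-1]
            problems.append(f"CDS without /translation: {location}")

    # The genomic sequence follows the ORIGIN line.
    origin = next((i for i, line in enumerate(lines) if line.startswith("ORIGIN")), None)
    if origin is None or not any(SEQ_LINE.match(line) for line in lines[origin + 1:]):
        problems.append("missing genomic sequence (ORIGIN section)")
    return problems
```

A file can be checked with check_genbank_text(open(path).read()); the optional keys (locus_tag, product, protein_id) are deliberately not tested because they are not mandatory.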


Predicted prophages

Each prophage is described by its identifier, its start and end coordinates, and the GI numbers of its first and last CDS, as columns separated by a tab character. Prophinder predictions have the score, iteration number and window size as supplementary columns. Comment lines start with the hash (#) character:

[Figure: predicted prophages file. Labelled parts: comments, GenBank accession, authors, column headers, Prophinder set IDs column, start coordinates, end coordinates, first GIs, last GIs, scores column, iterations column, window sizes column]
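
The column layout described above can be read with a few lines of Python. This is a minimal sketch under stated assumptions: the names Prophage, parse_prophages and read_prophages are hypothetical, coordinates, iteration and window size are assumed to be integers, the score is assumed to be a number, and a file name ending in .gz is taken to mean gzip compression.

```python
import gzip
from typing import NamedTuple, Optional


class Prophage(NamedTuple):
    identifier: str
    start: int
    end: int
    first_gi: str
    last_gi: str
    # Supplementary columns, present only for Prophinder predictions.
    score: Optional[float] = None
    iteration: Optional[int] = None
    window_size: Optional[int] = None


def _optional(fields, index, cast):
    """Convert fields[index] with cast, or return None if the column is absent or empty."""
    if len(fields) > index and fields[index] != "":
        return cast(fields[index])
    return None


def parse_prophages(lines):
    """Parse tab-separated prophage lines, skipping blank and comment (#) lines."""
    prophages = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) < 5:
            raise ValueError(f"expected at least 5 columns, got {len(fields)}: {line!r}")
        prophages.append(Prophage(
            identifier=fields[0],
            start=int(fields[1]),
            end=int(fields[2]),
            first_gi=fields[3],
            last_gi=fields[4],
            score=_optional(fields, 5, float),
            iteration=_optional(fields, 6, int),
            window_size=_optional(fields, 7, int),
        ))
    return prophages


def read_prophages(path):
    """Read a prophage file, transparently handling gzip-compressed files."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        return parse_prophages(handle)
```

Keeping the three supplementary fields optional lets the same function read both Prophinder predictions and entries that carry only the five common columns.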

Annotated GenBank file

From the genome view page, you can download the GenBank file with the displayed prophages annotated in it. The annotation consists of adding, as db_xref keys, the prophage IDs to each gene feature located in those prophages, and the protein ID of the first ACLAME hit to the corresponding CDS feature:

[Figure: annotated GenBank format. Labelled parts: gene db_xref, CDS db_xref, CDS db_hit]

Web service: accession numbers list

The getAccList method returns the list of GenBank accession numbers that have been screened by Prophinder. The list consists of accession numbers separated by commas. Example:

[Figure: accession list format]
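
Once the comma-separated string has been retrieved, it can be turned into a list for membership tests. The sketch below only covers the parsing step; how the string is fetched from the web service is left out, the function name parse_accession_list is hypothetical, and the second accession in the sample input is a placeholder, not a real result.

```python
def parse_accession_list(response):
    """Split a comma-separated accession string into a clean list."""
    return [item.strip() for item in response.split(",") if item.strip()]


# Illustrative input only; "NC_000000" is a placeholder accession.
screened = parse_accession_list("NC_003030,NC_000000\n")
print("NC_003030" in screened)
```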

Web service: predicted prophages

The getProphages method returns the list of predicted prophages for a given GenBank accession number. A tab-delimited table is returned with one prophage definition per line. Example:

[Figure: prophage list format. Labelled columns: prophage IDs, start coordinates, end coordinates, sig scores, iterations, window sizes]
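
The returned table can be loaded into dictionaries with the standard csv module. This is a sketch that assumes the six columns appear in the order prophage ID, start, end, score, iteration, window size, and that the response carries no header row; the column names and the function name parse_get_prophages are hypothetical. Values are kept as strings, so no assumption is made about their numeric format.

```python
import csv

# Assumed column order; the names are illustrative, not part of the service.
COLUMNS = ["prophage_id", "start", "end", "score", "iteration", "window_size"]


def parse_get_prophages(response):
    """Turn a tab-delimited getProphages response into a list of dicts."""
    # Drop blank lines and any comment lines before handing rows to csv.
    rows = [
        line for line in response.splitlines()
        if line.strip() and not line.startswith("#")
    ]
    reader = csv.DictReader(rows, fieldnames=COLUMNS, delimiter="\t")
    return [dict(row) for row in reader]
```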

Web service: predicted direct repeats

The getDRs method returns the list of predicted direct repeats (DRs) for a given prophage ID. The prophage ID can be given in its full form, <NC accession number>:<dataset>:<prophage UID>, as <dataset>:<prophage UID>, or as the unique identifier (UID) alone, <prophage UID>. A tab-delimited table is returned with one DR definition per line. Example:

[Figure: DRs list format. Labelled columns: left DR coordinates, right DR coordinates, DR sizes]
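
The three accepted forms of the prophage ID differ only in how many colon-separated parts are present, so a client can normalise them before calling the service. The following is a minimal sketch; the names ProphageId and split_prophage_id are hypothetical, and the dataset name used in the example is a made-up value.

```python
from typing import NamedTuple, Optional


class ProphageId(NamedTuple):
    accession: Optional[str]   # <NC accession number>, if given
    dataset: Optional[str]     # <dataset>, if given
    uid: str                   # <prophage UID>, always present


def split_prophage_id(prophage_id):
    """Split any of the three accepted prophage ID forms into its parts."""
    parts = prophage_id.strip().split(":")
    if not 1 <= len(parts) <= 3 or not all(parts):
        raise ValueError(f"not a valid prophage ID: {prophage_id!r}")
    # Missing leading parts (accession, dataset) are filled with None.
    padded = [None] * (3 - len(parts)) + parts
    return ProphageId(*padded)


# "exampleset" is a made-up dataset name used for illustration.
print(split_prophage_id("NC_003030:exampleset:42"))
print(split_prophage_id("42"))
```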

Web service: annotated GenBank file

The getAnnotatedGbk method returns the genome as a GenBank entry with the prophages annotated in it. The format is exactly the same as for the GenBank formatted file sent from the web interface.

