Prophinder: Web service
Introduction

We provide a web service for using Prophinder directly over the Internet. This avoids having to download and install the data, libraries, software and other external dependencies required to run Prophinder. Another advantage is that the latest version of the software is always used, together with the latest data available in the ACLAME database.

On this page you can download the clients, which are programs used to retrieve predictions from the Prophinder web service or to submit a GenBank file to it.
A short tutorial is also provided on integrating the web service into a pipeline.


Clients

The clients are Perl scripts that must be executed on Linux or another UNIX-compatible platform. They require the following Perl modules: SOAP::Lite, SOAP::WSDL and MIME::Entity, which can be installed through the cpan interface on Linux machines.
To submit a GenBank file to Prophinder, download (right click and select "Save Link As...") the submit_prophinder.pl program.
Execute the program with the -h argument to obtain the list of options available.

submit_prophinder.pl -h
The only mandatory argument is -gbk <GenBank file name>, which specifies the GenBank file to be submitted to Prophinder. The file can be compressed with gzip, in which case the file name must carry the .gz extension.
The program can be run in two modes: 1) submit the GenBank file and exit, or 2) submit the GenBank file and wait for the results.
Although Prophinder is quite fast (around 5 minutes per genome), you may have to wait much longer if several users submitted jobs before yours. It is therefore better to use the first mode and wait for the email notification before accessing the results.

Some examples, where the submitted GenBank file is called my_genome.gbk:
Submitting the genome and specifying the email address to get notification of the job completion:

submit_prophinder.pl -gbk my_genome.gbk -email='raphael@scmbb.ulb.ac.be'
Same but wait for the results and store them in the output file my_prophages.out:
submit_prophinder.pl -gbk my_genome.gbk -email='raphael@scmbb.ulb.ac.be' -wait -out my_prophages.out
With the window sizes to use for screening the genome:
submit_prophinder.pl -gbk my_genome.gbk -email='raphael@scmbb.ulb.ac.be' -wait -out my_prophages.out -ws 300,100,20
Including the minimum number of CDS and the minimum number of ACLAME hits to be present in the prophage for acceptance:
submit_prophinder.pl -gbk my_genome.gbk -email='raphael@scmbb.ulb.ac.be' -wait -out my_prophages.out -ws 300,100,20 -minCds 15 -minHits 7
To activate the secondary search procedure, use the -ss option. This makes it possible to detect prophages with a weak signal, at the cost of a slightly higher chance of false positives.
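When the client is driven from another Perl program, the command line can be assembled as a list rather than as a single shell string, which avoids quoting problems with file names and email addresses. The sketch below only uses the options shown in the examples above; the function name submit_command and its option keys are hypothetical, and the -email=<address> form simply mirrors the examples.

```perl
use strict;
use warnings;

# Build the argument list for submit_prophinder.pl from named options.
# Only 'gbk' is mandatory; every other option is added when it is set.
# (submit_command and its keys are illustrative names, not part of Prophinder.)
sub submit_command {
    my (%opt) = @_;
    die "The gbk option is mandatory\n" unless defined $opt{gbk};

    my @cmd = ('submit_prophinder.pl', '-gbk', $opt{gbk});
    push @cmd, "-email=$opt{email}"            if defined $opt{email};
    push @cmd, '-wait'                         if $opt{wait};
    push @cmd, '-out', $opt{out}               if defined $opt{out};
    push @cmd, '-ws', join(',', @{ $opt{ws} }) if $opt{ws};
    push @cmd, '-minCds', $opt{minCds}         if defined $opt{minCds};
    push @cmd, '-minHits', $opt{minHits}       if defined $opt{minHits};
    push @cmd, '-ss'                           if $opt{ss};
    return @cmd;
}
```

The returned list can be passed to system() in list form, for example system(submit_command(gbk => 'my_genome.gbk', wait => 1, out => 'my_prophages.out')) == 0 or die "submission failed\n"; this assumes submit_prophinder.pl can be found through the PATH.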

To retrieve predictions from pre-screened genomes in the Prophinder database or from a submitted genome, download (right click and select "Save Link As...") the access_prophinder.pl program.
Execute the program with the -h argument to obtain the list of available options.

access_prophinder.pl -h
With this program, you can retrieve ready-made predictions using an accession number (the -acc option) or prediction results of a submitted GenBank file using the prediction identifier (the -pred option).

Some examples:
Retrieve predictions for Escherichia coli O157:H7 EDL933:

access_prophinder.pl -acc NC_002655
Retrieve prediction results from a submitted GenBank file with the submission ID 3e4474033c72c40eb0981f27f73f525e:
access_prophinder.pl -pred 3e4474033c72c40eb0981f27f73f525e
Retrieve predictions and store them in the output file prophinder.out:
access_prophinder.pl -acc NC_002655 -out prophinder.out


Pipeline integration

Integrating a web service into a pipeline or a program is fairly straightforward. The integration in Perl is described here.
The Perl modules required are: SOAP::Lite, SOAP::WSDL and MIME::Entity. They can be installed through the cpan interface on Linux machines.

Include the Perl modules, cited above, in the code:

use SOAP::Lite;
use SOAP::WSDL;
use MIME::Entity;
In all cases, you need to create the SOAP::WSDL object used for the client-server transactions. The WSDL file to provide is located at http://aclame.ulb.ac.be/Tools/Prophinder/prophinder.wsdl:
my $ws = SOAP::WSDL->new(wsdl => 'http://aclame.ulb.ac.be/Tools/Prophinder/prophinder.wsdl');
$ws->wsdlinit();
$ws->servicename('ProphinderService');

Submitting a GenBank file for prediction

To submit a GenBank file to Prophinder, the following steps are required:
1) Request a ticket from the server:
my $som = $ws->call('getTicket');
my $ticket = $som->result();
2) Define the execution parameters if needed:
$som = $ws->call('setParams', ('ticket' => $ticket, 
          'params' => 'email=raphael@scmbb.ulb.ac.be;win_size=300,100,20;secondary_search=1'));
3) Prepare the GenBank file for submission. The file MUST be gzipped. The gzipped file is then attached to the SOAP message as follows:
# Compress the file (gzip replaces my_genome.gbk with my_genome.gbk.gz)
system('gzip', 'my_genome.gbk') == 0 or die "gzip failed: $?\n";
# Create the attachment entity
my $ent = MIME::Entity->build('Id' => 'gbk', 'Type' => 'application/gzip', 
                              'Encoding' => 'base64', 'Path' => 'my_genome.gbk.gz', 
                              'Disposition' => 'inline');
Note: the 'Disposition' option must be 'inline'. The value 'attachment' was used previously but no longer seems to work with recent (2010) versions of SOAP::Lite or SOAP::WSDL.
4) Send the GenBank file to the web service. Since sending messages with attachments does not seem to work with the Perl module SOAP::WSDL, a direct call using SOAP::Lite is made here:
$som = SOAP::Lite
    ->readable(1)
    ->uri('urn:Prophinder')
    ->parts([ $ent ])
    ->proxy('http://aclame.ulb.ac.be/perl/Aclame/Soap/prophinder.cgi')
    ->sendFile(SOAP::Data->name("ticket" => $ticket));
5) Finally, submit the job to the Prophinder queueing system:
$som = $ws->call('submit', ('ticket' => $ticket));
At this stage, you need to wait until the job has been executed on the Prophinder server. The waiting time can vary greatly depending on the number of jobs already submitted. However, we do not expect hundreds of jobs to be permanently in the queue, so the job should be completed within 5 to 30 minutes.
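The client-side preparation for steps 2 and 3 can be wrapped in two small helpers. This is a minimal sketch, not part of the Prophinder clients: build_params and gzip_genbank are hypothetical names, the key=value;key=value layout is taken from the setParams example above, and the core module IO::Compress::Gzip is used here in place of the external gzip command so that the original file is kept.

```perl
use strict;
use warnings;
use IO::Compress::Gzip qw(gzip $GzipError);

# Build the 'params' string for setParams from key/value pairs, keeping
# the order in which they are given. An array reference as value is
# joined with commas (e.g. the window sizes).
sub build_params {
    my @pairs = @_;
    die "build_params expects key/value pairs\n" if @pairs % 2;
    my @items;
    while (my ($key, $value) = splice(@pairs, 0, 2)) {
        $value = join(',', @$value) if ref $value eq 'ARRAY';
        push @items, "$key=$value";
    }
    return join(';', @items);
}

# Compress a GenBank file to <name>.gz, leaving the original in place.
# Returns the name of the compressed file.
sub gzip_genbank {
    my ($path) = @_;
    my $gz = "$path.gz";
    gzip($path => $gz) or die "Compression of $path failed: $GzipError\n";
    return $gz;
}
```

For instance, build_params(email => $address, win_size => [300, 100, 20], secondary_search => 1) produces a string in the same layout as the one passed to setParams above, and the file name returned by gzip_genbank('my_genome.gbk') can be given as 'Path' to MIME::Entity->build.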

Error handling

A call to the Prophinder web service might fail. All sorts of errors are handled on the server, so at any stage you can check whether a call succeeded and, in case of an error, retrieve the error message from the server. This is easily done after each call as follows:
if ($som->fault()) {
	printf STDERR "A fault (%s) occurred: %s\n", $som->faultcode(), $som->faultstring();
}
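Since the same check is repeated after every call, it can be factored into a helper. The sketch below is one possible way to do this; result_or_die is a hypothetical name, and it relies only on the fault(), faultcode(), faultstring() and result() methods already used on this page.

```perl
use strict;
use warnings;

# Return the result of a web service call, or die with the fault message
# reported by the server. $som is the object returned by the call and
# $what is a free-text label used only in the error message.
sub result_or_die {
    my ($som, $what) = @_;
    die "$what: no response object returned\n" unless defined $som;
    if ($som->fault()) {
        die sprintf("%s failed (%s): %s\n",
                    $what, $som->faultcode(), $som->faultstring());
    }
    return $som->result();
}
```

A call then reads, for example, my $ticket = result_or_die($ws->call('getTicket'), 'getTicket'); and a pipeline can catch the error with eval if it should continue after a failure.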
Before contacting the ACLAME team about an error, please check 1) the content of the error message, 2) that the call was made properly, 3) that the ACLAME web site is not down and 4) that you have proper internet access.
If you need to report a problem to the ACLAME team, provide the following information:
  • The operating system being used
  • The (piece of) source code being used to access the web service
  • The compressed GenBank file if the problem is related to that part

Accessing the predictions

To get the list of GenBank accession numbers that have been screened by Prophinder, call the getAccList method.
my $som = $ws->call('getAccList');
unless ($som->fault()) {
	my $list = $som->result();
	unless ($list) {
		print STDERR "No accession ID returned by the web service!\n";
	} else {
		print "Accession numbers are: ", $list, "\n";
		# Convert it in an array
		my @list = split(/\s*,\s*/, $list);
	}
}
To get the consensus predictions for a given accession number, call the getProphages method with only the accession number as argument. If no prophage is found, an empty string is returned.
The accession argument can be either a GenBank accession number corresponding to an existing bacterial chromosome or plasmid, or a Prophinder prediction ticket (see above). In the second case, you retrieve the predictions made on your submitted GenBank file.
$som = $ws->call('getProphages', ('accession' => 'NC_002655', 'wsizes' => ''));
unless ($som->fault()) {
	my $prophages = $som->result();
	unless ($prophages) {
		print "No prophage predicted in this genome.\n";
	} else {
		print "Predicted prophages are:\n$prophages\n";
		# Extract each prophage in a separate line:
		my @prophages = split(/\n/, $prophages);
		# Extract data for each prophage:
		foreach my $proph (@prophages) {
			my @data = split(/\t/, $proph);
			print "Proph ID = ", $data[0], "\n";
			print "Genomic coord start = ", $data[1], "\n";
			print "Genomic coord end = ", $data[2], "\n";
			print "Sig score = ", $data[3], "\n";
			print "Iteration = ", $data[4], "\n";
			print "Window size = ", $data[5], "\n";
		}
	}
}
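In a pipeline it is usually more convenient to keep the predictions as data than to print them. The following sketch turns the text returned by getProphages into a list of hash references, using the column order shown in the example above (one prophage per line, tab-separated fields). The function name parse_prophages and the hash keys are illustrative choices.

```perl
use strict;
use warnings;

# Column order as printed in the getProphages example above.
my @PROPHAGE_FIELDS = qw(id start end score iteration window_size);

# Convert the getProphages result into a list of hash references,
# one per prophage. An empty or undefined result gives an empty list.
sub parse_prophages {
    my ($text) = @_;
    return () unless defined $text && length $text;
    my @records;
    foreach my $line (split /\r?\n/, $text) {
        next unless $line =~ /\S/;    # skip blank lines
        my @data = split /\t/, $line;
        my %rec;
        @rec{@PROPHAGE_FIELDS} = @data[0 .. $#PROPHAGE_FIELDS];
        push @records, \%rec;
    }
    return @records;
}
```

With my @prophages = parse_prophages($som->result()); each prediction can then be accessed by name, for example $prophages[0]{start} and $prophages[0]{end}.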
To get the predictions for specific window sizes, set the 'wsizes' argument to one or several values separated by commas (',' character). Valid values should be 20, 50, 100, 200 and 300.
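The 'wsizes' string can be checked on the client before the call is made, so that a typing error does not cost a round trip to the server. This is a small sketch under the assumption that the five values listed above are the only ones accepted; wsizes_arg is a hypothetical helper name.

```perl
use strict;
use warnings;

# Window sizes listed above as valid for the 'wsizes' argument.
my %VALID_WSIZE = map { $_ => 1 } (20, 50, 100, 200, 300);

# Turn a list of window sizes into the comma-separated 'wsizes' string.
# Dies if a value is not in the list above. An empty list gives the
# empty string, as used in the getProphages example.
sub wsizes_arg {
    my @sizes = @_;
    my @bad = grep { !exists $VALID_WSIZE{$_} } @sizes;
    die "Invalid window size(s): @bad\n" if @bad;
    return join(',', @sizes);
}
```

The result can be passed directly in the call, for example 'wsizes' => wsizes_arg(100, 300).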
To retrieve data from the direct repeat (DR) search, use the getDRs method with a prophage ID as argument. If no DR is found, an empty string is returned. The prophage ID can be given in its full form, <NC accession number>:<dataset>:<prophage UID>, as <dataset>:<prophage UID>, or as the unique identifier (UID) alone, <prophage UID>. In the example below, using just '35531' as prophage ID would be sufficient:
$som = $ws->call('getDRs', ('prophId' => 'NC_002655:prophinder:35531'));
unless ($som->fault()) {
	my $DRs = $som->result();
	unless ($DRs) {
		print "No direct repeat found for this prophage.\n";
	} else {
		print "Direct repeats are:\n$DRs\n";
		# Extract each DR in a separate line:
		my @DRs = split(/\n/, $DRs);
		# Extract data for each DR:
		foreach my $dr (@DRs) {
			my @data = split(/\t/, $dr);
			print "Genomic coord left to prophage = ", $data[0], "\n";
			print "Genomic coord right to prophage = ", $data[1], "\n";
			print "DR length = ", $data[2], "\n";
		}
	}
}
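Because a prophage ID can come in three forms, a pipeline that stores IDs may want to split them into their components. The sketch below does this for the three forms described above; parse_prophage_id and the hash keys are hypothetical names, and the only assumption about the format is that ':' separates the components.

```perl
use strict;
use warnings;

# Split a prophage ID into accession, dataset and UID. The accession and
# the dataset are undef when the ID is given in one of its shorter forms.
sub parse_prophage_id {
    my ($id) = @_;
    die "No prophage ID given\n" unless defined $id;
    my @parts = split /:/, $id, -1;
    die "Unrecognised prophage ID '$id'\n"
        if @parts < 1 || @parts > 3 || grep { !length } @parts;

    my %rec = (accession => undef, dataset => undef);
    $rec{uid}       = pop @parts;
    $rec{dataset}   = pop @parts if @parts;
    $rec{accession} = pop @parts if @parts;
    return \%rec;
}
```

For the ID used in the example above, parse_prophage_id('NC_002655:prophinder:35531') gives the accession 'NC_002655', the dataset 'prophinder' and the UID '35531'.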
The Prophinder web service also provides a method that returns the GenBank entry with the detected prophages annotated in the genome. Call the getAnnotatedGbk method with the accession number. Note that at present the entire GenBank file is returned without compression. This can consume a lot of memory, in particular with Java. The method could change in the future to return the GenBank file as a gzipped attachment to the response.
$som = $ws->call('getAnnotatedGbk', ('accession' => 'NC_002655', 'wsizes' => ''));
unless ($som->fault()) {
	print "The annotated GenBank entry is:\n", $som->result(), "\n";
}
You can get the GenBank file annotated with prophages for specific window sizes, as described for the getProphages method.

