Installation of BLAST 2.2.22

Additional libraries

None needed

Procedure

For the past couple of years, there have been two versions of NCBI BLAST in the wild: BLAST and BLAST+, a complete rewrite of BLAST in C++. Although it has many advantages (it is easier to use and faster, among other things), BLAST+ has one big problem: it takes up far more space than BLAST, which matters for a VM-based distro like Impilo, whose goal is to keep its footprint small. Because of that, I chose to install the original BLAST. This will probably change in the near future, though, simply because I want to put the best tools in Impilo…

Another choice had to be made: should I use the source code or pre-compiled binaries? Because the source code includes a lot of material pertaining to GUIs and other extra libraries, I decided to use pre-compiled binaries.

I also installed the corresponding NETBLAST, a network-based BLAST client, since we might not always have the databases available locally for the courses ;-)

Here is my procedure to install BLAST/NETBLAST from an archive that has the pre-compiled binaries:

  • Download the appropriate archives into /home/bioubuntu (choose either blast-2.2.22-ia32-linux.tar.gz or blast-2.2.22-x64-linux.tar.gz, depending on whether you are building a 32-bit or a 64-bit Impilo), extract them and move the resulting folders to /opt/bio/sources/.
% cd /home/bioubuntu
% wget ftp://ftp.ncbi.nlm.nih.gov/blast/executables/release/2.2.22/blast-2.2.22-x64-linux.tar.gz
% wget ftp://ftp.ncbi.nlm.nih.gov/blast/executables/release/2.2.22/netblast-2.2.22-x64-linux.tar.gz
% tar -zxvf blast-2.2.22-x64-linux.tar.gz
% tar -zxvf netblast-2.2.22-x64-linux.tar.gz
% sudo mv blast-2.2.22 netblast-2.2.22 /opt/bio/sources/
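  • The folders /opt/bio/sources/blast-2.2.22 and /opt/bio/sources/netblast-2.2.22 should belong to root and their permissions should be 755.
% sudo chown -R root:root /opt/bio/sources/blast-2.2.22
% sudo chown -R root:root /opt/bio/sources/netblast-2.2.22
% sudo chmod 755 /opt/bio/sources/blast-2.2.22 /opt/bio/sources/netblast-2.2.22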
  • Since the applications that we need are under /opt/bio/sources/blast-2.2.22/bin and /opt/bio/sources/netblast-2.2.22/bin, we need to add these two locations to our PATH. There is more than one way of doing this, but I chose to put it in /etc/profile.
% sudo nano /etc/profile
  • At the very end of the file, add the following lines:
#
# BLAST/NETBLAST specific environment variables 
#
PATH=/opt/bio/sources/blast-2.2.22/bin:/opt/bio/sources/netblast-2.2.22/bin:$PATH
  • The last thing to do is to create a text file named .ncbirc inside the bioubuntu home folder. This file must contain the following lines:
[NCBI]
DATA=/opt/bio/sources/blast-2.2.22/data
 
[BLAST]
BLASTDB=/opt/bio/data/blastdb
BLASTMAT=/opt/bio/sources/blast-2.2.22/data
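
The archive choice and the .ncbirc file from the steps above can also be scripted. What follows is only a minimal sketch, not part of the original procedure: the function names blast_arch_suffix and write_ncbirc are hypothetical, and it assumes that `uname -m` prints x86_64 on a 64-bit system and something like i686 on a 32-bit one.

```shell
#!/bin/sh
# Hypothetical helper: map a machine name (as printed by `uname -m`)
# to the suffix used in the archive file names above.
blast_arch_suffix() {
    case "$1" in
        x86_64) echo "x64" ;;
        i?86)   echo "ia32" ;;
        *)      echo "unsupported machine type: $1" >&2; return 1 ;;
    esac
}

# Hypothetical helper: write the .ncbirc content shown above to the given path.
write_ncbirc() {
    cat > "$1" <<'EOF'
[NCBI]
DATA=/opt/bio/sources/blast-2.2.22/data

[BLAST]
BLASTDB=/opt/bio/data/blastdb
BLASTMAT=/opt/bio/sources/blast-2.2.22/data
EOF
}

# Print the name of the BLAST archive that matches the current machine.
if suffix=$(blast_arch_suffix "$(uname -m)"); then
    echo "blast-2.2.22-${suffix}-linux.tar.gz"
fi
```

Calling `write_ncbirc /home/bioubuntu/.ncbirc` as the bioubuntu user would create the file described in the last step.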

Creating the BLAST databases

BLAST uses specially formatted databases created from text files written in FASTA format. There is no specific place where these databases must live (this is why there is a BLASTDB=… line inside the .ncbirc file), but to keep things orderly, Impilo puts all data files used by the various applications under /opt/bio/data, in this case /opt/bio/data/blastdb. Because space is at a premium in this project, I only provide two small databases, created from the nucleotide and amino acid sequences of the E. coli DH10B bacterium.

Here is the recipe used to create them. Apply it to any FASTA-formatted file that you want to turn into a BLAST database.

  • First, since only root can write into /opt/bio/data/blastdb, you need to become root:
% sudo su
  • Navigate to /opt/bio/data/blastdb and move the FASTA-formatted file there. It will be used to create the new database:
% cd /opt/bio/data/blastdb
% mv /home/bioubuntu/this_is_example.fa .
  • The application named formatdb is in charge of creating the database. The database type (nucleotide or amino acid) is selected using the -p flag, with either T (amino acid) or F (nucleotide) as its parameter:
% formatdb -i <this_is_example.fa> -p T -n <local_db_name>
  • You can test this new database using blastall (blastp for an amino acid database, as here; blastn for a nucleotide one):
% blastall -p blastp -d <local_db_name> -i <your_sequence>
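
When many FASTA files need converting, picking the right -p value by hand gets tedious. The sketch below shows one possible way to guess it; it is an assumption of mine and not part of the original recipe. The function names guess_p_flag and formatdb_command are hypothetical, and the heuristic is deliberately crude: a file whose sequence lines contain only A, C, G, T, U or N is treated as nucleotide, anything else as amino acid. The second function only prints the formatdb command so that it can be checked before being run.

```shell
#!/bin/sh
# Hypothetical helper: print T (amino acid) or F (nucleotide) for a FASTA file.
# Heuristic only: IUPAC ambiguity codes such as R or Y in a nucleotide file
# would be misread as amino acids.
guess_p_flag() {
    # Drop header lines and whitespace, then look for any character
    # outside the plain nucleotide alphabet.
    if grep -v '^>' "$1" | tr -d ' \t\r\n' | grep -q '[^ACGTUNacgtun]'; then
        echo T
    else
        echo F
    fi
}

# Hypothetical helper: print (not run) the formatdb command for a
# FASTA file ($1) and a local database name ($2).
formatdb_command() {
    echo "formatdb -i $1 -p $(guess_p_flag "$1") -n $2"
}
```

For example, `formatdb_command this_is_example.fa my_db` prints the command to review; once it looks right, run it as root from /opt/bio/data/blastdb.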