For the past few years, two NCBI BLAST versions have been available: BLAST and BLAST+, a full rewrite of the source code in C++. The regular BLAST was preferred because it was more compact; however, the BLAST development team has stopped all work on the old BLAST. Since the Impilo policy is to use the latest version, BLAST+ is now the version used in Impilo.
Another choice had to be made: source or binary? Again, the criterion was disk space, and the binary occupies much less.
Here is the procedure used to install BLAST, from archive to executable:
Download the archive into /home/bioubuntu, decompress it and move the resulting folder into /opt/bio/sources/.
% wget ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/2.2.26/ncbi-blast-2.2.26+-x64-linux.tar.gz
% tar -zxvf ncbi-blast-2.2.26+-x64-linux.tar.gz
% sudo mv ncbi-blast-2.2.26+ /opt/bio/sources
The /opt/bio/sources/ncbi-blast-2.2.26+ folder should belong to root and its permissions should be set to 755.
% sudo chown -R root:root /opt/bio/sources/ncbi-blast-2.2.26+
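The ownership and mode mentioned above can be verified with a short shell sketch. The helper name `describe_dir` is hypothetical, `stat -c` assumes GNU coreutils (as shipped with Ubuntu), and the `chmod 755` line in the comments is an assumed way of setting the mode, since the original procedure only shows the `chown` step.

```shell
#!/bin/sh
# Hypothetical helper: print "owner:group mode" for a directory.
# Assumes GNU stat (coreutils), as found on Ubuntu.
describe_dir() {
    stat -c '%U:%G %a' "$1"
}

# Assumed usage on the Impilo layout (not run here):
#   sudo chmod 755 /opt/bio/sources/ncbi-blast-2.2.26+
#   describe_dir /opt/bio/sources/ncbi-blast-2.2.26+
# A correctly set up folder should be reported as: root:root 755
```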
The executables are in /opt/bio/sources/ncbi-blast-2.2.26+/bin, which needs to be added to PATH. There are many ways to do so, but I decided to put this information in /etc/profile.d/impilo.sh.
% sudo nano /etc/profile.d/impilo.sh
#
# BLAST specific environment variables
#
export PATH=$PATH:/opt/bio/sources/ncbi-blast-2.2.26+/bin
export BLASTDB=/opt/bio/data/blast_db
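Files in /etc/profile.d can be sourced more than once (for example in nested login shells), in which case a plain `PATH=$PATH:...` line appends the same directory repeatedly. A minimal sketch of a guarded variant follows; the helper name `path_append` is hypothetical and is not part of the Impilo script.

```shell
# Hypothetical helper: append a directory to PATH only if it is not already listed.
path_append() {
    case ":$PATH:" in
        *":$1:"*) ;;                # already present, nothing to do
        *) PATH="$PATH:$1" ;;       # otherwise append at the end
    esac
}

path_append /opt/bio/sources/ncbi-blast-2.2.26+/bin
export PATH
export BLASTDB=/opt/bio/data/blast_db
```

Sourcing this fragment a second time leaves PATH unchanged, which is the only difference from the plain export.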
BLAST uses specially formatted database files, usually created from source files written in FASTA format. There is no strict location for these files (that's what the export BLASTDB… line in /etc/profile.d/impilo.sh is for), but to keep things tidy, Impilo puts them in /opt/bio/data, more precisely in /opt/bio/data/blast_db. Because we want to keep things small, only two small sequence files, derived from E. coli DH10, are provided for teaching:
Here is a general recipe to create these database files.
Since only root can write into /opt/bio/data/blast_db, you must become root:
% sudo su -
We go into /opt/bio/data/blast_db and move the FASTA files into it:
% cd /opt/bio/data/blast_db
% mv /home/bioubuntu/this_is_an_example.fa .
The makeblastdb program will create the database file. The type of database is selected via the -dbtype parameter, with prot (amino acid) or nucl (nucleotide) as values:
% makeblastdb -in <this_is_an_example.fa> -dbtype <prot|nucl> -out <name_local_db> -title <name_local_db>
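Since -dbtype only makes sense here with the two values named above, a small wrapper can reject anything else before makeblastdb is called. This is only a sketch: the function name `make_db` and its argument order are hypothetical, while the makeblastdb options are the ones used in the recipe.

```shell
# Hypothetical wrapper: validate the database type, then call makeblastdb.
# Usage: make_db <fasta_file> <prot|nucl> <name_local_db>
make_db() {
    case "$2" in
        prot|nucl) ;;               # accepted values
        *)
            echo "dbtype must be prot or nucl, got: $2" >&2
            return 1
            ;;
    esac
    # Same options as in the recipe; the database name doubles as its title.
    makeblastdb -in "$1" -dbtype "$2" -out "$3" -title "$3"
}

# Assumed usage, as root, from /opt/bio/data/blast_db:
#   make_db this_is_an_example.fa nucl <name_local_db>
```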
The database can then be searched with blastn or blastp:
% blastp -db <name_local_db> -query <your_sequence>
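Which of the two programs applies depends on the database type chosen at creation time: blastp searches a prot database with a protein query, and blastn searches a nucl database with a nucleotide query. A minimal sketch of that mapping, assuming the query is of the same type as the database (BLAST+ also ships translated searches such as blastx and tblastn, which are not covered here); the function name `blast_program_for` is hypothetical.

```shell
# Hypothetical helper: print the BLAST+ program matching a database type,
# assuming the query sequence is of the same type as the database.
blast_program_for() {
    case "$1" in
        prot) echo blastp ;;
        nucl) echo blastn ;;
        *)    echo "unknown database type: $1" >&2; return 1 ;;
    esac
}

# Assumed usage:
#   "$(blast_program_for nucl)" -db <name_local_db> -query <your_sequence>
```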