BLAST 2.2.26+ Installation instructions
Additional Libraries
- No additional library is required
Procedure
For the past few years, two NCBI BLAST versions have been available: BLAST and BLAST+, a full rewrite of the source code in C++. The regular BLAST was preferred because it was more compact; however, the BLAST development team has stopped all work on the old BLAST. Since the Impilo policy is to use the latest version, BLAST+ is now the version used in Impilo.
Another choice had to be made: source or binary? Again, the criterion was disk space, and the binary distribution occupies much less of it.
Here is the procedure used to install BLAST, from archive to executable:
- Download the appropriate archive to /home/bioubuntu, decompress it and move the resulting folder to /opt/bio/sources/.
% wget ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/2.2.26/ncbi-blast-2.2.26+-x64-linux.tar.gz
% tar -zxvf ncbi-blast-2.2.26+-x64-linux.tar.gz
% sudo mv ncbi-blast-2.2.26+ /opt/bio/sources
- The /opt/bio/sources/ncbi-blast-2.2.26+ folder should belong to root and its permissions should be set to 755.
% sudo chown -R root:root /opt/bio/sources/ncbi-blast-2.2.26+
- Because the applications are in /opt/bio/sources/ncbi-blast-2.2.26+/bin, this directory needs to be added to PATH. There are many ways to do so, but I decided to put this information in /etc/profile.d/impilo.sh.
% sudo nano /etc/profile.d/impilo.sh
- At the very end of the file, you need to add the following lines:
#
# BLAST specific environment variables
#
export PATH=$PATH:/opt/bio/sources/ncbi-blast-2.2.26+/bin
- The last thing to do is to add location information for the sequence databases used by BLAST, in the same file:
export BLASTDB=/opt/bio/data/blast_db
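A profile script can end up being sourced more than once (for example in a nested login shell), in which case a plain export PATH=$PATH:… line appends the same directory again each time. The following is a minimal sketch of a guarded variant of the two settings above, assuming a POSIX-compatible shell; the helper name append_path_once is hypothetical and is not part of Impilo or BLAST+.

```shell
# Hypothetical helper: append a directory to PATH only if it is not already listed.
append_path_once() {
    case ":$PATH:" in
        *":$1:"*) ;;              # already present, leave PATH untouched
        *) PATH="$PATH:$1" ;;     # not present, append it
    esac
}

# Same values as in the steps above.
append_path_once /opt/bio/sources/ncbi-blast-2.2.26+/bin
export PATH
export BLASTDB=/opt/bio/data/blast_db
```

After opening a new login shell, running command -v blastn and echo "$BLASTDB" is a quick way to check that both settings were picked up.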
Creating sequence databases to be used by BLAST
BLAST uses specially formatted database files, usually created from source files written in FASTA format. There is no strict location for these files (that is what the export BLASTDB… line in /etc/profile.d/impilo.sh is for), but to keep things tidy, Impilo puts them in /opt/bio/data, more precisely in /opt/bio/data/blast_db. Because we want to keep things small, only two small sequence files, derived from E. coli DH10, are provided for teaching.
Here is a general recipe to create these database files.
- First, since only root can write into /opt/bio/data/blast_db, you need to become root:
% sudo su -
- Change to /opt/bio/data/blast_db and move the FASTA files into it:
% cd /opt/bio/data/blast_db
% mv /home/bioubuntu/this_is_an_example.fa .
- The makeblastdb program will create the database files. The type of database is selected via the -dbtype parameter, with prot (amino acid) or nucl (nucleotide) as values:
% makeblastdb -in <this_is_an_example.fa> -dbtype <prot_or_nucl> -out <name_local_db> -title <name_local_db>
- You can test your new DB with blastn or blastp:
% blastp -db <name_local_db> -query <your_sequence>
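The recipe above can be tried end to end on a throwaway file before touching /opt/bio/data/blast_db. The sketch below is only an illustration: it works in a scratch directory, and the file name toy_nucl.fa, the database name toy_nucl_db and the two sequences are all made up for the example. It calls BLAST+ only if makeblastdb is found on PATH.

```shell
#!/bin/sh
# Toy walk-through in a scratch directory; every name and sequence here is hypothetical.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# A FASTA file is a '>' header line followed by one or more sequence lines.
cat > toy_nucl.fa <<'EOF'
>seq1 hypothetical example sequence
ATGGCGTACGTTAGCCTAGGCTTAACGGATCCGATAGCTAGCTAGGCTA
>seq2 hypothetical example sequence
ATGCCGTTAGGCTAACGGTTAACCGGATATCGCGATTAGCCGATCGGAT
EOF

# Count the sequences by counting header lines.
n_seqs=$(grep -c '^>' toy_nucl.fa)
echo "sequences in toy_nucl.fa: $n_seqs"

# Build and query the database only if BLAST+ is actually installed.
if command -v makeblastdb >/dev/null 2>&1; then
    # Nucleotide input, so -dbtype is nucl.
    makeblastdb -in toy_nucl.fa -dbtype nucl -out toy_nucl_db -title toy_nucl_db
    # Search the database with its own source file; -outfmt 6 requests tabular output.
    blastn -db toy_nucl_db -query toy_nucl.fa -outfmt 6
else
    echo "makeblastdb not found on PATH; skipping database creation"
fi
```

For a protein FASTA file, the same steps would use -dbtype prot and blastp instead of blastn.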