IUBio

Problems with BLAST formatdb

Armin Ollig ollig at biofrontera.de
Mon Nov 15 13:36:30 EST 1999


Hi Gary,

Gary Williams wrote:
> 
> I wonder if anyone can help me:
> 
> We format our own BLAST databases and are interested in working with formatdb's
> new '-v' option.
> 
> It is not clear what number should be used for the -v option.  With a
> database of about 4 Mb, any number greater than 1 seems to produce one
> volume.
> 
> What should the number be?

No idea, since my version does not have that switch (VERSION is "Mon May
10 15:18:43 EDT 1999").

So what does this switch do?

> 
> The greater problem is that when I try to use formatdb on a database of
> size greater than 2 Gb (actually 2.1 Gb) it gives the following report
> in formatdb.log and the index files are created with zero size.
> 
> Version 2.0.10 [Aug-26-1999]
> Started database file "htg"
> NOTE: CoreLib [002.003] FileOpen("htg","r") failed
> 
> I am using the executable for Solaris (version 2.0.10), as provided on
> NCBI's FTP site, on a SUN Enterprise 6000 running Solaris 2.7
> 
> Has anyone had any success in using formatdb on SUNs with files over 2 Gb?

Yes, this is a common problem.
It is caused by the file-access calls that blast uses: a program built
without large-file support cannot open files of 2^31 bytes (2 Gb) or
more. Read largefile(5) (man largefile) in Solaris for a further
description of this problem.
Solaris 2.7 is a fully 64-bit system, so it should be easy for NCBI to
make a 64-bit Solaris 2.7 version of blast that can handle files larger
than 2^31 bytes. As a side effect, such a blast could easily handle
databases of more than 2^32 sequences (ESTs plus GenEMBL). It is also
possible to make blast largefile-aware even on 32-bit Solaris systems
(<= Solaris 2.6). Since all of these changes have to be made in the
source code, you will need to ask NCBI for them.
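
To make the 2^31-byte limit concrete, here is a minimal Python sketch
(it is not part of blast or formatdb). It assumes that "2.1 Gb" means
2.1 * 2^30 bytes and that the limit is the largest value of a signed
32-bit file offset. The names MAX_32BIT_OFFSET, fits_in_32bit_offset
and htg_size are made up for illustration.

```python
# Sketch: does a file size fit in a signed 32-bit file offset?
# Assumption: "2.1 Gb" in the report means 2.1 * 2**30 bytes. If it
# meant 2.1 * 10**9 bytes instead, the file would still fit.

MAX_32BIT_OFFSET = 2**31 - 1  # largest value of a signed 32-bit offset


def fits_in_32bit_offset(size_bytes: int) -> bool:
    """Return True if the size is representable as a signed 32-bit offset."""
    return size_bytes <= MAX_32BIT_OFFSET


# Assumed size in bytes of the "htg" file from the report (hypothetical value).
htg_size = int(2.1 * 2**30)

print("limit:", MAX_32BIT_OFFSET)
print("htg fits:", fits_in_32bit_offset(htg_size))
print("4 Mb database fits:", fits_in_32bit_offset(4 * 2**20))
```

A largefile-aware or 64-bit build uses 64-bit offsets instead, which
raises the ceiling to 2^63 - 1 bytes.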

We use a special SGI version of blast (64-bit) with several other
enhancements for better performance.

best regards,
--Armin


> 
> Many thanks,
> 
> Gary Williams               Tel: +44 1223 494522  Fax: +44 1223 494512
> mailto:G.Williams at hgmp.mrc.ac.uk            http://www.hgmp.mrc.ac.uk/
> Bioinformatics,MRC HGMP Resource Centre,Hinxton,Cambridge, CB10 1SB,UK
> ---

--
"To save energy
    the light at the end of the tunnel
         will temporarily be switched off."





More information about the Bio-soft mailing list
