In article <36DD77A4.A62F9B42 at uci.edu>, Harry Mangalam <mangalam at uci.edu> writes:
>Hi - I've been asked to eval setting up a local BLAST server. For
>compatibility with other requirements, the OS will be Linux, unless
>someone can give me extremely good reasons not to.
>
>I'm considering using a dual Pentium II/Celeron, unless I can get
>feedback that BLAST does not work well on such a system, in which case
>we would run either a uni-PII/Celeron or if more $ can be scraped up, an
>Alpha (but having experience with Linux/Alpha, I know some of the
>problems with that approach).
>
>So, does anyone out there have experience with BLAST on an SMP Intel box
>(dual or quad (PPro or Xeon - I know that PII's can't go there) and if
>so, on what kernel and what were the gotchas?
We run BLAST locally on a dual PII 400 in a machine we got from VAResearch.
The kernel is close to RedHat 5.1. VAR put in some stuff to support the
U2W drives, which were bleeding edge technology at the time we bought it.
There is a thread parameter for BLASTALL, so it at least claims that it can
use more than one processor at a time, but the PIIs can only go to dual
processor in any case.
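A rough sketch of how the thread parameter might be passed, for anyone scripting this. It only builds the command line. As far as I recall the switch is -a (number of processors to use), but check blastall's own usage output before relying on it. The function name, database name, and file names here are made up for illustration.

```python
import shlex


def blastall_command(program, database, query_file, output_file, threads=2):
    """Build an argument list for a blastall run using `threads` processors.

    All names passed in are placeholders chosen by the caller; nothing is
    executed here, so this is safe to try on a machine without BLAST.
    """
    if threads < 1:
        raise ValueError("threads must be at least 1")
    return [
        "blastall",
        "-p", program,       # search program, e.g. blastp
        "-d", database,      # name of the formatted database
        "-i", query_file,    # FASTA file holding the query sequences
        "-o", output_file,   # file to write the report to
        "-a", str(threads),  # number of processors to use (assumed flag)
    ]


if __name__ == "__main__":
    # Hypothetical file names; on a dual PII there is no point asking for more than 2.
    cmd = blastall_command("blastp", "genpept", "query.fasta", "query.out", threads=2)
    print(" ".join(shlex.quote(part) for part in cmd))
```

The list can be handed to subprocess.run() as is on a machine where blastall is on the PATH.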
>
>Also rec's on minimum memory for maximum efficiency would be
>appreciated.
Lots. Running a query just against Genpept I've seen the blast process go
over 128 MB. Our server has 256 MB in it, so I didn't try running two
BLAST jobs at once, lest it bog down swapping virtual memory in and out. There
may be a way to restrict memory safely, but blastall doesn't have an
explicit switch for doing so.
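One way to impose a cap from outside the program is an operating system resource limit on the child process. The sketch below is an assumption about how one might do that on Linux from Python, not a BLAST feature: a capped job that needs more memory will most likely fail on allocation rather than run in less, so whether this counts as "safe" is untested. The helper name run_with_memory_cap is made up for illustration.

```python
import resource
import subprocess
import sys


def run_with_memory_cap(cmd, max_bytes):
    """Run cmd with its virtual address space (RLIMIT_AS) capped at max_bytes."""

    def limit_address_space():
        # Runs in the child just before exec, so the parent is not affected.
        _, hard = resource.getrlimit(resource.RLIMIT_AS)
        soft = max_bytes
        if hard != resource.RLIM_INFINITY:
            soft = min(max_bytes, hard)  # the soft limit may not exceed the hard limit
        resource.setrlimit(resource.RLIMIT_AS, (soft, hard))

    return subprocess.run(
        cmd, preexec_fn=limit_address_space, capture_output=True, text=True
    )


if __name__ == "__main__":
    cap = 1024 * 1024 * 1024  # 1 GiB, an arbitrary figure for the demonstration
    # Stand-in child that reports the limit it sees; a real run would pass
    # a blastall command line here instead.
    probe = [
        sys.executable,
        "-c",
        "import resource; print(resource.getrlimit(resource.RLIMIT_AS)[0])",
    ]
    result = run_with_memory_cap(probe, cap)
    print("child sees soft limit:", result.stdout.strip())
```

The shell's ulimit -v does much the same thing without any scripting.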
Linux caches what it can automatically. If this is to be a pure BLAST
server, and you can buy enough memory to keep the desired databases
entirely in memory, it will run much faster. The effect shows up for FASTA
too: run it twice in a row against a small database and it will be much
faster the second time. Pure memory caching is feasible for Swiss-Prot, or
maybe Genpept. Seems like a lot of money if you want Genbank to stay in
memory though!
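The caching effect can be provoked on purpose by reading the database files once before the first search, so the kernel already has them in its page cache. A minimal sketch, assuming the database is a set of ordinary files and that there is enough free memory to hold them. The function name warm_cache is hypothetical, and the demonstration uses a throwaway temporary file rather than a real database.

```python
import os
import tempfile


def warm_cache(paths, chunk_size=1 << 20):
    """Read each file from start to end and return the total bytes read.

    The data is thrown away; the point is only that the kernel keeps what
    it has just read in its cache, memory permitting.
    """
    total = 0
    for path in paths:
        with open(path, "rb") as handle:
            while True:
                chunk = handle.read(chunk_size)
                if not chunk:
                    break
                total += len(chunk)
    return total


if __name__ == "__main__":
    # Stand-in for a database file, so the sketch runs anywhere.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"A" * 5000)
        demo_path = tmp.name
    try:
        print("bytes read:", warm_cache([demo_path]))
    finally:
        os.remove(demo_path)
```

Comparing the returned total with the machine's free memory gives a quick idea of whether a given database can stay resident.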
The biggest job we've run on this box was 16000 BLASTP sequences against
Genpept, which took 9 days, 2 hours, and some change to complete.
We don't have the disks striped yet; that should speed things up a bit.
Even so, the U2W disks are pretty fast.
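For sizing purposes, the 16000-sequence job works out to an average wall-clock time per query as sketched below. This ignores the "and some change" and says nothing about how the time was spread across individual queries. The function name is made up for illustration.

```python
def seconds_per_query(n_queries, days, hours):
    """Average wall-clock seconds per query for a batch job."""
    total_seconds = (days * 24 + hours) * 3600
    return total_seconds / n_queries


if __name__ == "__main__":
    # 9 days 2 hours = 218 hours = 784800 seconds, spread over 16000 queries.
    print(round(seconds_per_query(16000, 9, 2), 2))  # 49.05
```

That is a little under 50 seconds per sequence, or roughly 73 sequences an hour.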
Regards,
David Mathog
mathog at seqaxp.bio.caltech.edu
Manager, sequence analysis facility, biology division, Caltech