On 2006-07-17, Brannon <brannonking at yahoo.com> wrote:
> I'm confused on BLAST file formats and somewhat on the BLAST tool
> structure itself. I have no experience with BLAST, but I recognize
> BLAST can read several input formats including FASTA.
>
> Assume I'm using the latest version of BLAST. It seems to me there
> would be three file stages. First would be the input files to be
> processed with some heuristic program. Second would be the output
> files from that tool; these output files would also be the input files
> to a tool that would produce the exact alignment. So the third stage
> files would be the alignment files themselves. Is that even remotely
> close to reality?
>
> What I really want to know is the file format of the stage two files --
> the output of the BLAST tools before they do the sequence alignment.
> Where do I get that information?

There are two different implementations of BLAST, with two different file
structures: "wu-blast" from Washington University and NCBI BLAST from NCBI.
I believe that the NCBI version handles bigger databases and has been
upgraded more assiduously than the wu-blast version. I used to use
both, but have switched to using NCBI BLAST exclusively.

The formatdb command converts FASTA files into a set of database files
(with different formats for nucleic acids and proteins).
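
As a rough illustration of that conversion step, here is a minimal Python
sketch that builds a formatdb command line and lists the files I would expect
it to write. It only builds the command; it does not run it. The helper names
formatdb_command and expected_db_files are made up for this example. The -i
(input file) and -p (T for protein, F for nucleotide) options belong to the
legacy formatdb tool; newer BLAST+ releases use makeblastdb instead. The
extensions assume a small, single-volume database named after the input file.

```python
# Extensions the legacy formatdb tool writes for each database type
# (header, index, and sequence files). Assumes a single-volume database.
DB_EXTENSIONS = {
    "protein": (".phr", ".pin", ".psq"),
    "nucleotide": (".nhr", ".nin", ".nsq"),
}


def formatdb_command(fasta_path, is_protein):
    """Build (but do not run) a formatdb command line for a FASTA file."""
    return ["formatdb", "-i", str(fasta_path), "-p", "T" if is_protein else "F"]


def expected_db_files(fasta_path, is_protein):
    """List the database files formatdb would be expected to create."""
    kind = "protein" if is_protein else "nucleotide"
    return [str(fasta_path) + ext for ext in DB_EXTENSIONS[kind]]


if __name__ == "__main__":
    # Hypothetical input file name, for illustration only.
    print(" ".join(formatdb_command("seqs.fasta", is_protein=False)))
    for name in expected_db_files("seqs.fasta", is_protein=False):
        print(name)
```

The list returned by formatdb_command could be passed to subprocess.run on a
machine where formatdb is actually installed.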

I found formatdb.html on the web with the following information:

    DISCLAIMER: The internal structure of the BLAST databases is
    subject to change with little or no notice. The readdb API should
    be used to extract data from the BLAST databases. Readdb is part
    of the the NCBI toolkit
    (ftp://ncbi.nlm.nih.gov/toolbox/ncbi_tools/), readdb.h contains a
    list of supported function calls.

(the double "the" is in the original)

------------------------------------------------------------
Kevin Karplus   karplus at soe.ucsc.edu   http://www.soe.ucsc.edu/~karplus
Professor of Biomolecular Engineering, University of California, Santa Cruz
Undergraduate and Graduate Director, Bioinformatics
(Senior member, IEEE) (Board of Directors & Chair of Education Committee, ISCB)
life member (LAB, Adventure Cycling, American Youth Hostels)
Effective Cycling Instructor #218-ck (lapsed)
Affiliations for identification only.