In article <6q6jrt$h8p$1 at desdemone.pasteur.fr>,
Catherine Letondal <letondal at pasteur.fr> wrote:
>Hi,
>
>We maintain a copy of genbank (release + updates) as well as embl for
>blast 1.4 searches. The problem is that these databases have reached a
>size that exceeds the capacity of a 32-bit integer - and this version of
>blast is mainly based on such integer types (more exactly 31 bits, for the
>integers are not unsigned). As a result, mallocs of negative
>sizes occur.
>
>Of course, we also have the blast 2 NCBI and Washington-Univ. versions.
>We are aware that this version of blast is not maintained anymore at NCBI, but
>we keep the "old" 1.4 version of blast for compatibility reasons
>with blast output parsers (like bob, tbob, blast2html, ...).
>
>Updating the blast 1.4 sources by replacing the 32-bit types with long
>integers seems very hazardous ...
>
>Does someone here have the same problem, and some solution?
You don't specify what operating system you are using. I haven't
looked at the BLAST 1.4 sources, but file offsets should always use
the type off_t, rather than int.
On IRIX at least, off_t is defined in <sys/types.h>.
If you use this, your program will automatically be able to handle
file offsets as large as the OS supports. For example, in Solaris 2.x
for x<6, off_t is 32 bits, which is why Solaris 2.5 and earlier had a
maximum file size of approximately 2 GB.
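
As a rough sketch of the off_t advice (this is not taken from the BLAST
sources, which I haven't looked at): the helpers file_size and
seek_to_record below are hypothetical names, and the code assumes a POSIX
system that provides fseeko/ftello. The _FILE_OFFSET_BITS define is a
further assumption; on systems that support it, it requests a 64-bit
off_t even for 32-bit builds.

```c
#define _POSIX_C_SOURCE 200809L  /* makes fseeko/ftello visible */
#define _FILE_OFFSET_BITS 64     /* assumption: ask for a 64-bit off_t where supported */

#include <assert.h>
#include <stdio.h>
#include <sys/types.h>           /* off_t */

/* Hypothetical helper: return the size of an open file as an off_t,
 * or (off_t)-1 on error. The result never passes through an int. */
off_t file_size(FILE *fp)
{
    if (fseeko(fp, 0, SEEK_END) != 0)
        return (off_t)-1;
    return ftello(fp);
}

/* Hypothetical helper: position the stream at the start of record
 * number `index` in a file of fixed-size records. The multiplication
 * is done in off_t, so it is as wide as the OS file offset. */
int seek_to_record(FILE *fp, off_t index, off_t record_size)
{
    assert(index >= 0 && record_size > 0);
    return fseeko(fp, index * record_size, SEEK_SET);
}
```

The point is only that every variable holding a file position is declared
as off_t; how wide that turns out to be is then the platform's business.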
Similarly, malloc actually takes an argument of type size_t, not int.
If you change the source to compute its allocation sizes in that type,
you know that you will not end up giving malloc negative numbers, since
size_t is always an unsigned integer type (usually 32 or 64 bits wide,
depending on your operating system).
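
A minimal sketch of the size_t point, again not based on the BLAST code:
alloc_array is a hypothetical helper that does the size arithmetic in
size_t and refuses a request whose byte count would not fit. Note that
switching types does not by itself repair a size that was already
computed in an int; a negative int converted to size_t becomes a very
large value, which this check rejects rather than passing on.

```c
#include <assert.h>
#include <stdint.h>   /* SIZE_MAX */
#include <stdlib.h>   /* malloc, free, size_t */

/* Hypothetical helper: allocate `count` elements of `elem_size` bytes.
 * Returns NULL if count * elem_size would overflow size_t, or if
 * malloc itself fails. */
void *alloc_array(size_t count, size_t elem_size)
{
    assert(elem_size > 0);              /* a zero element size is a caller bug */
    if (count > SIZE_MAX / elem_size)   /* product would not fit in size_t */
        return NULL;
    return malloc(count * elem_size);
}
```

A caller would then write something like
buf = alloc_array(n, sizeof *buf), with n itself declared as size_t,
and check the result for NULL as with plain malloc.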
Tim.
--
--------------------------------------------------------------------------
Dr T J R Cutts Tel: +44 1223 333596
Dept. of Biochemistry, 80 Tennis Court Rd.
Cambridge, CB2 1GA, UK