>Date: 16 Jan 91 17:52:30 GMT
>From: nadkarni-prakash at cs.yale.edu
>Subject: Request for contig building/clone overlap detection algorithms
>
>I am looking for references to published algorithms - or programs
>written in C or C++ - which, given a bunch of clones (of known
>sequence) which are fragments of a larger DNA sequence, perform
>pairwise overlapping clone detection, and possibly go on from there to
>try to build a contig from this clone set. There is such a program
>written for the Cray by D.Tourney at Los Alamos- unfortunately, we
>don't have access to a Cray, and the rest of our software is C-based,
>so a description of the algorithm might be more useful for us.
>
>An algorithm which is reasonably tolerant of experimental error in the
>determination of the sequences of the individual clones would be
>preferred.
>
>I presume the algorithm for overlap detection would be partly based on
>the dynamic programming methods of standard sequence comparison
>algorithms (Smith-Waterman, FASTA etc.) and go on from there- but I'd
>appreciate specifics.
>
>I understand that the program itself will be extremely time-consuming
>(we use a parallel programming environment called C-Linda to run our
>programs on multiprocessor UNIX machines).
>
>Is there anyone at Los Alamos who can help out with this one?
>
>Please send a reply to nadkarni at cs.yale.edu
>
>Thanks a lot.
>
>Prakash Nadkarni
>Yale Medical School
>
I think there is already a program called SIM, written in C, that is
available. It is claimed to be fast, and it was written for Unix machines. It
might be obtained by ftp or by e-mail request from genbank.bio.net.
Also, I think the operating system matters more than the language in which the
program is written. A good example is that SIM, mentioned above, could not be
run on a VAX machine under the VMS operating system.
For your specific purpose, I think you would need to add something to SIM so
that the sequence comparison results can be aligned and written in a desired
format to facilitate later editing and compiling.
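To make the overlap-detection step concrete, here is a minimal sketch of one
common approach: a dynamic-programming alignment in which the end of one clone
is matched against the start of another, with mismatches and gaps penalized so
that a few sequencing errors are tolerated. This is not SIM's algorithm or the
Los Alamos program; the function name and the scoring values are assumptions
chosen only for illustration, and it is written in Python for brevity (the
loops carry over to C directly).

```python
def suffix_prefix_overlap(a, b, match=1, mismatch=-1, gap=-2):
    """Score the best alignment of a suffix of `a` with a prefix of `b`.

    Returns (score, j), where j is the number of leading characters of `b`
    that take part in the overlap. Scoring values are illustrative
    assumptions, not taken from any published program.
    """
    n, m = len(a), len(b)
    # Row 0: characters of b placed before a begins are charged as gaps,
    # so the alignment is anchored at the start of b.
    prev = [j * gap for j in range(m + 1)]
    for i in range(1, n + 1):
        # Column 0 stays 0: skipping a prefix of a costs nothing.
        cur = [0] * (m + 1)
        for j in range(1, m + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cur[j] = max(diag, prev[j] + gap, cur[j - 1] + gap)
        prev = cur
    # The alignment must reach the end of a; the unused tail of b is free.
    best_j = max(range(m + 1), key=lambda j: prev[j])
    return prev[best_j], best_j


# Exact 4-base overlap: "...TGCA" followed by "TGCA...".
score, length = suffix_prefix_overlap("ACGTTGCA", "TGCAGGTC")
```

Running this for every ordered pair of clones (and for reverse complements, if
orientation is unknown) and keeping the pairs whose score exceeds a chosen
threshold gives the overlap list from which a contig could then be assembled.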
Please let me know your results.
Good luck!
Tao Tao
Department of Microbiology
Medical College of Ohio at Toledo