In article <9310292144.AA27674 at farber.harvard.edu>, morrison at FARBER.HARVARD.EDU (Paul Morrison) writes...
>Harry Mangalam, VCO/Micro+Mol Genetics, Irvine Hall, Coll of Med, UC
>replied to kinneya at esvax.dnet.dupont.com:
>
>>In article <9310251805.AA10757 at esds01.es.dupont.com>,
>>kinneya at esvax.dnet.dupont.com wrote:
>>>
>>> >I'm looking for Macintosh-based software to help us edit ABI-format
>>> >sequence files. I would like something like the unix-based TED, which
>>> >prints the ABI-called sequence beneath the DNA migration absorption
>>> >curves and allows you to make adjustments.
>>>
>>> The "EDITSEQ" sequence editor software for the Mac from DNAStar (which is
>>> also part of their Lasergene package) will do this.
>>>
>>> Their phone number is (608)258-7420
>>
>> So will Sequencher, from Gene Codes (and is very expensive if all you're
>>looking for is an editor), but is a complete, powerful, and very easy to
>>use fragment assembly program. Their phone number is: 313 769 7249; ~$1200
>>w/ hardware lock, site/multiple copy discounts.
>> IMHO, Sequencher is the much better of the two for editing and fragment
>>assembly, but it is not a 'full-featured' Sequence analysis pkg ie no
>>protein analysis, not a lot of DNA analysis, which Lasergene does offer.
>>
>>
>>Std Disclaimers..
>>Harry
>>
>
>Yes indeed there is more than one contig builder out there and I agree
>completely with Harry that Sequencher is the most powerful. One thing that
>caught my attention in the original post was the wording "allows you to
>make adjustments". Now if that is meant to mean make adjustments to the
>called sequence by comparing to other sequences then yes, the programs
>mentioned do this. If it means make adjustments to the raw data signal
>that comes out of a 373A so that one might be able to tweak parameters so
>that more sequence is called, I don't think that program exists. It becomes
>evident after using the 373 for awhile that a program that would either
>circumvent or enhance the algorithm in ABI's analysis program might be a
>useful tool. What do I want this tool to do? 1) Allow me to get inside and
>change the base space number if I want. 2) Figure a way to get around the
>"all DNA is 25%A25%G25%C25%T" problem. These are _easy_ fixes. Others are
>hard. I hear rumours and papers of neural net analysis of this data.
>Anybody out there working on something that looks good?
>
>Paul Morrison Dana1030
>Molecular Biology Core Facility
>Dana Farber Cancer Institute
>44 Binney Street
>Boston, MA 02115
Hi,
Sorry to include all of the above with this reply, but I thought
the thread was worth including. The problem of improving base calling
from the ABI is non-trivial. We and others have been trying to do just
that for what seems like years now. ABI has also been working on the
problem for quite a while, and I'd look for something to surface
soon. The bottom line is THERE ARE NO "_easy_ fixes" to this problem.
Clark Tibbetts at Vanderbilt has done lots of work with neural
nets for DNA sequence analysis and demonstrated a working version of
his program at the Hilton Head Genome Sequencing meeting just last week.
He's published extensively on the subject and his papers make interesting
reading. He's trained a neural net that, according to what I saw, makes
more accurate base calls than ABI's software within the range from the
priming site out to about 400 bases. It's pretty cool stuff and he may
have something there.
Several other groups have looked at the ABI lane files and written
programs that allow one to proofread the data once ABI has manipulated it.
Sequencher, DNAStar, and SeqEd are three programs on the Mac that allow
the user to re-interpret the data (as discussed by others). On the Sun
SPARCstation, the Staden, et al. TED (trace editor) and XBAP also allow
such manipulations. All of these programs give the user access to
already-manipulated data and let the user make whatever changes seem
appropriate.
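
As a rough illustration of what proofreading against already-processed
traces involves, here is a minimal sketch. It is not how Sequencher,
DNAStar, SeqEd, TED, or XBAP work internally. The function name, the data
layout (one intensity list per base plus a scan position for each call),
and the 2x dominance threshold are all assumptions made up for this
example. It flags calls where the strongest channel does not clearly stand
out, which are the spots a person would want to look at by eye.

```python
def flag_ambiguous_calls(traces, calls, positions, min_ratio=2.0):
    """Return the indices of base calls that deserve a manual look.

    traces    -- dict mapping "A", "C", "G", "T" to equal-length lists of
                 processed signal intensities (hypothetical layout)
    calls     -- string of called bases, e.g. "ACGN"
    positions -- scan index in the traces for each called base
    min_ratio -- assumed threshold: the strongest channel must be at least
                 this many times the runner-up to count as a clean call
    """
    flagged = []
    for i, (base, pos) in enumerate(zip(calls, positions)):
        heights = sorted((traces[b][pos] for b in "ACGT"), reverse=True)
        top, second = heights[0], heights[1]
        if base not in traces:
            # Anything other than A/C/G/T (such as N) is always flagged.
            flagged.append(i)
        elif traces[base][pos] < top or top < min_ratio * second:
            # The called base is not the strongest channel, or the
            # strongest channel barely beats the next one.
            flagged.append(i)
    return flagged


# Made-up toy data: three clean peaks, then A and C overlapping at scan 4.
example_traces = {
    "A": [0, 10, 0, 0, 3],
    "C": [0, 0, 9, 0, 4],
    "G": [0, 0, 0, 8, 0],
    "T": [0, 0, 0, 0, 0],
}
print(flag_ambiguous_calls(example_traces, "ACGC", [1, 2, 3, 4]))  # [3]
```

Only the last call is flagged, because C (4) is less than twice A (3) at
that scan. A real editor would show the trace around each flagged position
and leave the decision to the user.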
As for dealing with the raw-raw data (the approx. 20 meg gel file),
the new ABI Analysis program, Clark's program, our program (which is not yet
ready for prime time), and others which are at even earlier stages, may
produce something that is better than the existing ABI Analysis program.
I could go on almost forever with the list of issues that need to be
addressed before a new and improved base calling algorithm becomes available.
The major bottleneck is that the sample spacing eventually reaches a point
where more than one band is in the window at the same time. This causes
problems in assigning base calls at longer read lengths, especially since
the signal usually gets weaker the further out you get. Longer gel plates
and a more focussed lens can help resolve this data. Several years ago I
presented results at the Cold Spring Harbor Human Genome meeting showing
data that was manually readable out to around 750 bases using 60 cm plates
and a 2x more focussed lens. ABI has now come out with a lowered optical
stage, is testing longer gel plates, and is attempting to improve their
base calling algorithm. Data presented at the Hilton Head meeting was
very exciting.
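
To make the band-overlap problem concrete, here is a minimal sketch. It is
not ABI's algorithm or anyone else's base caller. It assumes each band can
be modelled as a Gaussian peak of equal height and width, which is a
simplification, and the function names and numbers are invented for
illustration. It counts how many separate peaks survive when two such bands
are added together at a given spacing.

```python
import math


def gaussian(x, center, sigma):
    """Height of a unit-amplitude Gaussian band at position x."""
    return math.exp(-0.5 * ((x - center) / sigma) ** 2)


def count_peaks(spacing, sigma=1.0, step=0.01):
    """Count local maxima in the sum of two equal bands `spacing` apart.

    The signal is sampled on a grid from 5 sigma before the first band to
    5 sigma after the second one.
    """
    lo = -5.0 * sigma
    hi = spacing + 5.0 * sigma
    n = int(round((hi - lo) / step)) + 1
    xs = [lo + i * step for i in range(n)]
    ys = [gaussian(x, 0.0, sigma) + gaussian(x, spacing, sigma) for x in xs]
    peaks = 0
    for i in range(1, n - 1):
        # Rising into this sample and not rising out of it: a local maximum.
        if ys[i] > ys[i - 1] and ys[i] >= ys[i + 1]:
            peaks += 1
    return peaks


print(count_peaks(3.0))  # 2: bands three widths apart are still resolved
print(count_peaks(1.5))  # 1: the two bands have merged into one peak
```

For two equal Gaussians the dip between them disappears once the spacing
drops to about two band widths (2 sigma), so a simple peak picker sees one
band where there are really two. Real bands are neither equal in height nor
perfectly Gaussian, which only makes things worse.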
So things along this line are moving, slowly but still moving
forward. Hang in there; in the future (sooner rather than later, I hope),
some better base calling options will become available. In the
meantime, you could contact ABI and obtain their "tool-kit", an object
library that will allow you to manipulate the data yourself. It's free,
but you have to sign an agreement with them not to distribute your code.
You could also contact Clark and get a copy of his neural net program.
Cheers to one and all.........bruce
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
\ Bruce A. Roe Professor of Chemistry and Biochemistry /
/ Dept. of Chem. and Biochem. INTERNET: BROE at aardvark.ucs.uoknor.edu \
\ University of Oklahoma BITNET: BROE at uokucsvx /
/ 620 Parrington Oval, Rm 208 AT&TNET: 405-325-4912 or 405-325-7610 \
\ Norman, Oklahoma 73019 FAXnet: 405-325-6111 /
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -