* ReadSeq -- 14 Nov 91
*
* Reads and writes nucleic/protein sequences in various
* formats. Data files may have multiple sequences.
Readseq has been updated. There have been several bug corrections
and a number of enhancements (see below). If you are using an earlier
version (or a program that depends on it), I recommend you update to
this release.
Readseq is particularly useful as it detects many sequence formats;
this detection has been improved. Validation tests are included
so you can ensure that the current program has compiled and is working
properly.
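To illustrate what format detection involves, here is a minimal sketch in C
that guesses a format from the first non-blank line of a file. It is not
readseq's actual detection code. The enum, its constants, and the function
name guess_format are hypothetical, and the leading markers it tests (">",
">P1;", "LOCUS", "ID   ", ";") are only the commonly documented opening
conventions of those formats. A real detector would need to look at more
than one line.

```c
#include <string.h>

/* Hypothetical format codes for this sketch; not readseq's own constants. */
enum seq_format {
    FMT_UNKNOWN, FMT_FASTA, FMT_NBRF, FMT_GENBANK, FMT_EMBL, FMT_IG
};

/* Guess a sequence format from the first non-blank line of a file. */
enum seq_format guess_format(const char *line)
{
    /* Ignore leading blanks before testing the marker. */
    while (*line == ' ' || *line == '\t')
        line++;

    if (line[0] == '>') {
        /* NBRF/PIR headers look like ">P1;name"; plain ">" is Pearson/Fasta. */
        if (strlen(line) > 3 && line[3] == ';')
            return FMT_NBRF;
        return FMT_FASTA;
    }
    if (strncmp(line, "LOCUS", 5) == 0)
        return FMT_GENBANK;
    if (strncmp(line, "ID   ", 5) == 0)
        return FMT_EMBL;
    if (line[0] == ';')
        return FMT_IG;          /* IG/Stanford files open with ';' comments */
    return FMT_UNKNOWN;
}
```

A caller would read the first non-blank line of the file and pass it to
guess_format, falling back to an explicit format option when the result is
FMT_UNKNOWN.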
If you use it with either GCG format or Gary Olsen's VMS sequence editor
format, you should definitely update your copy:
:( Previous versions delete bases from every Olsen print format file.
:( Previous versions can duplicate the bases in the last line of a GCG
format file. This will occur when the GCG format file was previously
_written_ by readseq, then read in a second time. GCG format files
written by GCG programs were not subject to this flaw.
This program is available through anonymous ftp, as follows:
my_computer> ftp ftp.bio.indiana.edu (or IP address 129.79.224.25)
username: anonymous
password: my_username at my_computer
ftp> cd molbio/readseq
ftp> get readseq.shar
ftp> bye
readseq.shar is a Unix shell archive of the readseq files.
This file can be edited with any text editor to reconstitute the
original files, for those who do not have a Unix system or an
Unshar program. Read the top of this .shar file for further
instructions.
There are also pre-compiled executables for the following computers:
Silicon Graphics Iris, Sparc (Sun Sparcstation & clones), VMS-Vax,
Macintosh. Use binary ftp to transfer these, except Macintosh. The
Mac version is just the command-line program in a window, not very
handy.
File conversions handled by readseq:
   1. IG/Stanford         8. Pearson/Fasta
   2. GenBank/GB          9. Zuker
   3. NBRF/PIR           10. Olsen (in only)
   4. EMBL               11. Phylip3.4/Phylip (out only)
   5. GCG                12. Phylip3.3/Interleaved (out only)
   6. DNAStrider         13. Plain/Raw
   7. Fitch
Recent changes:
17 Oct 91.
* corrected bug in reading Olsen format
(serious-deletion)
10 Nov 91.
* corrected bug in reading some GCG format files
(serious-last line duplicated)
+ add format name parsing (-fgenbank, -ffasta, ...)
+ Phylip v3.4 output format (== v3.2, sequential)
+ add checksum output to all forms that have document
+ skip mail headers in seq file
+ add pipe for standard input == seq file (with -p)
* fold in parts of MacApp Seq object
* strengthen format detection
* clarify program structure
* remove fixed sequence size limit (now dynamic, limited by available memory)
* check and fold in accumulated bug reports:
* Now uses ANSI C fopen(..,"w") and checks for open failure
* Define -DFIXTOUPPER for non-ANSI C libraries that mishandle
toupper/tolower
= No command-line changes; callers of readseq main() should be okay
- ureadseq.h functions have changed; client programs will need updating.
+ added Unix and VMS Make scripts, including validation tests
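Two of the bug-report items above, checking for fopen failure and guarding
toupper on non-ANSI libraries, can be sketched in C as follows. This is an
illustration under assumptions, not the code in readseq: the function names
safe_toupper and open_output are hypothetical, and how readseq itself
applies -DFIXTOUPPER is not shown here.

```c
#include <ctype.h>
#include <stdio.h>

/* Some pre-ANSI libraries apply toupper() blindly and corrupt characters
   that are not lowercase letters, so test with islower() first. */
int safe_toupper(int c)
{
    unsigned char uc = (unsigned char)c;
    return islower(uc) ? toupper(uc) : c;
}

/* Open an output file for writing. On failure, report the error on
   stderr and return NULL so the caller can stop cleanly. */
FILE *open_output(const char *path)
{
    FILE *fp = fopen(path, "w");
    if (fp == NULL)
        perror(path);
    return fp;
}
```

A caller would test the result of open_output for NULL before writing any
sequence data, and close the file with fclose when done.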
This program may be freely copied and used by anyone.
Developers are encouraged to incorporate parts in their
programs, rather than devise their own private sequence
format.
This should compile and run with any ANSI C compiler.
Please advise me of any bugs, additions or corrections.
-- Don
--
Don Gilbert gilbert at bio.indiana.edu
biocomputing office, biology dept., indiana univ., bloomington, in 47405