Two addenda on the C++ in biology posting:
1) An extensive system of C++ programs for manipulating E. coli genetic
data is described in:
DG Shin, C Lee, J Zhang, KE Rudd, and CM Berg. 1992. Redesigning,
implementing, and integrating Escherichia coli genome software tools
with an object-oriented database. CABIOS 8:227-238.
2) My rudimentary C++ library is available from golgi.harvard.edu
in directory ftp/pub/contrib/kertools. The directory holds a number
of libraries for general sequence and GenBank flat-file
handling, along with programs for:
   - Manipulating data in GenBank flat-file format.
   - Translating DNA to protein, including extraction of ORFs
     to separate sequences and generation of codon usage tables.
   - Detecting coiled-coil regions in proteins by the method
     of Lupas et al. (Science 252:1162-1164).
   - Condensing output from NCBI's BLAST.
   - Detecting repetitive DNA sequence regions by the method of
     Tautz et al. (Nature 322:652-656) [soon: coded but not yet tested].
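
To give a flavor of what the translation tools in the list above involve, here is a minimal sketch in C++. It is not the code from kertools: the function names (translate, codon_usage, find_orfs) are hypothetical, and it assumes the standard genetic code, the forward strand only, and ORFs defined as running from an ATG to the next in-frame stop.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Standard genetic code, indexed with bases ordered T, C, A, G
// (first base most significant). '*' marks a stop codon.
static const std::string kStandardCode =
    "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG";

// Map one base to 0..3 (T or U, C, A, G); return -1 for anything else.
int base_index(char b) {
    switch (std::toupper(static_cast<unsigned char>(b))) {
        case 'T': case 'U': return 0;
        case 'C': return 1;
        case 'A': return 2;
        case 'G': return 3;
        default:  return -1;
    }
}

// Translate the codon at dna[pos..pos+2]; 'X' if any base is ambiguous.
// The caller must ensure pos + 3 <= dna.size().
char translate_codon(const std::string& dna, std::size_t pos) {
    assert(kStandardCode.size() == 64);
    int i = base_index(dna[pos]);
    int j = base_index(dna[pos + 1]);
    int k = base_index(dna[pos + 2]);
    if (i < 0 || j < 0 || k < 0) return 'X';
    return kStandardCode[16 * i + 4 * j + k];
}

// Translate one reading frame (0, 1, or 2); a trailing partial codon
// is ignored and stop codons appear as '*'.
std::string translate(const std::string& dna, std::size_t frame = 0) {
    std::string protein;
    for (std::size_t pos = frame; pos + 3 <= dna.size(); pos += 3)
        protein += translate_codon(dna, pos);
    return protein;
}

// Count how often each codon (upper-cased) occurs in one reading frame.
std::map<std::string, int> codon_usage(const std::string& dna,
                                       std::size_t frame = 0) {
    std::map<std::string, int> counts;
    for (std::size_t pos = frame; pos + 3 <= dna.size(); pos += 3) {
        std::string codon = dna.substr(pos, 3);
        for (char& c : codon)
            c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
        ++counts[codon];
    }
    return counts;
}

// Extract forward-strand ORFs in one frame as protein sequences.
// Each ORF starts at a Met codon and ends before the next in-frame stop;
// an ORF that runs off the end of the sequence without a stop is dropped.
std::vector<std::string> find_orfs(const std::string& dna, std::size_t frame,
                                   std::size_t min_codons) {
    std::vector<std::string> orfs;
    std::string current;
    bool in_orf = false;
    for (std::size_t pos = frame; pos + 3 <= dna.size(); pos += 3) {
        char aa = translate_codon(dna, pos);
        if (!in_orf) {
            if (aa == 'M') { in_orf = true; current = "M"; }
        } else if (aa == '*') {
            if (current.size() >= min_codons) orfs.push_back(current);
            in_orf = false;
            current.clear();
        } else {
            current += aa;
        }
    }
    return orfs;
}
```

With these definitions, translate("ATGGCCAAATAA") gives "MAK*", and find_orfs("CCATGGCCAAATAAGG", 2, 1) returns the single ORF "MAK". A fuller tool would also scan the reverse complement and support alternative genetic codes.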
These tools and the underlying class library are still in a very fluid
state, but if you send me an e-mail note I will keep you informed of
updates and changes. All feedback will be appreciated.
Keith Robison
Harvard University
Program in Biochemistry, Molecular, Cellular, and Developmental Biology
robison at ribo.harvard.edu