Recommended sites: BIO-SOFTWARE, BIO-MATRIX, BIO-WWW, GENBANK-BB and COMPUTATIONAL-BIOLOGY
Two bioinformaticians at the Laboratory for Foodborne Zoonoses, A. Villegas and P. Konczy, have written or adapted Perl code to create a suite of online GenBank flat file (*.gbk) conversion programs. These are available at Online Analysis Tools (http://molbiol-tools.ca/Convert.htm).
1. gbk2ptt - this will convert a GenBank flat file (*.gbk) to an NCBI Protein Table (*.ptt) file. The latter is a tab-delimited table of protein features.
2. gbk2faa - this will convert a GenBank flat file (*.gbk) to a FASTA file including the coding sequences (CDS) translated into amino acids (*.faa).
3. gbk2fna - this will convert a GenBank flat file (*.gbk) to a FASTA file of the whole genome (a single sequence; *.fna).
4. gbk2ffn - this will extract from a GenBank flat file (*.gbk) the DNA sequence of each gene, presented in FASTA format (*.ffn). The program will also extract the features of your gbk file in Excel format (coordinates, strand (+/-), length of gene in nt, gene name, description, and any notes associated with the description). N.B. This program cannot deal with genes whose locations are designated as joined segments, e.g. 125..250 join 500..725.
5. gbk2sqn - this will convert a GenBank flat file (*.gbk) to an NCBI Sequin submission (*.sqn) file. This program was designed to convert data generated in Kodon (Applied Maths, Austin, TX) to Sequin format. N.B. If using the "Bacterial and Plastid" genetic code, please note that the translations of certain CDS will appear as /translation="-XXX....". In Sequin, select the "Bacterial and Plastid" genetic code and translate again so that they appear as /translation="MXXX....".
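
For readers who want to see roughly what conversions such as gbk2fna and gbk2ffn involve, here is a minimal Python sketch. It is not the Perl code behind the online tools, and all function names (read_origin_sequence, to_fasta, parse_location, extract_gene) and the tiny EXAMPLE_GBK record are made up for illustration. It assumes a single-record file, and it assumes that a "complement" applies to the whole location. It also shows one possible way to handle joined locations, written here in the GenBank feature-table form join(1..6,25..30).

```python
import re

# Hypothetical toy record: only the parts this sketch reads are included.
EXAMPLE_GBK = """\
LOCUS       DEMO                      30 bp    DNA     linear   SYN
ORIGIN
        1 atgaaacgca ttagcaccac cattaccacc
//
"""

# Translation table used to complement a DNA string (N stays N).
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def read_origin_sequence(gbk_text):
    """Return the sequence between ORIGIN and // as one uppercase string."""
    parts = []
    in_origin = False
    for line in gbk_text.splitlines():
        if line.startswith("ORIGIN"):
            in_origin = True
            continue
        if line.startswith("//"):
            break
        if in_origin:
            # Sequence lines carry a position number and spaces; keep letters only.
            parts.append(re.sub(r"[^A-Za-z]", "", line))
    return "".join(parts).upper()


def to_fasta(header, sequence, width=60):
    """Format one FASTA record, wrapping the sequence at `width` characters."""
    lines = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return ">" + header + "\n" + "\n".join(lines) + "\n"


def parse_location(location):
    """Return (strand, [(start, end), ...]) using 1-based inclusive coordinates.

    Assumption: "complement" wraps the whole location; partial-end markers
    (< and >) are ignored.
    """
    strand = "-" if "complement" in location else "+"
    segments = [(int(a), int(b))
                for a, b in re.findall(r"<?(\d+)\.\.>?(\d+)", location)]
    return strand, segments


def extract_gene(genome, location):
    """Concatenate the located segments; reverse-complement if on the - strand."""
    strand, segments = parse_location(location)
    dna = "".join(genome[start - 1:end] for start, end in segments)
    if strand == "-":
        dna = dna.translate(COMPLEMENT)[::-1]
    return dna


if __name__ == "__main__":
    genome = read_origin_sequence(EXAMPLE_GBK)
    # Whole-genome FASTA, in the spirit of a *.fna file.
    print(to_fasta("DEMO", genome), end="")
    # Per-gene FASTA, in the spirit of a *.ffn file, including a joined location.
    print(to_fasta("demo_gene join(1..6,25..30)",
                   extract_gene(genome, "join(1..6,25..30)")), end="")
```

A real converter would also need to read the FEATURES table to find each CDS and its qualifiers; the sketch takes the location string as given.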
Andrew M. Kropinski
Research Scientist & Program Lead, Host & Pathogen Determinants, Laboratory for Foodborne Zoonoses
Adjunct Professor, Microbiology & Immunology, Queen's University
Adjunct Professor, Molecular & Cellular Biology, University of Guelph
URL: Online Analysis Tools (http://molbiol-tools.ca)