Software to read whole genomes?

Keith James k.james at bangor.ac.uk
Thu Sep 25 09:53:26 EST 1997


>>>>> "Duncan" == Duncan Clark <duncan at genesys.demon.co.uk> writes:

    Duncan> Hi Folks, I ran into a problem the other day. I wanted to
    Duncan> locate a gene in a newly sequenced genome (in the public
    Duncan> domain but not totally finished such that orfs are
    Duncan> available) that a TIGR online blast search gave homology
    Duncan> to. The blast search gave the location within the genome
    Duncan> but the genome is only available (at present) as one long
    Duncan> ascii file with no numbering.  Have you ever tried to find
    Duncan> base 1,600,000 in a 3,000,000 bases? Is there any software
    Duncan> out there that will renumber it or a macro for MS Word
    Duncan> that will do it?


    Duncan> So any software that will import one long sequence,
    Duncan> preferably free (running under windows or a Mac emulator)
    Duncan> 'cos I only need to do this once in a blue moon.

I'd use a text editor rather than Word. On Windows, something like NTEmacs,
Textpad or Programmers' File Editor (free, shareware and free, respectively).

E.g. Textpad: Search|Goto|[line|column|page|byte|bookmark]

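If you read it in as ASCII, go to byte 1,600,000 (this matches base 1,600,000
only if the file has no header line or line breaks before that point). The
limit on file size is just memory, so you can open a big file. You could also
linewrap at, say, 100 characters and go to line 16,000.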

For a more elegant solution I'd knock out a Perl script.
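A minimal sketch of what such a script might do, written here in Python
rather than Perl purely for illustration. It assumes the genome is a plain
text file, possibly with FASTA-style ">" header lines and line breaks, both of
which are skipped so that coordinates count bases only. The function names,
the 50-base flank and the 60-base line width are all hypothetical choices,
not anything from the original post.

```python
import sys


def read_sequence(path):
    """Return the sequence as one string, skipping '>' header lines and whitespace."""
    parts = []
    with open(path) as handle:
        for line in handle:
            if line.startswith(">"):  # assumed FASTA-style header, not sequence
                continue
            parts.append("".join(line.split()))  # drop newlines and spaces
    return "".join(parts)


def region(seq, position, flank=50):
    """Return (start, bases) around a 1-based position.

    start is the 1-based coordinate of the first base returned.
    """
    if not 1 <= position <= len(seq):
        raise ValueError("position outside sequence")
    start = max(1, position - flank)
    end = min(len(seq), position + flank)
    return start, seq[start - 1:end]


def numbered_lines(seq, width=60):
    """Yield lines of `width` bases, each prefixed with the 1-based
    coordinate of its first base."""
    for offset in range(0, len(seq), width):
        yield f"{offset + 1:>9} {seq[offset:offset + width]}"


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: python number_genome.py genome.txt 1600000
    seq = read_sequence(sys.argv[1])
    start, bases = region(seq, int(sys.argv[2]))
    print(f"bases {start}-{start + len(bases) - 1}:")
    for line in numbered_lines(bases):
        print(line)
```

Note that the numbers printed by numbered_lines are relative to whatever
string it is given. To renumber the whole genome instead, pass the full
sequence to numbered_lines and write the result to a new file.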

-- 
Keith James Ph.D. - k.james at bangor.ac.uk - finger k.james at thunder.bangor.ac.uk 
  Biodegradation Group - School of Biological Sciences - University of Wales
 PGP preferred when your e-mail is not for 3rd parties: finger for Public Key
  Public Key fingerprint = 3A 85 BE 01 A5 40 E2 53  5B 15 27 0D 53 AC 2D E6



More information about the Bio-soft mailing list
