
REQ: PC software to extract sequences from SWISSPROT database in fasta format

Keith James k.james at bangor.ac.uk
Wed Jul 30 12:45:54 EST 1997


Phillip Robinson <phrobins at mail.usyd.edu.au> writes:

> 
> Is there a program that runs on a Windows-based PC that can search the
> swissprot database and extract a sequence into an ascii file?  Using
> either the accession number, name, or part of the sequence.
> 
> I have the protein database in fasta format (29MB) on my PC and
> searching it manually with only text editors is quite tedious, to say
> the least.  When I am not connected to the net, the capability to search
> my own copy would be invaluable.

To extract sequences from a fasta format Swissprot database you can use
the 'fetch' Perl script. It can also build indices on your own sequence
databases. Version 1.3 is at
http://ind5.mrc-lmb.cam.ac.uk/fetch_home.html (note that it is not the
same 'fetch' as the one supplied in the GCG package).
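
For anyone curious what index-based retrieval looks like, here is a
minimal sketch in Python. It is not the 'fetch' script and makes no claim
to work the way 'fetch' does; it only illustrates the general idea of
recording where each entry starts so that it can be pulled out later
without scanning the whole 29MB file. The function names build_index and
fetch, and the made-up sample entries, are mine. It assumes the first
word of each header line is the identifier, optionally made of
'|'-separated parts such as an accession number and an entry name; check
this against the headers in your own copy of the database.

```python
import io


def build_index(handle):
    """Map header identifiers to the byte offset of their '>' line.

    Assumption: the first whitespace-delimited word of each header is the
    identifier. If it contains '|' characters, each part is indexed too,
    so an accession or an entry name can be used as the lookup key.
    """
    index = {}
    while True:
        offset = handle.tell()
        line = handle.readline()
        if not line:
            break
        if not line.startswith(b">"):
            continue
        words = line[1:].split()
        if not words:
            continue
        token = words[0].decode("ascii", "replace")
        for key in [token] + token.split("|"):
            # Keep the first occurrence if a key is repeated (a shared
            # database tag, for example, will point at the first entry).
            if key:
                index.setdefault(key, offset)
    return index


def fetch(handle, index, key):
    """Return the whole record (header and sequence lines) stored under key."""
    handle.seek(index[key])
    lines = [handle.readline()]
    for line in iter(handle.readline, b""):
        if line.startswith(b">"):
            break
        lines.append(line)
    return b"".join(lines).decode("ascii", "replace")


if __name__ == "__main__":
    # Made-up entries standing in for a real database file.
    sample = io.BytesIO(
        b">sp|P00001|ONE_TEST first made-up entry\nMKTAYIAK\nQRQISFVK\n"
        b">sp|P00002|TWO_TEST second made-up entry\nGSHMLEDP\n"
    )
    idx = build_index(sample)
    print(fetch(sample, idx, "P00002"), end="")
```

With a real file you would open it in binary mode, e.g.
open("sprot.fas", "rb") (a hypothetical file name), so that the byte
offsets stay valid on Windows as well as Unix. The index could also be
saved to disk so it only has to be built once.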

You will need to obtain a copy of Perl to use this, but Perl is so much
fun you will probably want it anyway ;) See www.activeware.com for the
most recent build of Perl-win32. You will need either Win95 or NT to use
it.

If your database is in fasta format you can also search it using the
fasta package, which is available for DOS-based PCs. Our PC runs the Unix
version, so I can't tell you whether binaries are available or whether
you will have to compile it yourself. The package supports library
searches, local homology searches and statistical comparisons. I'm afraid
I don't have a URL for the package to hand.
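
If all you need is to find entries containing an exact stretch of
sequence, something much simpler than the fasta package may do. The
sketch below (Python, with function names read_fasta and find_exact that
are my own) joins the wrapped sequence lines of each entry before
searching, which is why it can find a fragment that a text editor misses
when the match runs across a line break. It does exact, case-insensitive
matching only; it is not a homology search and gives no statistics.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA text lines."""
    header, chunks = None, []
    for line in lines:
        line = line.rstrip("\r\n")
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            # Sequence lines are wrapped; collect them to join later.
            chunks.append(line.strip())
    if header is not None:
        yield header, "".join(chunks)


def find_exact(lines, fragment):
    """Return the headers of all entries whose sequence contains fragment."""
    fragment = fragment.upper()
    return [header for header, seq in read_fasta(lines)
            if fragment in seq.upper()]


if __name__ == "__main__":
    # Made-up entries; with a real database, pass an open text file instead.
    sample = [
        ">ONE_TEST first made-up entry\n",
        "MKTAYIAK\n",
        "QRQISFVK\n",
        ">TWO_TEST second made-up entry\n",
        "GSHMLEDP\n",
    ]
    for hit in find_exact(sample, "iakqrq"):
        print(hit)
```

Because an open file is itself an iterable of lines, the database is read
one line at a time and never has to fit in memory all at once.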

-- 
Keith James Ph.D. - k.james at bangor.ac.uk  PGP 2.6.2i  Key ID 469A9FA1
Biodegradation Group                         *Encrypt and Survive*  
School of Biological Sciences    Guvf znl znxr ab frafr vs bar bs gur
University of Wales, Bangor, UK  vasvavgr ahzore bs zbaxrlf vf bssyvar



