question using SRS's web interface

Andrew Dalke adalke at mindspring.com
Tue Jan 7 07:20:27 EST 2003


I'm having some problems understanding how to use the web interface
to SRS.  I just ran through an Entrez tutorial at

http://healthlinks.washington.edu/hsl/liaisons/yarfitz/EntrezTutorial/index.html

and decided to use some of the same examples, to get a feel for
the differences between SRS and Entrez.

I want to get "Nucleotide" record "X61499".  With NCBI's Entrez
this returns a single hit,

X61499 Links
     H.sapiens mRNA for NF-kB subunit
     gi|35041|emb|X61499.1|HSNFKBSU[35041]

How do I do the same on an SRS server?

I went to srs.ebi.ac.uk to find publicly available GenBank servers.
That took me to http://downloads.lionbio.co.uk/publicsrs.html

I chose IUBIO (19808101), at

http://iubio.bio.indiana.edu/srs6bin/cgi-bin/wgetz?-page+LibInfo+-lib+GENBANKRELEASE

Then "TOP PAGE" so I could do a query.

Select GENBANK and GENPEPT.  (Both, to be on the safe side)

Enter "X61499" in the "Quick Search" box, and press the "Quick Search" 
button.

Here are the hits:
   GENBANK:AE015854    <-- because of note "similar to GB:X61498 ..."
   GENBANK:HSCD85703   <-- don't know why there was a match
   GENBANK:HSPA18H7    <-- don't know why there was a match
   GENPEPT:AE015854_2  <-- because of note "similar to GB:X61498,
                                               GB:X61499 ..."
   GENPEPT:X61499_1    <-- contains ACCESSION X61499, so this makes sense

The last of these links back to GI:35042.  (Note that NCBI's link
is for GI:35041)

I follow the hyperlink to GI:35042 and get
    no entries found query: "[genbank-GID:35042]"

I manually changed the URL to point to 35041 and got
   H.sapiens flow-sorted chromosome 6 HindIII fragment, SC6pA18H7.
which is quite wrong.
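
For anyone who wants to repeat the lookup without clicking through the
pages, here is a rough Python sketch of how such a query URL might be
put together.  The base URL and the query text "[genbank-GID:35042]"
come from the pages above; the "-e" option and the "acc" field name are
assumptions about the wgetz interface, and srs_query_url is only an
illustrative helper name.

```python
# Base URL taken from the IUBio link above.
IUBIO_WGETZ = "http://iubio.bio.indiana.edu/srs6bin/cgi-bin/wgetz"


def srs_query_url(base, library, field, value):
    """Build a wgetz URL for a single-field query such as [genbank-GID:35042].

    Assumption: "-e" asks wgetz to display the entries matching the query.
    """
    query = f"[{library}-{field}:{value}]"
    return f"{base}?-e+{query}"


# The GI the hyperlink pointed at, and the one NCBI reports.
print(srs_query_url(IUBIO_WGETZ, "genbank", "GID", "35042"))
print(srs_query_url(IUBIO_WGETZ, "genbank", "GID", "35041"))
# Assumption: the accession field is called "acc" on this server.
print(srs_query_url(IUBIO_WGETZ, "genbank", "acc", "X61499"))
```

Querying by accession rather than by GI would sidestep the GI links
entirely, if the field name is right.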


I did a description search for "mRNA for NF-kB subunit".  It
took a long time (maybe five minutes?) and came back with 4 hits.
Their descriptions were:

* GENBANK:HSPA18H6 H.sapiens flow-sorted chromosome 6 HindIII
      fragment, SC6pA18H6.
* GENBANK:HSPA18H7 H.sapiens flow-sorted chromosome 6 HindIII
      fragment, SC6pA18H7.
* GENBANK:CRH406200 Carlia rhomboidalis partial mitochondrial
      cytb gene for cytochrome b, sample CJS700.
* GENBANK:CRH406201 Carlia rhomboidalis partial mitochondrial
      cytb gene for cytochrome b, sample CJS701.

Note that one of them matched earlier for X61499.  Were I to
guess I would say there's an off-by-one error in the index?

I then tried Pasteur at
   http://srs.pasteur.fr/cgi-bin/srs6/wgetz?-page+top+-id+5Ajh91KEYoX

Again, GenPept and GenBank quick search for "X61499"

   GENBANK:AE015854    <-- note about "similar to"
   GENBANK:BI339638    <-- don't know why this is here
   GENBANK:BI357633    <-- don't know why this is here
   GENBANK:AE015854    <-- note about "similar to"
   GENPEPT:AE015854_2  <-- note about "similar to"
   GENPEPT:X61499_1    <-- obvious
   GENPEPT:AE015854_2  <-- note about "similar to"

Again, I looked at GENPEPT:X61499_1 to see if there's a link
to the nucleotide.  It says

LOCUS       X61499_1 [HSNFKBSU]
DEFINITION  H.sapiens mRNA for NF-kB subunit.
DATE        29-APR-1992
ACCESSION   X61499
ORGANISM    Homo sapiens
             Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
             Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
COMMENT     CDS  164..1411
             /product="NF-kB subunit"
             /protein_id="CAA43716.1"
             /db_xref="GI:35042"
             /db_xref="SWISS-PROT:Q04860"
WEIGHT      45548
LENGTH      415
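
The cross-references in a record like this can be pulled out
mechanically, which makes it easier to compare what different servers
report.  A minimal sketch, assuming the qualifiers always look like
/db_xref="DB:ID" as they do above (RECORD is an abbreviated copy of the
entry, and db_xrefs is just an illustrative name):

```python
import re

# Abbreviated copy of the GENPEPT:X61499_1 entry shown above.
RECORD = '''\
LOCUS       X61499_1 [HSNFKBSU]
ACCESSION   X61499
COMMENT     CDS  164..1411
             /product="NF-kB subunit"
             /protein_id="CAA43716.1"
             /db_xref="GI:35042"
             /db_xref="SWISS-PROT:Q04860"
'''


def db_xrefs(text):
    """Return (database, identifier) pairs from /db_xref="DB:ID" qualifiers."""
    return re.findall(r'/db_xref="([^:"]+):([^"]+)"', text)


for database, identifier in db_xrefs(RECORD):
    print(database, identifier)
```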



Note that the /db_xref="GI:35042" has the GI:35042 hyperlinked, so
I follow it.  The resulting page contains

Entry Name       GENBANK:BI357633
Accession Number BI357633
NID              15052079
Sequence Version 1
Division         EST
Molecule         mRNA
Date             31-JUL-2001
       Description
Source           Drosophila melanogaster (fruit fly)
Organism         Drosophila melanogaster Eukaryota; Metazoa; Arthropoda;
                  Hexapoda; Insecta; Pterygota; Neoptera; Endopterygota;
                  Diptera; Brachycera; Muscomorpha; Ephydroidea;
                  Drosophilidae; Drosophila.
Keywords         EST
Description      RE44159.5prime RE Drosophila melanogaster normalized
                  Embryo pFlc-1 Drosophila melanogaster cDNA clone
                  RE44159 5 similar to CG14819: FBan0014819 located on:
                  X 2A3-2A3;: 05/13/2001, mRNA sequence.
    ...
Length: 593

I don't see how a nucleotide of length 593 can make a protein of
length 415.  I also don't see how the source of a human protein
comes from Drosophila.
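
To spell out the length argument, here is a small Python check.  The
CDS coordinates (164..1411), the protein length (415) and the
nucleotide length (593) are taken from the two pages above; that the
CDS range includes the stop codon is an assumption, and the function
names are only illustrative.

```python
def cds_length(start, end):
    """Length in nucleotides of a CDS given 1-based inclusive coordinates."""
    return end - start + 1


def residues_encoded(nt_length, includes_stop=True):
    """Number of amino acids encoded by nt_length nucleotides.

    Assumption: when includes_stop is True, the last codon is a stop codon
    and contributes no residue.
    """
    codons = nt_length // 3
    return codons - 1 if includes_stop else codons


cds_nt = cds_length(164, 1411)           # CDS range from the GenPept entry
protein_len = residues_encoded(cds_nt)   # should agree with LENGTH 415
linked_nt = 593                          # length of GENBANK:BI357633

print("CDS length:", cds_nt)
print("residues encoded:", protein_len)
print("nucleotides needed for 415 residues:", 415 * 3)
print("linked entry long enough:", linked_nt >= 415 * 3)
print("linked entry reaches position 1411:", linked_nt >= 1411)
```

So the GenPept entry is consistent with itself, but the entry its GI
link leads to is too short to hold the CDS at all.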

What's going on?

					Andrew
					dalke at dalkescientific.com




