protein homology search

Ola Myklebost olam at radium.uio.no
Tue Oct 25 05:15:50 EST 1994

In article <paul_b-201094094729 at clone2.mcb.uconn.edu>,
paul_b at biotek.mcb.uconn.edu (paul betts) wrote:

> Does anyone out there know of a database, or software to search a protein
> sequence database, or any other strategy that will allow a search based on
> protein molecular weight and/or pI?

What about the MOWSE server?

I believe I got this info and more by sending a mail with the line "help"
to mowse at dl.ac.uk (Mowse Server).
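
For anyone who prefers to script the request, here is a minimal sketch of building such a help message with Python's standard library. It assumes the address above de-obfuscates to mowse@dl.ac.uk and that the single line "help" belongs in the message body; the function name and the sender address are purely illustrative, and nothing is actually sent.

```python
from email.message import EmailMessage

# Assumption: "mowse at dl.ac.uk" written out as an ordinary address.
MOWSE_ADDRESS = "mowse@dl.ac.uk"


def build_help_request(sender):
    """Return a mail message whose body is the single line "help"."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = MOWSE_ADDRESS
    # Assumption: the server reads the request from the body, not the subject.
    msg.set_content("help\n")
    return msg
```

The resulting message could be handed to smtplib.SMTP.send_message() on a machine with a working mail relay.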


Precedence: first-class
Date: Wed, 13 Oct 93 10:58:55 +0100
From: mowse at dl.ac.uk (Mowse Server)
Apparently-To: ola.myklebost at labmed.uio.no

                The MOWSE peptide mass database:

                Imperial Cancer Research Fund


                SERC Daresbury Laboratory

                D.J.C. Pappin, P. Hojrup and A.J. Bleasby
                'Rapid Identification of Proteins by
                Peptide-Mass Fingerprinting'.
                Current Biology (1993), vol 3, 327-332.

                InterNet server version:

Table of Contents:

        [1] Background.

        [2] Construction of the MOWSE database.

                [2.1] Source database.
                [2.2] Calculation of Molecular weight fragments.

        [3] Running database searches via e_mail.

        [4] Example of mail query format.
        [5] Results listing.

        [6] Database structure.

                [6.1] MOWSE database structure.
                [6.2] The MW primary fragment molecular weight file.
                [6.3] The MDX file OWL entry index.
                [6.4] The SMW whole sequence molecular weight file.
                [6.5] Program Requirements.
                [6.6] MOWSE Scoring scheme.
                [6.7] Simulation studies.

        [7] General references.

[1] Background:

        Determination of molecular weight has always been an 
important aspect of the characterization of biological molecules. 
Protein molecular weight data, historically obtained by SDS gel 
electrophoresis or gel permeation chromatography, has been used to 
establish purity, detect post-translational modification (such as 
phosphorylation or glycosylation) and aid identification. Until 
just over a decade ago, mass spectrometric techniques were typically 
limited to relatively small biomolecules, as proteins and nucleic 
acids were too large and fragile to withstand the harsh physical 
processes required to induce ionization. This began to change with 
the development of 'soft' ionization methods such as fast atom 
bombardment (FAB)[1], electrospray ionisation (ESI) [2,3] and 
matrix-assisted laser desorption ionisation (MALDI)[4], which can 
effect the efficient transition of large macromolecules from 
solution or solid crystalline state into intact, naked molecular 
ions in the gas phase. As an added bonus to the protein chemist, 
sample handling requirements are minimal and the amounts required 
for MS analysis are in the same range as, or less than, those 
required by existing analytical methods.
        As well as providing accurate mass information for intact 
proteins, such techniques have been routinely used to produce 
accurate peptide molecular weight 'fingerprint' maps following 
digestion of known proteins with specific proteases. Such maps 
have been used to confirm protein sequences (allowing the 
detection of errors of translation, mutation or insertion), 
characterise post-translational modifications or processing events 
and assign disulphide bonds [5,6]. 
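
        As an illustration of how such a fingerprint map can be predicted from a known sequence, the following is a minimal sketch, not the MOWSE implementation. It assumes trypsin as the protease (cleavage after K or R, except before P), average residue masses, unmodified peptides and no missed cleavages; all function names are hypothetical.

```python
# Average residue masses in Daltons (standard amino acids, unmodified).
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167,
    "V": 99.1326, "T": 101.1051, "C": 103.1388, "L": 113.1594,
    "I": 113.1594, "N": 114.1038, "D": 115.0886, "Q": 128.1307,
    "K": 128.1741, "E": 129.1155, "M": 131.1926, "H": 137.1411,
    "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.0153  # one water is added per free peptide (N-terminal H, C-terminal OH)


def digest_trypsin(sequence):
    """Split a sequence after every K or R that is not followed by P."""
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal peptide
    return peptides


def peptide_mass(peptide):
    """Average molecular weight of an unmodified peptide."""
    return sum(AVERAGE_RESIDUE_MASS[r] for r in peptide) + WATER


def fingerprint(sequence):
    """Sorted list of predicted peptide masses for a complete digest."""
    return sorted(peptide_mass(p) for p in digest_trypsin(sequence))
```

Calling fingerprint() on a protein sequence gives the list of masses one would compare against an experimental peptide map; a real comparison would also have to allow for partial digestion and modified residues.
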
        Less well appreciated, however, is the extent to which such 
peptide mass information can provide a 'fingerprint' signature 
sufficiently discriminating to allow for the unique and rapid 
identification of unknown sample proteins, independent of other 
analytical methods such as protein sequence analysis. 
        The following text describes the construction and use 
of the MOWSE peptide mass database (for MOlecular Weight SEarch) 
at the SERC Daresbury Laboratory. Practical experience has shown 
that sample proteins can be uniquely identified using as few as 3-4 
experimentally determined peptide masses when screened against a 
fragment database derived from over 50,000 proteins. Experimental 
errors of a few Daltons are tolerated by the scoring algorithms, 
permitting the use of inexpensive time-of-flight mass 
spectrometers. As with other types of physical data, such as amino 
acid composition or linear sequence, peptide masses can clearly 
provide a set of determinants sufficiently unique to identify or 
match unknown sample proteins. Peptide mass fingerprints can prove 
as discriminating as linear peptide sequence, but can be obtained 
in a fraction of the time using less material. In many cases, this 
allows for a rapid identification of a sample protein before 
committing to protein sequence analysis. Fragment masses also 
provide structural information, at the protein level, fully 
complementary to large-scale DNA sequencing or mapping projects 
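
        The idea of screening a few measured masses against a fragment database, with a tolerance of a few Daltons, can be sketched as follows. This is a deliberately simplified hit count and not the MOWSE scoring scheme; the toy database, the protein names, the masses and the 2 Da default tolerance are all made up for illustration.

```python
# Hypothetical fragment database: protein name -> predicted peptide masses (Da).
TOY_DATABASE = {
    "protein_A": [842.5, 1045.6, 1479.8, 2211.1],
    "protein_B": [842.5, 1300.2, 1765.9],
    "protein_C": [905.4, 1179.6, 2465.2],
}

# Hypothetical experimentally determined peptide masses (Da).
OBSERVED = [843.1, 1480.9, 2210.0]


def count_matches(observed, fragments, tolerance=2.0):
    """Count observed masses lying within +/- tolerance of any fragment."""
    return sum(
        1
        for mass in observed
        if any(abs(mass - fragment) <= tolerance for fragment in fragments)
    )


def rank_candidates(observed, database, tolerance=2.0):
    """Return (name, hits) pairs, best match first, ties broken by name."""
    scores = {
        name: count_matches(observed, fragments, tolerance)
        for name, fragments in database.items()
    }
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))
```

With the toy data above, rank_candidates(OBSERVED, TOY_DATABASE) places protein_A first, since all three observed masses fall within 2 Da of one of its fragments. A plain hit count of this kind ignores how common a given fragment mass is across the whole database, so it should be read only as an illustration of tolerance matching.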

Ola Myklebost                   Email  ola.myklebost at labmed.uio.no
Dept of Tumor Biology
Inst for Cancer Research        Tel +47-2293-4299
The Norwegian Radium Hospital   Fax +47-2252-2421
N-0310 OSLO, Norway
