In article <1991Jul1.185311.8785 at jax.org> mrk at jax.org (Michael Kosowsky) writes:
>>How do GENBANK and NCBI's GENINFO symbolize uncertain base pairs?
>>I've so far learned of three incompatible systems.
>For example, to represent "A or G", Microgenie
>uses 'P', REBASE use 'R', and DNA Inspector uses something
>>I naively hope to get away with implementing just one.
There is a well-established international standard for representing
ambiguities, adopted by the Nomenclature Committee of the International
Union of Biochemistry (Cornish-Bowden, A., Nucl. Acids Res. 13, 3021-3030
(1985)). The symbols are used as follows:

Symbol  Meaning               |  Symbol  Meaning
------  --------------------  |  ------  ----------------
G       Guanine               |  K       G or T
A       Adenine               |  S       G or C
C       Cytosine              |  W       A or T
T       Thymine               |  H       A or C or T
U       Uracil                |  B       G or T or C
R       Purine (A or G)       |  V       G or C or A
Y       Pyrimidine (C or T)   |  D       G or T or A
M       A or C                |  N       G or A or T or C

This standard is followed by GenBank, and I would assume that NCBI does so
as well.
The Microgenie use of P for purine probably goes back to its original
incarnation in the dim past as the 'Korn/Queen' program, before the standard
was adopted. However, that's no excuse for not keeping up with the times.
Fortunately for you, there is now a well-accepted standard. If you want to
do it right, stick with the standard. As for those software manufacturers
who wish to complicate things by refusing to make the trivial changes
necessary to comply with internationally agreed-upon standards, well,
that's their problem.
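
To give a sense of how small such a change can be, here is a hedged
Python sketch that rewrites the one nonstandard symbol mentioned in this
thread (P for purine) to the standard R. The names LEGACY_TO_IUB and
to_iub are hypothetical, and the mapping is deliberately incomplete: any
other nonstandard symbols a given program uses would have to be added
from that program's own documentation.

```python
# Hypothetical, incomplete mapping: only the purine symbol is known from this thread.
LEGACY_TO_IUB = str.maketrans({"P": "R", "p": "r"})


def to_iub(seq):
    """Rewrite the legacy purine symbol P as the standard symbol R, preserving case."""
    return seq.translate(LEGACY_TO_IUB)
```

For example, to_iub("GGPTCC") returns "GGRTCC"; all other characters pass
through unchanged.
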
Brian Fristensky |
Department of Plant Science | Freedom begins when you tell Mrs. Grundy
University of Manitoba | to go fly a kite.
Winnipeg, MB R3T 2N2 CANADA |
frist at ccu.umanitoba.ca |
Office phone: 204-474-6085 | - Robert A. Heinlein
FAX: 204-275-5128 |