We are trying to locate protein sequence data sets derived from PIR, Swiss-Prot,
etc., from which all obviously homologous sequences have been removed. That
is, only one representative cytochrome, hemoglobin, etc. is included. Data
sets from which less obviously redundant sequences have also been eliminated
would be even better.
We are planning to use such a data set as the basis for searching for unique
conserved polypeptide motifs. Any help would be greatly appreciated.
We do not look forward to the task of generating this set from scratch.
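
To make the request concrete, here is a rough sketch of the kind of filtering
we have in mind. It is only an illustration under stated assumptions: shared
k-residue substrings (a Jaccard score) stand in for a real homology test such
as pairwise alignment, and the function names, the value of k, and the 0.5
threshold are all hypothetical choices, not an established method.

```python
def kmers(seq, k=3):
    """Return the set of overlapping k-residue substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def nonredundant(records, k=3, threshold=0.5):
    """Greedy filter over (name, sequence) pairs.

    A sequence is kept only if its k-mer similarity to every sequence
    already kept is below the threshold. The threshold is an assumed,
    crude proxy for "obviously homologous".
    """
    kept = []  # (name, k-mer set) for each representative chosen so far
    # Visit longest sequences first so each family is represented
    # by its longest member.
    for name, seq in sorted(records, key=lambda r: len(r[1]), reverse=True):
        profile = kmers(seq, k)
        if all(jaccard(profile, other) < threshold for _, other in kept):
            kept.append((name, profile))
    return [name for name, _ in kept]
```

Every candidate is compared against every representative already kept, so the
cost grows roughly with the square of the database size. That is a large part
of why we would rather not do this ourselves.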
Thank you,
Ken Fasman
Genome Data Base
Welch Medical Library
The Johns Hopkins School of Medicine
1830 E. Monument St. 3rd Floor
Baltimore MD 21205
(410) 955-9705
ken at welchgate.welch.jhu.edu