THIS IS A REPOSTING
Over the past two years, the primary thrust of the GenBank project has
been to improve the timeliness and completeness of the database.
Endeavours such as the interaction with journals, sequence submission
policies, and new submission software tools have brought us to the
point where we now receive 80% of our data in electronic form directly
from the scientific community and where our average turnaround is now
measured in weeks rather than months. This progress in soliciting
direct and automated data submission, and in the RDBMS conversion, now
frees us to deal in greater detail with one of the most important
components of the database, the biology represented within the
annotation. In addition to our work to enrich the quality of the
annotation using our own annotation resources, we now wish to seek the
direct involvement of the members of the scientific community.
The following announcement represents the beginning of a program to help
us enhance the quality and integrity of the data represented in the
GenBank database.
This announcement will only be distributed via e-mail for the pilot
phase; however, recipients are free to redistribute this notice. This
notice is being posted to both the GENBANK-BB and BIONEWS bulletin
boards and we apologize in advance for any redundancy across the two
newsgroups.
Paul Gilna
GenBank Biology Domain Leader
Los Alamos National Laboratory
Los Alamos, NM 87545
pgil%histone at lanl.gov
Tel: (505) 665-2177
Fax: (505) 665-3493
GENBANK CURATOR PROGRAM
GenBank announces the pilot phase of the GenBank Curator
Program. We are seeking suggestions for work to be done on the
database in the form of informal proposals. Authors of
successful proposals will travel to Los Alamos and work with
the annotation or computation staff to carry out their proposed
project.
Although GenBank has had some curators in the past, the advent
of the GenBank RDBMS restructuring and its attendant interface,
the Annotator's Workbench, allows us to implement an expanded
program using a unified, intuitive annotation tool that
provides the capability of remote use.
The current program seeks to identify domains within the
database that are in need of overhaul either at the sequence or
at the annotation level. In addition, as part of ongoing
development of the Sequence Validation Suite (SVS), a suite of
software programs that will be used to check the validity of
submitted sequence and annotation data, we have expanded the
program to include software development associated with the
SVS.
We are looking to the readership of the molecular
biology-oriented Bulletin Boards for proposals for curation on
GenBank; if you are familiar with a domain or family of
sequences represented within the database and with the existing
annotation, and have some ideas on how the annotation could be
improved (for example to reflect similarities in features
across entries, to improve existing nomenclature, or to point
out sequence merges), or on software that could be developed to
aid data integrity and validation, then we would like to hear
from you.
In this pilot study, about six proposals will be selected to be
implemented before the end of September, 1990. Based on the
results of the study, we hope to take on about 30 more
projects over the course of the next two years. The capability
exists for continued interaction with the data bank staff on a
consultant basis, using remote access facilities to the
annotation software. The work will be carried out on site at
Los Alamos. Travel (within the US for the pilot study), hotel
costs, and subsistence will be covered. Project proposals will
be reviewed by GenBank and NIH staff. Proposals should be
submitted to Dr. Paul Gilna via e-mail (pgil%histone at lanl.gov)
and should cover the following topics:
o Detailed description of the work proposed and of its scope,
citing examples from the database where relevant
o Justification of work in terms of benefit to community
and data bank
o Estimation of time needed to conduct work at LANL
o Abbreviated CV including representative publications.