In article <1994Nov12.071107.19651 at comp.bioz.unibas.ch>, Reinhard Doelz writes:
>As there is a considerable amount of yeast sequences already published
>in the sequence database (with functions etc. assigned), what would be
>a more customer-oriented guess on what fraction of sequences is available
>to the community in the EMBL database?
In article <3a2uas$42r at nntp.Stanford.EDU>, Mike Cherry writes:
>We estimate that 52% of the S. cerevisiae genome is present in
>GenBank/EMBL. This estimate includes some analysis that we have done
>on yeast sequences in GenBank. We built a non-redundant set of
>sequences, or a consensus sequence contig, out of the GenBank
>sequences. When the consensus sequences are combined with the genomic
>sequencing results we obtain a number of 6.4Mbp. The amount of
>publicly available sequences expected from the genomic sequencing
>projects should increase by 4Mbp in 1995. Thus it is advised that you
>regularly search the GenBank/EMBL on the chance that your region of
>interest has been made public.
I recently carried out a survey for the EC in which the amount
of database redundancy was estimated. For release 39 of EMBL,
the redundancy for Saccharomyces cerevisiae in FUN.DAT was
estimated at 34.8%. If we apply this value to release 40
(12th September 1994) of the EMBL database, with 9.8Mb of sequence,
we get 9.8Mb x (1 - 0.348) = 6.4Mb of non-redundant sequence,
in agreement with the above figure.
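
For anyone who wants to repeat the arithmetic with a later release, here is a minimal sketch in Python. The function name is only illustrative, and it assumes (as I did above) that the redundancy measured on release 39 carries over unchanged to release 40.

```python
# Figures quoted in this post.
TOTAL_MB = 9.8       # sequence in EMBL release 40, in Mb
REDUNDANCY = 0.348   # redundancy estimated on release 39 (assumed to carry over)


def non_redundant_mb(total_mb, redundancy):
    """Return the non-redundant amount of sequence, in Mb.

    redundancy is the fraction (0..1) of the total that duplicates
    sequence already present elsewhere in the database.
    """
    if not 0.0 <= redundancy <= 1.0:
        raise ValueError("redundancy must be a fraction between 0 and 1")
    return total_mb * (1.0 - redundancy)


estimate = non_redundant_mb(TOTAL_MB, REDUNDANCY)
print(f"{estimate:.1f} Mb non-redundant")
```

Substituting the total and the redundancy fraction for another release gives the corresponding estimate, subject to the same assumption.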
Tomas Flores PhD          Tel:   +44-(0)1223 494414
The EBI Data Library      Fax:   +44-(0)1223 494400
Hinxton Hall              Email: flores at ebi.ac.uk
Cambridge CB10 1RQ
UK                        "FLAMES >> /dev/null"