Dear Sequin users,
We have recently released a new version of Sequin, the sequence
submission/editing tool from NCBI, for all platforms.
The current version of Sequin is now 2.70.
Please refer to the Sequin home page at:
http://www.ncbi.nlm.nih.gov/Sequin/
for the latest developments, new questions in the Frequently Asked
Questions section, and the most recent version of the help
documentation.
Major changes for Sequin version 2.70
-----------------------------------------
Both of the major changes in this Sequin version will be useful for
genome centers annotating large records.
. This version is capable of editing complete bacterial chromosomes
or large eukaryotic chromosomal segments in a single record. Because
the generation of reports (i.e., GenBank and Graphic view) and
validation are both much faster, chromosomes no longer have to be split
up into separate overlapping records.
. Sequin can now annotate features by reading in a tab-delimited
table. The table specifies the location and type of each feature, and
Sequin processes the feature intervals and translates any CDSs. The
table is read into the record viewer (after the sequence has been
imported) using the File-->Open menu. The table must follow a defined
format. The first line starts with >Feature, a space, and then the
Sequence ID of the sequence you are annotating. In the example below,
eIF4E is the Sequence ID. The rest of the table is composed of five
tab-separated columns: start, stop, feature key, qualifier key, and
qualifier value. The first row of each feature has start, stop, and
feature key. Additional intervals of the same feature have just start
and stop. The qualifiers follow on lines starting with three tabs,
which leave the first three columns empty.
For example, a table which looks like this:
>Features eIF4E
80	2881	gene
			gene	eIF4E
201	224	CDS
1550	1920
1986	2085
2317	2404
2466	2629
			product	eukaryotic initiation factor 4E-II
1402	1458	CDS
1550	1920
1986	2085
2317	2404
2466	2629
			product	eukaryotic initiation factor 4E-I
			note	encoded by two messenger RNAs
80	224	mRNA
1550	1920
1986	2085
2317	2404
2466	2881
			product	eukaryotic initiation factor 4E-II
80	224	mRNA
892	1458
1550	1920
1986	2085
2317	2404
2466	2881
			product	eukaryotic initiation factor 4E-I
80	224	mRNA
1129	1458
1550	1920
1986	2085
2317	2404
2466	2881
			product	eukaryotic initiation factor 4E-I
will result in a GenBank flatfile which contains this:
     mRNA            join(80..224,1129..1458,1550..1920,1986..2085,2317..2404,
                     2466..2881)
                     /gene="eIF4E"
                     /product="eukaryotic initiation factor 4E-I"
     mRNA            join(80..224,892..1458,1550..1920,1986..2085,2317..2404,
                     2466..2881)
                     /gene="eIF4E"
                     /product="eukaryotic initiation factor 4E-I"
     mRNA            join(80..224,1550..1920,1986..2085,2317..2404,2466..2881)
                     /gene="eIF4E"
                     /product="eukaryotic initiation factor 4E-II"
     gene            80..2881
                     /gene="eIF4E"
     CDS             join(201..224,1550..1920,1986..2085,2317..2404,2466..2629)
                     /gene="eIF4E"
                     /codon_start=1
                     /product="eukaryotic initiation factor 4E-II"
                     /translation="MVVLETEKTSAPSTEQGRPEPPTSAAAPAEAKDVKPKEDPQETG
                     EPAGNTATTTAPAGDDAVRTEHLYKHPLMNVWTLWYLENDRSKSWEDMQNEITSFDTV
                     EDFWSLYNHIKPPSEIKLGSDYSLFKKNIRPMWEDAANKQGGRWVITLNKSSKTDLDN
                     LWLDVLLCLIGEAFDHSDQICGAVINIRGKSNKISIWTADGNNEEAALEIGHKLRDAL
                     RLGRNNSLQYQLHKDTMVKQGSNVKSIYTL"
     CDS             join(1402..1458,1550..1920,1986..2085,2317..2404,
                     2466..2629)
                     /gene="eIF4E"
                     /note="encoded by two messenger RNAs"
                     /codon_start=1
                     /product="eukaryotic initiation factor 4E-I"
                     /translation="MQSDFHRMKNFANPKSMFKTSAPSTEQGRPEPPTSAAAPAEAKD
                     VKPKEDPQETGEPAGNTATTTAPAGDDAVRTEHLYKHPLMNVWTLWYLENDRSKSWED
                     MQNEITSFDTVEDFWSLYNHIKPPSEIKLGSDYSLFKKNIRPMWEDAANKQGGRWVIT
                     LNKSSKTDLDNLWLDVLLCLIGEAFDHSDQICGAVINIRGKSNKISIWTADGNNEEAA
                     LEIGHKLRDALRLGRNNSLQYQLHKDTMVKQGSNVKSIYTL"
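For those who generate or check these tables with a script, the
five-column layout described above can be read with a few lines of
Python. This is only a rough sketch and not Sequin's own code: the
names Feature and parse_feature_table are made up for illustration,
and the header test accepts any line beginning with >Feature (so the
>Features spelling used in the example is also accepted), which is an
assumption rather than documented behavior.

```python
from dataclasses import dataclass, field


@dataclass
class Feature:
    """One feature from the table (hypothetical helper type)."""
    key: str
    intervals: list = field(default_factory=list)   # (start, stop) as written
    qualifiers: list = field(default_factory=list)  # (qualifier key, value)


def parse_feature_table(text):
    """Parse a tab-delimited feature table into (sequence_id, features)."""
    seq_id = None
    features = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        if line.startswith(">Feature"):
            # Header line: ">Feature", a space, then the Sequence ID.
            parts = line.split(None, 1)
            seq_id = parts[1].strip() if len(parts) > 1 else None
            continue
        cols = line.split("\t")
        cols += [""] * (5 - len(cols))  # pad short rows to five columns
        start, stop, key, qual_key, qual_val = (c.strip() for c in cols[:5])
        if start and stop:
            if key:
                # First row of a feature: start, stop, and feature key.
                features.append(Feature(key))
            if not features:
                raise ValueError("interval before any feature key: %r" % line)
            features[-1].intervals.append((int(start), int(stop)))
        elif qual_key:
            # Qualifier row: three empty columns, then key and value.
            if not features:
                raise ValueError("qualifier before any feature: %r" % line)
            features[-1].qualifiers.append((qual_key, qual_val))
    return seq_id, features


# A shortened version of the eIF4E table, with tabs written as \t.
SAMPLE = (
    ">Features eIF4E\n"
    "80\t2881\tgene\n"
    "\t\t\tgene\teIF4E\n"
    "201\t224\tCDS\n"
    "1550\t1920\n"
    "\t\t\tproduct\teukaryotic initiation factor 4E-II\n"
)

seq_id, features = parse_feature_table(SAMPLE)
for feat in features:
    print(seq_id, feat.key, feat.intervals, feat.qualifiers)
```

The sketch keeps intervals exactly as written, so that strand
information carried by the order of start and stop is not lost.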
Note that if the gene feature spans the intervals of the CDS and mRNA
features for that gene, you don't need to include gene "qualifiers" in
those features, since they will be picked up by overlap.
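The idea of picking up the gene by overlap can be illustrated with a
small Python sketch. The exact rule Sequin applies is not spelled out
in this announcement, so the version below simply assumes that a
feature belongs to a gene when the gene's extent contains the
feature's extent, and it ignores strand. The function names extent
and gene_for are hypothetical.

```python
def extent(intervals):
    """Return the (lowest, highest) position covered by a list of intervals."""
    positions = [pos for pair in intervals for pos in pair]
    return min(positions), max(positions)


def gene_for(feature_intervals, genes):
    """Return the name of the first gene whose extent contains the feature.

    genes is a list of (gene_name, intervals) pairs. Containment of the
    overall extent is an assumed rule, used here only for illustration.
    """
    lo, hi = extent(feature_intervals)
    for name, gene_intervals in genes:
        gene_lo, gene_hi = extent(gene_intervals)
        if gene_lo <= lo and hi <= gene_hi:
            return name
    return None


genes = [("eIF4E", [(80, 2881)])]
cds = [(201, 224), (1550, 1920), (1986, 2085), (2317, 2404), (2466, 2629)]
print(gene_for(cds, genes))             # eIF4E
print(gene_for([(3000, 3100)], genes))  # None
```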
Features which are on the complementary strand are indicated by reversing
the interval locations. For example, the table:
>Features dna2
2710	2639	tRNA
			note	codon recognized: GAA
			product	tRNA-Glu
			anticodon	(pos:2675..2677, aa:Glu)
will result in a GenBank flatfile containing:
     tRNA            complement(2639..2710)
                     /note="codon recognized: GAA"
                     /product="tRNA-Glu"
                     /anticodon=(pos:2675..2677, aa:Glu)
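The mapping from reversed intervals to a complement() location can be
sketched in Python as follows. This is an illustration and not
Sequin's implementation: the function name location_string is made
up, and the sketch assumes that all intervals of a feature lie on the
same strand, treating the feature as complementary if any interval
has its start after its stop.

```python
def location_string(intervals):
    """Build a GenBank-style location from (start, stop) pairs as written."""
    # A start greater than its stop marks the complementary strand.
    minus = any(start > stop for start, stop in intervals)
    spans = [(min(a, b), max(a, b)) for a, b in intervals]
    if minus:
        # Inside complement(), intervals are listed in ascending order.
        spans.sort()
    parts = [str(a) if a == b else "%d..%d" % (a, b) for a, b in spans]
    loc = parts[0] if len(parts) == 1 else "join(%s)" % ",".join(parts)
    return "complement(%s)" % loc if minus else loc


print(location_string([(2710, 2639)]))              # complement(2639..2710)
print(location_string([(201, 224), (1550, 1920)]))  # join(201..224,1550..1920)
```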
If the formatting of these tables is not reproduced correctly in your email,
you can also view them at:
http://www.ncbi.nlm.nih.gov/Sequin/log.html
regards to all,
francis, for the sequin development team.
--
| B.F. Francis Ouellette
||francis at ncbi.nlm.nih.gov
New Address: francis at cmmt.ubc.ca