
Splice Site, Codon and Amino Acid Usage in Arabidopsis

Mike Cherry CHERRY at FRODO.MGH.HARVARD.EDU
Mon Sep 13 15:00:56 EST 1993


Below is an update of the splice site consensus, codon usage and amino
acid usage tables for Arabidopsis. These tables were produced by the ACEDB
version 1-10 software using the Arabidopsis DNA sequences contained
within AAtDB release 1-5. AAtDB 1-5 will be released later this week.

Table of Splice Site Consensus Sites

  5' consensus

                      |  --- intron --->
A  160  216  345   63 |   6    9  387  322  130  144  194
C  148  185   86   22 |   3    4   22   88   49   89   94
G   95   92   54  433 | 550    9   62   29  281   60   65
T  166   76   84   51 |  10  547   98  130  109  276  216


  3' consensus

                --- intron --->      |
A  108  115   93  186   35  544   14 | 149  125  159  154
C   78   76   68   33  345    7    3 |  63   82   77  101
G  110   93   51  208   16   12  549 | 290  107  142  173
T  273  285  357  142  173    6    3 |  67  255  191  141


Splice Site Table Legend

Splice site consensus table produced using the ACEDB 1-10 software on
the 2494 DNA sequences contained within AAtDB 1-5. A total of 569
introns were analysed. The table simply provides a tally of the
nucleotides that appear at the exon-intron and intron-exon boundaries.
The ACEDB software is not predicting the location of the splice sites;
this information is taken from the GenBank feature table entries
submitted by the original authors of the sequences.
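
As an illustration of how such a tally can be read, the following Python
sketch takes the counts from the 5' table above, checks that every
column sums to the 569 introns, and reports the most frequent base at
each position. The names FIVE_PRIME, column_totals and consensus are
made up for this example; this is not the ACEDB code, only one possible
way to summarise the published counts.

```python
# Counts copied from the 5' consensus table: the first 4 columns are
# exon positions, the remaining 7 are intron positions.
FIVE_PRIME = {
    "A": [160, 216, 345, 63, 6, 9, 387, 322, 130, 144, 194],
    "C": [148, 185, 86, 22, 3, 4, 22, 88, 49, 89, 94],
    "G": [95, 92, 54, 433, 550, 9, 62, 29, 281, 60, 65],
    "T": [166, 76, 84, 51, 10, 547, 98, 130, 109, 276, 216],
}


def columns(table):
    """Yield one (A, C, G, T) tuple of counts per position."""
    return zip(*(table[base] for base in "ACGT"))


def column_totals(table):
    """Total number of observations at each position."""
    return [sum(col) for col in columns(table)]


def consensus(table):
    """Return (base, percent) for the most frequent base at each position."""
    result = []
    for col in columns(table):
        count, base = max(zip(col, "ACGT"))
        result.append((base, round(100.0 * count / sum(col), 1)))
    return result


if __name__ == "__main__":
    print(column_totals(FIVE_PRIME))
    print("".join(base for base, _ in consensus(FIVE_PRIME)))
```

The same functions can be applied to the 3' table by entering its four
rows in the same way.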


Arabidopsis Codon Usage Table (percent of the codons for each amino acid)


             U                C                A                G

      UUU  Phe  44.0   UCU  Ser  27.4   UAU  Tyr  41.1   UGU  Cys  51.5
      UUC  Phe  55.9   UCC  Ser  14.2   UAC  Tyr  58.8   UGC  Cys  48.4
   U
      UUA  Leu  10.5   UCA  Ser  18.3   UAA  ***  34.5   UGA  ***  45.5
      UUG  Leu  22.6   UCG  Ser  10.0   UAG  ***  19.8   UGG  Trp 100.0


      CUU  Leu  26.8   CCU  Pro  35.8   CAU  His  52.1   CGU  Arg  19.3
      CUC  Leu  19.5   CCC  Pro  13.1   CAC  His  47.8   CGC  Arg   6.9
   C
      CUA  Leu   9.6   CCA  Pro  34.9   CAA  Gln  50.1   CGA  Arg   9.5
      CUG  Leu  10.8   CCG  Pro  16.1   CAG  Gln  49.8   CGG  Arg   7.9


      AUU  Ile  40.6   ACU  Thr  35.0   AAU  Asn  43.2   AGU  Ser  15.2
      AUC  Ile  41.2   ACC  Thr  25.2   AAC  Asn  56.7   AGC  Ser  14.6
   A
      AUA  Ile  18.1   ACA  Thr  26.2   AAA  Lys  41.8   AGA  Arg  31.9
      AUG  Met 100.0   ACG  Thr  13.4   AAG  Lys  58.1   AGG  Arg  24.2


      GUU  Val  39.5   GCU  Ala  46.1   GAU  Asp  62.3   GGU  Gly  35.7
      GUC  Val  22.5   GCC  Ala  18.8   GAC  Asp  37.6   GGC  Gly  13.4
   G
      GUA  Val  11.4   GCA  Ala  22.7   GAA  Glu  46.1   GGA  Gly  38.4
      GUG  Val  26.4   GCG  Ala  12.2   GAG  Glu  53.8   GGG  Gly  12.3


Amino Acid Usage Table (percent of all amino acids)

 Hydrophilic 50.6

    Basic 13.8
        Lys  6.4  Arg  5.3  His  2.1

    Acidic 12.0
        Asp  5.4  Glu  6.6

    Neutral 24.8
        Asn  3.9  Gln  3.5  Cys  1.6  Met  2.6  Ser  7.7  Thr  5.4

 Hydrophobic 49.0

    Aliphatic 40.8
        Gly  8.0  Ala  7.3  Val  6.7  Pro  4.8  Leu  8.7  Ile  5.3

    Aromatic  8.1
        Phe  4.1  Tyr  2.8  Trp  1.2


Codon and Amino Acid Usage Table Legend

Codon usage and amino acid usage as determined by ACEDB 1-10 using the
Arabidopsis sequences contained within AAtDB 1-5. The ACEDB software
is not predicting the location of the exons used to determine the
codons; rather, this information is taken from the GenBank feature
table entries submitted by the original authors of the sequences.
A total of 515 coding sequences were analysed, containing 166899 codons
with 737 stops, 11 ambiguous codons, and 49 incomplete codons.
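
For readers who want to produce a comparable table from their own
coding sequences, here is a minimal Python sketch. It assumes the
standard genetic code and computes, for each codon, its percentage
among the codons for the same amino acid (or stop). The function name
codon_usage and the way ambiguous and incomplete codons are classified
here are assumptions made for the example; how ACEDB itself classifies
them is not described above.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codon_usage(cds):
    """Return (usage, ambiguous, incomplete) for one in-frame coding sequence.

    usage maps each observed codon to its percentage among the codons
    for the same amino acid. Triplets containing a non-ACGT symbol are
    counted as ambiguous, and a trailing fragment shorter than three
    bases as incomplete (assumed definitions).
    """
    cds = cds.upper().replace("U", "T")
    counts = Counter()
    ambiguous = incomplete = 0
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        if len(codon) < 3:
            incomplete += 1
        elif codon in CODON_TABLE:
            counts[codon] += 1
        else:
            ambiguous += 1

    # Total codons observed for each amino acid (or stop).
    per_aa = defaultdict(int)
    for codon, n in counts.items():
        per_aa[CODON_TABLE[codon]] += n

    usage = {
        codon: 100.0 * n / per_aa[CODON_TABLE[codon]]
        for codon, n in counts.items()
    }
    return usage, ambiguous, incomplete


if __name__ == "__main__":
    # Toy sequence, not taken from AAtDB.
    usage, ambiguous, incomplete = codon_usage("ATGTTTTTCTTTTGGTAA")
    for codon in sorted(usage):
        print(codon, CODON_TABLE[codon], round(usage[codon], 1))
```

To tabulate a whole collection, the per-codon counts would be summed
over all coding sequences before the percentages are taken, rather
than averaging the percentages from each sequence.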


