
[Computational-biology] Genedoc

Ivan Erill via comp-bio%40net.bio.net (by erill from umbc.edu)
Fri May 10 12:28:40 EST 2013


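Many bioinformatics programs (e.g. CLUSTALW) use only the first 10 characters
of a FASTA header line as the sequence name.
So
>379026087 AP012340:177550-178025 (+)
and
>379026087 AP012340:179382-179864 (+)
have the same name as far as the program is concerned, which would explain
the "duplicate sequence name" error. Giving each header a distinct leading
identifier should avoid the clash.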
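
For anyone wanting to check a file before importing it, here is a minimal Python sketch of the idea. It is not taken from GeneDoc or CLUSTALW: the function names (short_name, duplicate_names, make_unique), the "s0001"-style prefix, and the assumption that the name is the first whitespace-delimited token cut to 10 characters are all illustrative choices. Lines that are not headers (sequence rows and the "= score = ..." separator lines) are passed through unchanged.

```python
from collections import Counter


def short_name(header, width=10):
    """Name as a truncating program might see it (assumption):
    the first token after '>', cut to `width` characters."""
    parts = header[1:].split()
    return parts[0][:width] if parts else ""


def duplicate_names(lines, width=10):
    """Return the truncated names that occur on more than one header line."""
    counts = Counter(short_name(line, width) for line in lines if line.startswith(">"))
    return sorted(name for name, n in counts.items() if n > 1)


def make_unique(lines):
    """Prefix every header with a running identifier (s0001, s0002, ...)
    and keep the original header text after it."""
    out = []
    n = 0
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            n += 1
            out.append(f">s{n:04d} {line[1:].strip()}")
        else:
            out.append(line)  # sequence data and separator lines stay as they are
    return out


# Headers taken from the file quoted in this thread; the sequences are shortened.
example = [
    ">379026087 AP012340:177550-178025 (+)",
    "TGTGCCGGTGTGAGGTCCGCATACG",
    ">31742509 BX248333:180260-180741 (+)",
    "TGCGCGGGCGTGAGGTCCAAATACT",
    ">379026087 AP012340:179382-179864 (+)",
    "CTGCGCGGGCGTGAGGTCCAAATAC",
]

print(duplicate_names(example))
print("\n".join(make_unique(example)))
```

Renaming every header, rather than only the clashing ones, keeps the sketch short and guarantees uniqueness; the original header text is kept after the new identifier so the coordinates are not lost.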

Ivan Erill


On Fri, May 10, 2013 at 7:57 AM, Tijesunimi Odebode <tijesunimi from yahoo.com> wrote:

> Good morning,
>
> I am a graduate student and a first-time user of GeneDoc. I tried
> importing a .mfa (multi-FASTA) alignment file into GeneDoc, but I got an
> error message saying "duplicate sequence name found." How do I fix this
> problem? I would really appreciate any help. Thanks in advance. Here is
> part of the file:
>
> TGGATCTGCGGGAGCGTGAGCGGTTGCGCGGACAGCGCCTGCAGGTCGGCCGTGAAGTCA
> AGGATGCCTTCTTGTGTTCCGGCGGCCAGGGCATCGGCGATGACCTGAGGCGGCACGTTC
> GGCCACAGCCCGAACGGCGTTCGCACATCGGCGTAGCTCGTCGAGTAGCCGTAGTTCGGG
> TCGCCGTAGCCCAGGTTGACGATCACC
> = score = 59667  type = DM  L1 = 4392353  L2 = 4345492  AL1 = 59664  AL2 =
> 59636  P_ID = 99.89
> >379026087 AP012340:177550-178025 (+)
> TGTGCCGGTGTGAGGTCCGCATACGTGGTGTGCACCGTGAGTATCCCGAATACTGCGTTG
> ATATCGGACAGGACATTGAGTGGATACCGCGGGAAGTCGGCGAAACCGTCGTACTCGAGG
> GTGTAGGTCGTCGTCGGATAGGGATTGTCCGGGG---TCGCCCCGTAGAACGGTAGGCCG
> AGGGTGGTGACATTCAGACCGGGTATGCGCGCAAGTATCCCGCCATTGGGATTCATCTCG
> TTGCCGATCAAGATGAAATTGAGCTGGCTGGGGCTGGGAGCGTTGGGACCCAGCGAGATG
> AGGTGCTGCATTTCCAGGGACGCGATGACGGCGCTCTGCGAATAGCCGAACACGGTGACG
> TGGTTTCCGGCGTTGATTTGCTC--CCA-AATCGCGCCGTCGAGAATCTGTAGGCCCAAC
> TGCACCGAGGTTTGGAAGGGCAGGGATTTGACGCCGGTGATCGGATATAGCTCTTCGGGC
> GT
> >31742509 BX248333:180260-180741 (+)
> TGCGCGGGCGTGAGGTCCAAATACTTGGTGTGTACGAATGTGATGCCTGCAACCGCGTTG
> AGGTCGGAAATGAAGTTGAGCGGGTATCGCGAGAAGTCGGCGAACCCGTCGTACTCGAGC
> GTGTAGATGGCCGTCGGATAGATCGTGTCCGAGGGCGTTGCGCCATAGAACGTCAGGTCC
> AGAGTCGGAAGCGTCAGATCCGGGAACCGCGCGAGCATACCGCCATTGGGGTTCATTTCA
> TTGCCGACAAGCACGAAATTGAGGTCGCTCGCCGAAGGTGCGGCCCCGCCCATCGCCGTG
> AACCTCTGCATCTCCAGCGACGCGATTATGGCGCTTTGCGACCAGCCGAAAACGGTGACC
> GCGTTTCCGGTGGTCGCGAGCTCTACCATGATCGCGTCGTGCAAGATGGTCAAGCCCTCT
> TCCACTGACGTGTTGAGGACCAAACTTCTGACACCGGTGAGTGGGTACAACTCTTCGGGT
> GT
> = score = 482  type = M2  L1 = 4392353  L2 = 4345492  AL1 = 476  AL2 = 482
>  P_ID = 69.25
> >379026087 AP012340:179382-179864 (+)
> CTGCGCGGGCGTGAGGTCCAAATACTTGGTGTGTACGAATGTGATGCCTGCAACCGCGTT
> GAGGTCGGAAATGAAGTTGAGCGGGTATCGCGAGAAGTCGGCGAACCCGTCGTACTCGAG
> CGTGTAGATGGCCGTCGGATAGATCGTGTCCGAGGGCGTTGCGCCATAGAACGTCAGGTC
> CAGAGTCGGAAGCGTCAGATCCGGGAACCGCGCGAGCATACCGCCATTGGGGTTCATTTC
> ATTGCCGACAAGCACGAAATTGAGGTCGCTCGCCGAAGGTGCGGCCCCGCCCATCGCCGT
> GAACCTCTGCATCTCCAGCGACGCGATTATGGCGCTTTGCGACCAGCCGAAAACGGTGAC
> CGCGTTTCCGGTGGTCGCGAGCTCTACCATGATCGCGTCGTGCAAGATGGTCAAGCCCTC
> TTCCACTGACGTGTTGAGGACCAAACTTCTGACACCGGTGAGTGGGTACAACTCTTCGGG
> TGT
> >31742509 BX248333:178426-178902 (+)
> CTGTGCCGGTGTGAGGTCCGCATACGTGGTGTGCACCGTGAGTATCCCGAATACTGCGTT
> GATATCGGACAGGACATTGAGTGGATACCGCGGGAAGTCGGCGAAACCGTCGTACTCGAG
> GGTGTAGGTCGTCGTCGGATAGGGATTGTCCGGGG---TCGCCCCGTAGAACGGTAGGCC
> GAGGGTGGTGACATTCAGACCGGGTATGCGCGCAAGTATCCCGCCATTGGGATTCATCTC
> GTTGCCGATCAAGATGAAATTGAGCTGGCTGGGGCTGGGAGCGTTGGGACCCAGCGAGAT
> GAGGTGCTGCATTTCCAGGGACGCGATGACGGCGCTCTGCGAATAGCCGAACACGGTGAC
>
>
> Tijesunimi Odebode

