The BLAST low-complexity filters seg and xnu replace amino acids in the
query sequence with X characters. When a filtered sequence is searched
(BLASTP) against a database, it no longer hits itself with 100% identity,
because positions aligned to X are counted as mismatches. Is there a way
of overcoming this, so that filtered sequences still show 100% identity
to themselves?
For example, yeast TUP1 filtered with seg hits itself with only 81% identity:
SWISS|TUP1_YEAST|P16649
          Length = 713

 Score = 2930 (1318.2 bits), Expect = 0.0, P = 0.0
 Identities = 581/713 (81%), Positives = 581/713 (81%)

Query:   1 MTASVSNTQNKLNELLDAIRQEFLQVSQEANTYRLQNQKDYDFKMNQQLAEMQQIRNTVY 60
           MTASVSNTQNKLNELLDAIRQEFLQVSQEANTYRLQNQKDYDFKMNQQLAEMQQIRNTVY
Sbjct:   1 MTASVSNTQNKLNELLDAIRQEFLQVSQEANTYRLQNQKDYDFKMNQQLAEMQQIRNTVY 60

Query:  61 ELELTHRKMKDAYEEEIKHLKLGLEQRDHQIXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 120
           ELELTHRKMKDAYEEEIKHLKLGLEQRDHQI
Sbjct:  61 ELELTHRKMKDAYEEEIKHLKLGLEQRDHQIASLTVQQQRQQQQQQQVQQHLQQQQQQLA 120

Query: 121 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXFPVQASRPNLVGSQLPTTTLPVVSSNA 180
                                            FPVQASRPNLVGSQLPTTTLPVVSSNA
Sbjct: 121 AASASVPVAQQPPATTSATATPAANTTTGSPSAFPVQASRPNLVGSQLPTTTLPVVSSNA 180

etc. etc.
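
To make the arithmetic concrete, here is a minimal Python sketch of how the identity figure changes depending on whether masked columns are counted. It is only an illustration and not part of BLAST, seg or xnu: the function name percent_identity and its skip_masked argument are made up for this post, and it assumes a gap-free alignment given as two equal-length strings. The data are residues 61-120 from the alignment above.

```python
def percent_identity(query, sbjct, skip_masked=False):
    """Percent identity between two aligned, gap-free, equal-length strings.

    Hypothetical helper for illustration only. With skip_masked=True,
    columns where the query holds the mask character X are left out of
    both the match count and the column count.
    """
    if len(query) != len(sbjct):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for q, s in zip(query, sbjct):
        if skip_masked and q == "X":
            continue  # masked column: ignore it entirely
        compared += 1
        if q == s:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Residues 61-120 of TUP1 as they appear in the alignment above.
sbjct = "ELELTHRKMKDAYEEEIKHLKLGLEQRDHQIASLTVQQQRQQQQQQQVQQHLQQQQQQLA"
unmasked = "ELELTHRKMKDAYEEEIKHLKLGLEQRDHQI"
# The rest of the query block was replaced with X by the filter.
query = unmasked + "X" * (len(sbjct) - len(unmasked))

# Counting X columns as mismatches (what the report above does).
print(percent_identity(query, sbjct))

# Ignoring X columns: the unmasked part matches itself completely.
print(percent_identity(query, sbjct, skip_masked=True))
```

Recomputing identity this way would have to be done as a post-processing step on the parsed alignments; whether BLAST itself can be told to do something equivalent is exactly what I am asking.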
Any ideas?
--
Ken Wolfe
Department of Genetics
University of Dublin      e-mail: khwolfe at tcd.ie
Trinity College           phone:  +353-1-608-1253
Dublin 2, Ireland         FAX:    +353-1-679-8558