You might want to watch out for the other IUPAC nucleotide symbols
(especially N).
See http://www.chem.qmw.ac.uk/iubmb/misc/naseq.html for a full list.
The (ahem) C snippet below handles these cases when generating
the reverse complement of a FASTA format file read from stdin.
Letters with no complement in the table come out as '-'.
HTH,
Guy.
/* START */
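#include <stdio.h>

/* Complement table indexed by character code: '\n' maps to itself so
   line breaks survive, and '-' marks codes with no complement. */
static const char comp[] =
"----------\n--------------------------------------------------"
"----TVGH--CD--M-KN---YSAABW-R-------tvgh--cd--m-kn---ysaabw-r-";

/* Recurse to the end of the record, then print complements while
   unwinding, which reverses the sequence. Recursion depth equals the
   record length, so a very long sequence may exhaust the stack. */
static int rev(int ch){
    if((ch == '>') || (ch == EOF))
        return ungetc(ch, stdin);  /* leave the next header for main() */
    rev(getchar());
    /* Codes above 'z' are past the end of the table, so print '-'. */
    putchar((ch < (int)(sizeof comp - 1)) ? comp[ch] : '-');
    return ch;
}

int main(void){
    int ch;
    do {
        /* Copy the header line without its newline; rev() prints the
           record's final newline first, which ends the header. */
        while(((ch = getchar()) != '\n') && (ch != EOF))
            putchar(ch);
    } while(rev(ch) != EOF);
    return 0;
}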
/* STOP */
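
P.S. The snippet above recurses once per character, so a very long
sequence could run out of stack. A minimal sketch of a non-recursive
alternative follows, assuming the record has already been read into a
buffer with the line breaks removed. The names complement() and
revcomp() are made up for this illustration and are not from any
library. It uses the same IUPAC pairings as the table above and also
emits '-' for anything it does not recognise.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: complement of one IUPAC nucleotide symbol.
   Unrecognised characters come back as '-'. */
static char complement(char c) {
    static const char from[] = "ACGTUMRWSYKVHDBNacgtumrwsykvhdbn";
    static const char to[]   = "TGCAAKYWSRMBDHVNtgcaakywsrmbdhvn";
    /* strchr would match the terminating NUL, so exclude it. */
    const char *p = (c != '\0') ? strchr(from, c) : NULL;
    return p ? to[p - from] : '-';
}

/* Hypothetical helper: reverse-complement seq[0..n-1] in place by
   walking inwards from both ends and swapping complemented symbols. */
static void revcomp(char *seq, size_t n) {
    size_t i = 0, j = n;
    while (i < j) {
        char left = complement(seq[i]);
        --j;
        seq[i] = complement(seq[j]);
        seq[j] = left;   /* when i == j this stores the middle symbol */
        ++i;
    }
}
```

For example, calling revcomp() on a five-character buffer holding
ACGTN leaves NACGT in it. The cost is that the whole record has to fit
in memory, but the stack use no longer depends on the record length.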
--
----------------------------------------------------------------------
Guy St.C. Slater, Tel: (+44) 1223 49 45 65
Human Genome Mapping Project Resource Centre, Fax: (+44) 1223 49 45 12
Wellcome Trust Genome Campus, mailto:gslater at hgmp.mrc.ac.uk
Hinxton, Cambridge, CB10 1SB. http://www.hgmp.mrc.ac.uk/~gslater/
----------------------------------------------------------------------