Does anyone have any recent experience of selecting large numbers of IMAGE
(cDNA) clones that map to specific chromosomal regions? By 'region', I'm
thinking in terms of several cytogenetic bands, rather than more specific
loci. I'd obviously like good coverage of these regions, while minimising
needless duplication and spurious matches, but I guess this will remain a
non-trivial task until gene numbers and positions are more precisely
defined!
Some possible approaches:
1) Select clones based on sequence similarities to known genes in
ENSEMBL, perhaps by BLASTing EST databases with ENSEMBL cDNAs from the
region of interest as queries. Eliminate redundant clones by UniGene
clustering? Or maybe BLAST directly against the UniGene seq.uniq FASTA
database and select IMAGE clones from 'positive' clusters?
2) Use radiation hybrid mapping and e-PCR data in GeneMap99/MapGene.
3) Use reference intervals and FISH mapping data in UniGene.
4) Use BLAT alignment data in all_est.txt, cross-referenced with the
cytoband file from UCSC Golden Path. Again, I would have to cluster by
some means to reduce duplication, and set parameters carefully to reduce
false positives.
etc.
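
A minimal sketch of how approach 4 might look as a local SQL query, here
using SQLite through Python's sqlite3 module purely for illustration. The
table and column names (cytoband, est_align, est_clone), the functions
build_db and select_clones, and the min_frac threshold are all
hypothetical; the real UCSC and UniGene dumps have more columns and would
need their own loaders. It assumes 0-based, half-open coordinates, keeps
only ESTs that align to a single genomic location, and picks one
arbitrary IMAGE clone per UniGene cluster to limit duplication.

```python
import sqlite3

# Hypothetical, simplified schema; not the actual UCSC/UniGene layouts.
SCHEMA = """
CREATE TABLE cytoband  (chrom TEXT, band_start INTEGER, band_end INTEGER, band TEXT);
CREATE TABLE est_align (est_acc TEXT, chrom TEXT, t_start INTEGER, t_end INTEGER,
                        matches INTEGER, q_size INTEGER);
CREATE TABLE est_clone (est_acc TEXT, image_id TEXT, cluster_id TEXT);
"""

QUERY = """
SELECT c.cluster_id, MIN(c.image_id) AS image_id
FROM est_align a
JOIN cytoband b
  ON a.chrom = b.chrom
 AND a.t_start < b.band_end      -- interval overlap, half-open coordinates
 AND a.t_end   > b.band_start
JOIN est_clone c ON c.est_acc = a.est_acc
WHERE b.chrom = ?
  AND b.band IN ({placeholders})
  AND 1.0 * a.matches / a.q_size >= ?          -- fraction of EST bases matched
  AND a.est_acc IN (SELECT est_acc FROM est_align
                    GROUP BY est_acc HAVING COUNT(*) = 1)  -- unique hits only
GROUP BY c.cluster_id            -- one representative clone per cluster
ORDER BY c.cluster_id
"""


def build_db(bands, aligns, clones):
    """Load the three row lists into an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO cytoband VALUES (?,?,?,?)", bands)
    conn.executemany("INSERT INTO est_align VALUES (?,?,?,?,?,?)", aligns)
    conn.executemany("INSERT INTO est_clone VALUES (?,?,?)", clones)
    return conn


def select_clones(conn, chrom, band_names, min_frac=0.9):
    """Return (cluster_id, image_id) pairs for ESTs overlapping the given bands."""
    placeholders = ",".join("?" * len(band_names))
    sql = QUERY.format(placeholders=placeholders)
    params = [chrom, *band_names, min_frac]
    return conn.execute(sql, params).fetchall()
```

Calling select_clones(conn, "chr7", ["q21.1", "q21.2"]) would return one
row per cluster with at least one qualifying EST in those bands. The
MIN(image_id) choice is only a placeholder; a real selection would
probably prefer, say, the longest or sequence-verified clone.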
I'd much rather do this by setting up local databases that can be queried
with SQL or Perl scripts (as we're already doing with UniGene, GeneMap &
some of the UCSC data) than by clicking on websites several thousand
times! But all suggestions are welcome.
Richard.