Does anybody know of a collection of protein secondary structure data,
perhaps obtained from DSSP, that is stored in a format resembling a
sequence database (and perhaps can be searched by sequence-comparison or
database-searching programs)? The original DSSP files are a little lengthy.
I am thinking of something like:
Example.dat length=40 check=murks
1 CCCHHHHHHH HHTTTSSSSS SSSSSSSSSS SSSSSCCCCC
where C means random coil
H means alpha helix
S means beta sheet and
T means beta turn or something similar.
Perhaps it would also be useful to encode whether the residues are exposed
or within the protein core.
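
To make the proposed layout concrete, here is a minimal Python sketch that writes an entry in the format shown above. Everything in it is an assumption for illustration, not an existing standard: the table that folds DSSP's state letters into C/H/S/T (DSSP's own "S" is a bend, so it is folded into C here), the convention of lowercase letters for buried residues, the arbitrary accessibility cutoff of 20, and the function names. The check= value is simply passed through, not computed.

```python
# Assumed reduction of DSSP state letters to the four letters proposed above.
# DSSP uses H, G, I (helices), E, B (strand/bridge), T (turn), S (bend)
# and blank (no assignment). DSSP's "S" is a bend, not a sheet.
DSSP_TO_FOUR = {
    "H": "H", "G": "H", "I": "H",
    "E": "S", "B": "S",
    "T": "T",
    "S": "C", " ": "C", "-": "C",
}


def reduce_states(dssp_states):
    """Map a string of DSSP state letters to the C/H/S/T alphabet."""
    return "".join(DSSP_TO_FOUR.get(s, "C") for s in dssp_states)


def mark_buried(states, accessibility, cutoff=20):
    """Lowercase residues whose accessibility is below the (arbitrary) cutoff."""
    return "".join(
        s.lower() if acc < cutoff else s
        for s, acc in zip(states, accessibility)
    )


def format_entry(name, states, check="murks", block=10, per_line=4):
    """Write a header line, then numbered lines of space-separated blocks."""
    lines = ["%s length=%d check=%s" % (name, len(states), check)]
    width = block * per_line
    for start in range(0, len(states), width):
        chunk = states[start:start + width]
        blocks = " ".join(
            chunk[i:i + block] for i in range(0, len(chunk), block)
        )
        lines.append("%d %s" % (start + 1, blocks))
    return "\n".join(lines)
```

Calling format_entry("Example.dat", states) on a 40-residue state string reproduces the two-line example above. Because the output is plain letters in fixed blocks, an ordinary sequence-comparison program could in principle read it, provided it accepts this alphabet.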
It should be fairly easy to generate this from a collection of the original
DSSP data files, and I would have done it myself if we had the DSSP files
on disk. (I know that DSSP files are available from NETSERV at EMBL.)
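
For anyone who does have the files on disk, the extraction step might look like the Python sketch below. It assumes the usual fixed-column DSSP layout: a header line beginning "  #  RESIDUE", the amino acid in column 14, the structure letter in column 17, the solvent accessibility in columns 35-38, and "!" marking a chain break. Please check these positions against your own files before relying on them; the function name parse_dssp is hypothetical.

```python
def parse_dssp(text):
    """Return (sequence, states, accessibility) from the text of one DSSP file.

    Column positions are assumptions based on the common DSSP layout
    (0-based: amino acid at 13, structure at 16, accessibility at 34:38).
    """
    sequence, states, accessibility = [], [], []
    in_residues = False
    for line in text.splitlines():
        if line.startswith("  #  RESIDUE"):
            # Everything after this header line is per-residue data.
            in_residues = True
            continue
        if not in_residues or len(line) < 38:
            continue
        amino_acid = line[13]
        if amino_acid == "!":
            # Chain break marker, not a real residue.
            continue
        sequence.append(amino_acid)
        states.append(line[16])
        accessibility.append(int(line[34:38]))
    return "".join(sequence), "".join(states), accessibility
```

The returned state string still uses DSSP's own letters (with a blank for unassigned residues), so it would need to be reduced to the smaller alphabet and written out in the flat format as a second step.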