Hi Austin,

Thanks again for replying. The k-means algorithm should be a snap. But
how do I convert the proteins, which are in the format
"UPSP_SLDJK_HUMAN_P12182", to vectors that can be handled by the
mathematical algorithm (i.e., what is the "distance" between two
proteins)? Is there already a program that does this? (I understand
there's something on the NCBI's website.)

As far as I can tell, there is no before-and-after. This is a simple
measurement of proteins in people with and without a disease. I do not
know which is which; I simply have a list of proteins.

Mathematical background: graph theory, combinatorics, probability,
theory of computation, linear algebra, multivariable calculus,
differential equations. Also a lot of programming.

I can't tell you how much I appreciate your help. My friend is swamped
and trying to make a deadline. I'm trying to give him a hand, but I'm
not up to snuff, it seems :)

Best,
Rex

Austin P. So (Hae-Jin) wrote:
> Hi Rex...
>
> If it is a matter of "relationship" (in the sense that a set of proteins
> behave, i.e. go up or down, in the same way over a given treatment), then
> any k-means algorithm will do the trick. I'm pretty sure this would be a
> no-brainer for you. I'm actually surprised they don't give it a go by
> themselves...
>
> Anything else (like your second point...though I'm curious about the
> experimental design) you'd have to give more details, say:
>
> col1=proteinID
> col2=expt1
> col3=expt2
> etc...
>
> BTW...If you don't want to do it, then don't. Technically, if the person
> is doing a PhD, then he/she should know how to do it themselves anyway
> and not just hand it off...
>
> Austin
>
> P.S. what is your math background? I should have probably asked that
> first...
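
For reference, here is a minimal sketch of the k-means suggestion quoted above. It assumes the table layout from the quoted message (col1=proteinID, col2=expt1, col3=expt2, ...), so that each protein's vector is simply its row of measurements; the ID string itself is only a label. The "distance" used is the Euclidean distance between z-scored rows, which is one possible choice among several, made here so that proteins that go up or down together end up close regardless of absolute level. The IDs (PROT_A, ...), the values, and the farthest-first seeding are all assumptions for illustration, not anything taken from the actual data or from a particular program.

```python
import math

# Hypothetical toy table in the quoted layout: proteinID -> [expt1, expt2, ...].
# The IDs and values are invented for illustration only.
table = {
    "PROT_A": [1.0, 2.0, 3.0, 4.0],       # rising
    "PROT_B": [10.0, 22.0, 29.0, 41.0],   # rising, different scale
    "PROT_C": [4.0, 3.0, 2.0, 1.0],       # falling
    "PROT_D": [40.0, 33.0, 19.0, 12.0],   # falling, different scale
}


def zscore(profile):
    """Standardize one row so that shape matters, not absolute level."""
    mean = sum(profile) / len(profile)
    var = sum((x - mean) ** 2 for x in profile) / len(profile)
    std = math.sqrt(var) or 1.0  # a flat row maps to all zeros
    return [(x - mean) / std for x in profile]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def farthest_first(vectors, k):
    """Deterministic seeding (an assumption): start from the first vector,
    then repeatedly add the vector farthest from the centroids chosen so far."""
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        nxt = max(vectors, key=lambda v: min(euclidean(v, c) for c in centroids))
        centroids.append(list(nxt))
    return centroids


def kmeans(vectors, k, max_iter=100):
    """Plain Lloyd-style k-means; returns one cluster label per vector."""
    centroids = farthest_first(vectors, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: nearest centroid for every vector.
        new_labels = [
            min(range(k), key=lambda c: euclidean(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


ids = list(table)
vectors = [zscore(table[p]) for p in ids]
clusters = dict(zip(ids, kmeans(vectors, k=2)))
print(clusters)
```

With this toy table the two rising rows share one label and the two falling rows share the other. Without per-experiment measurements (a bare list of IDs), there is no row to use as a vector, and some other notion of distance would be needed.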