Oct 09 01:03

K-means & correlation distance

Here's a useful observation related to the use of K-means together with the Pearson correlation distance (© Alex).

The standard K-means update step, where you update the cluster centers by taking the means of the corresponding points, is technically not appropriate for the correlation distance d(x, y) = 1 - corr(x, y). The proper step would be to take the sum of the normalized points as the new cluster center:

$$c = \sum_i \frac{x_i}{|x_i|}$$
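To make the update concrete, here is a minimal NumPy sketch. The function names `correlation_center` and `correlation_distance` are made up for illustration. One assumption is mine and not spelled out in the formula: each point is mean-centered before normalizing, so that the correlation turns into a plain cosine (if your points are already centered, that step changes nothing).

```python
import numpy as np

def correlation_distance(x, y):
    """d(x, y) = 1 - corr(x, y), using the Pearson correlation."""
    return 1.0 - np.corrcoef(x, y)[0, 1]

def correlation_center(points):
    """Cluster center for the correlation distance: sum of normalized points.

    Assumption: each point (row) is mean-centered first. A constant point
    has zero norm after centering and would cause a division by zero here.
    """
    X = np.asarray(points, dtype=float)
    X = X - X.mean(axis=1, keepdims=True)             # center each point
    X = X / np.linalg.norm(X, axis=1, keepdims=True)  # scale to unit length
    return X.sum(axis=0)                              # sum of normalized points

# Compare against the usual mean update on a toy cluster.
cluster = np.array([[1.0, 2.0, 3.0, 4.0],
                    [100.0, 220.0, 290.0, 410.0],
                    [4.0, 1.0, 3.0, 2.0]])

def total_distance(center):
    return sum(correlation_distance(x, center) for x in cluster)

total_for_sum_of_normalized = total_distance(correlation_center(cluster))
total_for_plain_mean = total_distance(cluster.mean(axis=0))
```

The plain mean is dominated by the point with the largest magnitude, whereas the normalized sum gives every point the same weight, so its total correlation distance to the cluster can never be larger.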

Oct 08 20:37

Float precision

Float precision is a very subtle issue. It crops up so rarely that many people (me included) forget about it completely until it shows itself somewhere once again.
Indeed, most of the time it is not a problem at all that floating-point computations are not perfectly precise, and no one cares about the small additive noise this produces, as long as you remember to avoid exact comparisons between floats.
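A tiny Python illustration of why exact comparisons are the thing to avoid. This is only a sketch of the usual workaround (comparing with a tolerance via the standard library's `math.isclose`), not a universal recipe; the right tolerance depends on your computation.

```python
import math

a = 0.1 + 0.2

# The sum carries a tiny rounding error, so the exact comparison fails.
exact_match = (a == 0.3)

# Comparing with a tolerance (default relative tolerance 1e-9) succeeds.
close_match = math.isclose(a, 0.3)

print(exact_match)  # False
print(close_match)  # True
```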

Oct 10 21:52

Scilab + PVM

I've recently found out that you can use PVM within Scilab to parallelize programs easily. I liked the experience so much that I couldn't resist sharing it with you.