Murphy Choy

A slight detraction: Back to cluster analysis

In Uncategorized on June 7, 2011 at 8:04 am

Just received a phone call about cluster analysis and how to go about conducting it. To my horror, it was another case of cluster analysis being misused without a proper understanding of the project's objective. To rub salt into the wound, the project basically involves dumping in a whole bunch of variables, whether or not they relate to the problem. What a horrible approach!

Does it mean that the analyst is at fault? Nope. Hardly so. The main problem is that the project does not have a proper scope of work. The entire project is just to explore the data. Well, data exploration without a concrete objective is a recipe for misleading results.

Sometimes, proper direction is very useful. After all, more information is not always more useful than less. The huge amount of information available nowadays makes it difficult for anyone to decide which of it is most appropriate or relevant.

A proper cluster analysis is not just about dumping in half a million variables and 3 million observations. It is about deriving useful variables from the data and running the cluster analysis on those to extract the most important information.
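To make the point concrete, here is a rough sketch on made-up data, not the project from the phone call. It assumes two hypothetical customer segments that differ on only two variables, and compares a bare-bones k-means run on those two variables against a run on the same data with thirty unrelated variables dumped in. The data, the `kmeans` and `agreement` helpers, and the choice of k-means itself are all illustrative assumptions.

```python
import random


def kmeans(rows, k=2, iterations=50):
    """Plain Lloyd's k-means with deterministic farthest-point initialisation."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # Start from the first row, then keep adding the row farthest from the centroids so far.
    centroids = [list(rows[0])]
    while len(centroids) < k:
        farthest = max(rows, key=lambda r: min(dist2(r, c) for c in centroids))
        centroids.append(list(farthest))

    labels = [0] * len(rows)
    for _ in range(iterations):
        # Assign every row to its nearest centroid.
        labels = [min(range(k), key=lambda j: dist2(r, centroids[j])) for r in rows]
        # Move each centroid to the mean of its members.
        for j in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == j]
            if members:  # keep the old centroid if a cluster goes empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def agreement(labels, truth):
    """Share of rows whose cluster matches the known segment (k=2, label swap allowed)."""
    match = sum(l == t for l, t in zip(labels, truth)) / len(truth)
    return max(match, 1 - match)


rng = random.Random(0)
truth = [0] * 100 + [1] * 100  # hypothetical "true" segments

# Two variables that genuinely separate the segments.
relevant = [[rng.gauss(10 * t, 1), rng.gauss(10 * t, 1)] for t in truth]

# The same rows with thirty unrelated, widely scaled variables added.
everything = [row + [rng.uniform(0, 100) for _ in range(30)] for row in relevant]

acc_relevant = agreement(kmeans(relevant), truth)
acc_all = agreement(kmeans(everything), truth)
print(f"relevant variables only: {acc_relevant:.2f}")
print(f"all variables:           {acc_all:.2f}")
```

An agreement close to 0.5 means the clusters are no better than a coin toss at recovering the segments. In a real project there is no `truth` column to check against, which is exactly why the variables have to be chosen with the objective in mind before any clustering is run.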


