Murphy Choy

Understanding the business, not just random modeling

In Uncategorized on May 29, 2011 at 10:46 am

I was recently in a small discussion with friends about the latest data mining competitions, which feature some pretty out-of-this-world modeling. While it is very interesting for researchers to try out different techniques, I think it is a waste of effort and something that might damage the reputation of data miners and analysts.

One of the main issues in data mining is ensuring that the final result can be understood in the context of the business. The power of neural networks for non-linear modeling is widely recognized. Their absence in many areas is commonly attributed to the difficulty of understanding the model, which makes it hard to relate the different layers of the model back to business ideas. This inability to connect models to the business causes a lot of trouble for BI wannabes.

The other end of the problem is modeling that attempts to explain an outcome using factors that obviously cannot explain it. Such modeling is basically nonsense and extremely tenuous as a way of establishing the truth. The problem is compounded by massive datasets, which can make nearly any model statistically significant without making it useful.
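To make the point about massive datasets concrete, here is a minimal sketch in Python. It is not taken from any competition or real dataset: the correlation of 0.01 is a made-up value, the function name is my own, and the p-value uses the normal approximation to the t statistic for a Pearson correlation, which is only reasonable when n is large.

```python
import math


def approx_p_value(r: float, n: int) -> float:
    """Approximate two-sided p-value for a Pearson correlation r on n rows.

    Uses t = r * sqrt((n - 2) / (1 - r^2)) and treats t as standard normal,
    which is an approximation that is fine for large n.
    """
    t = r * math.sqrt((n - 2) / (1.0 - r * r))
    return math.erfc(abs(t) / math.sqrt(2.0))


r = 0.01  # hypothetical, practically negligible correlation
for n in (100, 10_000, 1_000_000):
    p = approx_p_value(r, n)
    # r * r is the share of variance explained; it does not change with n
    print(f"n={n:>9,}  p={p:.3g}  variance explained={r * r:.2%}")
```

The same tiny relationship goes from clearly non-significant at 100 rows to overwhelmingly significant at a million rows, while the share of variance it explains stays at 0.01%. The significance test is answering "is the effect exactly zero?", not "is the effect worth acting on?".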

Hopefully, when we do modeling, more people will reach for Occam's razor before dredging the data.


