Murphy Choy

To bin or not to bin. A silent debate.

In Uncategorized on June 22, 2011 at 11:00 am

Binning algorithms have been an integral component of credit scorecards and have been examined in great detail by many practitioners. The earliest reason for the widespread adoption of binning was the lack of computational resources for more complicated scoring. With modern computers running much faster than those of the fifties and sixties, computational resources are no longer as limited as they were in the past. However, binning algorithms are still widely used in the credit scorecard process.

One of the main reasons is that many people identify logistic regression as a model that was developed from categorical data analysis. With this in mind, most people assume that logistic regression can only handle categorical variables and that any continuous variable would be problematic. However, this is not the case: as mentioned in the book Applied Logistic Regression, the model can handle continuous variables with a slight alteration to the method of calculating some of the statistics.
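To make the point concrete, here is a minimal sketch in Python of a logistic regression fitted directly on a single unbinned continuous predictor. The data, the function name fit_logistic, and the plain gradient-ascent fitting loop are all assumptions made for illustration; this is not how a production scorecard package fits its models.

```python
import math

def sigmoid(z):
    """Logistic function mapping a linear score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, learning_rate=0.1, n_iter=5000):
    """Fit p(y=1|x) = sigmoid(b0 + b1*x) by gradient ascent on the mean log-likelihood."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(n_iter):
        # Residuals between observed outcomes and current predicted probabilities.
        residuals = [y - sigmoid(b0 + b1 * x) for x, y in zip(xs, ys)]
        # Gradient of the mean log-likelihood with respect to b0 and b1.
        grad_b0 = sum(residuals) / n
        grad_b1 = sum(r * x for r, x in zip(residuals, xs)) / n
        b0 += learning_rate * grad_b0
        b1 += learning_rate * grad_b1
    return b0, b1

# Hypothetical continuous predictor and binary outcome (made up for this sketch).
xs = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
ys = [0, 0, 1, 0, 1, 0, 1, 1]

b0, b1 = fit_logistic(xs, ys)
p_low = sigmoid(b0 + b1 * 0.5)   # predicted probability at the smallest x
p_high = sigmoid(b0 + b1 * 4.0)  # predicted probability at the largest x
```

The predictor enters the model as a raw number, with no grouping step, and the fitted slope b1 describes how the log-odds change per unit of x.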

Another key reason for using binning is that it increases the number of cases in each group. As with chi-square tests, modeling needs a suitable number of cases in each group, and binning is the fastest way to get there. Binning also helps to neutralize outliers, which are absorbed into the end bins instead of causing serious trouble.
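Both effects can be seen in a small sketch of equal-frequency binning in Python. The function name equal_frequency_bins and the income figures are hypothetical, and the rank-based rule used here is only one of several ways to bin; in particular it can split tied values across neighbouring bins, which a real scorecard tool would normally avoid.

```python
from collections import Counter

def equal_frequency_bins(values, n_bins):
    """Assign each value to one of n_bins bins holding roughly equal counts."""
    n = len(values)
    # Rank the observations from smallest to largest.
    order = sorted(range(n), key=lambda i: values[i])
    labels = [0] * n
    for rank, idx in enumerate(order):
        # Map rank 0..n-1 onto bin 0..n_bins-1.
        labels[idx] = rank * n_bins // n
    return labels

# Hypothetical incomes (in thousands) with one extreme value at the end.
incomes = [12, 15, 18, 21, 25, 28, 31, 35, 40, 900]

labels = equal_frequency_bins(incomes, 5)
counts = Counter(labels)  # number of cases falling in each bin
```

Every bin ends up with the same number of cases, and the extreme income of 900 simply shares the top bin with its nearest neighbour, so it no longer has any special leverage on the model.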

