Murphy Choy

Archive for June, 2011|Monthly archive page

Social Media and Analytics: Financial sector 3

In Uncategorized on June 30, 2011 at 11:00 am

How can social media be used to monitor market risk?

Social Media and Analytics: Financial sector 2

In Uncategorized on June 29, 2011 at 11:00 am

This audio blog discusses the application of social media and analytics for credit risk management.

Social Media, Marketing Analytic and Customers!

In Uncategorized on June 28, 2011 at 11:00 am

This is a simple audio blog on marketing analytics and how social media acts as an additional platform that feeds information into it.

Social Media, you and analytic!

In Uncategorized on June 27, 2011 at 11:00 am

This audio blog discusses some basics of social media and how it relates to your company as well as your life in general!

Barber and Surgeon? The differentiation.

In Uncategorized on June 24, 2011 at 11:00 am

Something interesting in the forums.

Certification is the hallmark of an expert? No!

In Uncategorized on June 23, 2011 at 11:00 am

This is my first audio blog. Hope you will like it.

To bin or not to bin. A silent debate.

In Uncategorized on June 22, 2011 at 11:00 am

Binning algorithms have been an integral component of credit scorecards and have been examined in great detail by many practitioners. The earliest reason for the widespread adoption of the binning approach was the lack of computational resources for more complicated scoring. With modern computers working at paces much faster than those of the fifties and sixties, computational resources are no longer as limited as in the past. However, binning algorithms are still widely used in the credit scorecard process.

One of the main reasons is that many people identify the logistic regression model as one that was developed from categorical data analysis. With this in mind, most people assume that a logistic regression model can only handle categorical variables and that any continuous variable would be problematic. However, this is not the case. As mentioned in the book Applied Logistic Regression, continuous variables can be handled by the model with a slight alteration to the method of calculating some of the statistics.
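To make the point concrete, here is a minimal sketch of a logistic regression fitted directly on one continuous predictor, with no binning at all. Everything in it is an assumption for illustration: the data is made up, the names `fit_logistic` and `predict` are hypothetical, and plain gradient ascent stands in for whatever fitting routine your software actually uses.

```python
import math


def predict(b0, b1, x):
    """Probability of the event for a continuous predictor value x."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))


def fit_logistic(xs, ys, learning_rate=0.1, epochs=5000):
    """Fit intercept b0 and slope b1 by gradient ascent on the log-likelihood."""
    b0 = b1 = 0.0
    n = len(xs)
    for _ in range(epochs):
        grad0 = grad1 = 0.0
        for x, y in zip(xs, ys):
            p = predict(b0, b1, x)
            grad0 += y - p        # derivative of the log-likelihood w.r.t. b0
            grad1 += (y - p) * x  # derivative of the log-likelihood w.r.t. b1
        b0 += learning_rate * grad0 / n
        b1 += learning_rate * grad1 / n
    return b0, b1


# Made-up, standardised continuous predictor (say, a utilisation ratio).
xs = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0]
# 1 = default, 0 = no default; deliberately not perfectly separable.
ys = [0, 0, 0, 1, 0, 1, 0, 1, 1]

b0, b1 = fit_logistic(xs, ys)
print("intercept:", round(b0, 3), "slope:", round(b1, 3))
print("P(default | x = -2):", round(predict(b0, b1, -2.0), 3))
print("P(default | x = +2):", round(predict(b0, b1, 2.0), 3))
```

The slope is estimated from the raw values, so the model uses the ordering and spacing of the predictor rather than a handful of group labels.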

Another key reason for the use of binning is the ability to increase the number of cases in each group. As with chi-square tests, there needs to be a suitable number of cases in each group for modeling, and binning approaches get us there the fastest. Binning also helps to neutralize outliers that might otherwise cause serious trouble, since an extreme value simply falls into the top or bottom bin.
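The two benefits above can be sketched in a few lines. This is only an illustration under stated assumptions: the income figures and the default rule are synthetic, the function names are hypothetical, equal-frequency binning is just one of many binning schemes, and the 0.5 added to each count is a smoothing choice of mine to avoid taking the log of zero.

```python
import bisect
import math


def equal_frequency_cuts(values, n_bins):
    """Cut points that split the values into n_bins groups of roughly equal size."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(i * n) // n_bins] for i in range(1, n_bins)]


def bin_index(value, cuts):
    """Index of the bin a value falls into; a value equal to a cut goes to the upper bin."""
    return bisect.bisect_right(cuts, value)


def weight_of_evidence(values, defaults, cuts):
    """Good count, bad count and smoothed weight of evidence for each bin."""
    n_bins = len(cuts) + 1
    good = [0] * n_bins
    bad = [0] * n_bins
    for value, is_default in zip(values, defaults):
        k = bin_index(value, cuts)
        if is_default:
            bad[k] += 1
        else:
            good[k] += 1
    total_good, total_bad = sum(good), sum(bad)
    woe = [
        # 0.5 is an assumed smoothing constant so sparse bins do not give log(0)
        math.log(((g + 0.5) / total_good) / ((b + 0.5) / total_bad))
        for g, b in zip(good, bad)
    ]
    return good, bad, woe


def synthetic_default(income):
    """Made-up rule: lower incomes default more often."""
    if income <= 25:
        return income % 2 == 0
    if income <= 50:
        return income % 5 == 0
    return income % 25 == 0


# Synthetic incomes 1..99 plus one extreme outlier.
incomes = list(range(1, 100)) + [10_000_000]
defaults = [synthetic_default(v) for v in incomes]

cuts = equal_frequency_cuts(incomes, 4)
good, bad, woe = weight_of_evidence(incomes, defaults, cuts)
for k in range(len(woe)):
    print("bin", k, "cases:", good[k] + bad[k], "bads:", bad[k], "WoE:", round(woe[k], 3))
print("outlier lands in bin", bin_index(10_000_000, cuts))
```

Every bin ends up with the same number of cases, and the outlier is just another member of the top bin instead of a point that drags a fitted slope around.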

Stepwise selection in regression? A poor choice

In Uncategorized on June 21, 2011 at 11:00 am

Stepwise selection is arguably the most common model selection technique in credit score building. It is also by far the most common approach to model building taught in college. However, the technique is not flawless and has serious drawbacks that demand closer examination by analysts.

The classic model selection techniques are all compromised by a flawed comparison process, akin to the problem faced in multiple comparison tests. Rather than getting into all sorts of tough stuff on this light-hearted blog, I suggest everyone pay a visit to one of the SAS papers online, which details many of the atrocities of the selection technique.

Have a look at David and Peter’s paper.
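For a quick feel of the multiple comparison problem, here is a small simulation sketch. It is not a full stepwise procedure and it is not taken from the paper; it only mimics the first step of forward selection, where the best of many candidates is tested at the usual 5% level. All predictors and the response are pure noise, the function names are hypothetical, and the critical value 2.01 is an approximate two-sided 5% t cutoff for 48 degrees of freedom.

```python
import math
import random


def familywise_error_rate(alpha, n_tests):
    """Chance of at least one false positive among n_tests independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def pearson_r(xs, ys):
    """Sample correlation between two equally long lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def any_noise_predictor_significant(rng, n_obs, n_predictors, t_critical):
    """True if at least one pure-noise predictor looks significant against a noise response."""
    y = [rng.gauss(0.0, 1.0) for _ in range(n_obs)]
    for _ in range(n_predictors):
        x = [rng.gauss(0.0, 1.0) for _ in range(n_obs)]
        r = pearson_r(x, y)
        t = r * math.sqrt((n_obs - 2) / (1.0 - r * r))
        if abs(t) > t_critical:
            return True
    return False


def simulate(n_trials=200, n_obs=50, n_predictors=20, t_critical=2.01, seed=1):
    """Share of trials in which noise alone yields a 'significant' predictor."""
    rng = random.Random(seed)
    hits = sum(
        any_noise_predictor_significant(rng, n_obs, n_predictors, t_critical)
        for _ in range(n_trials)
    )
    return hits / n_trials


print("theoretical rate for 20 tests at 5%:", round(familywise_error_rate(0.05, 20), 4))
print("simulated rate:", simulate())
```

With 20 candidates the chance of picking up at least one useless variable is far above the nominal 5%, and a stepwise routine repeats this kind of search at every step while reporting p-values as if only one test had been run.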

Another random rant from a SASer!

Fundamentals to model building? Think about it

In Uncategorized on June 20, 2011 at 11:00 am

A weekend spent SASing away usually yields better results for the blog. I was entertained by a particular posting online about the fundamentals of model building. The forum discussion was about a list of criteria to guide beginners through the steps of model building. Let us take a look at them (extracted from the site with editing).

  • Judgment and subjectivity should be minimized wherever possible.
    Models should be supported by enough detailed documentation such that they could be reconstructed by an independent third party and yield identical results.
  • Do not introduce elements to a model simply because they drive results that are more in line with the expectations of the business.
  • Models should be developed by qualified personnel with certified expertise in advanced statistical techniques and under the close supervision of a veteran modeling manager.
  • The model design should be firmly grounded in a generally accepted statistical theory.

These are a bunch of very interesting observations. Most are extremely mathematical and seem to satisfy the most stringent academic criteria. However, they seem overly restrictive and may be counterproductive.

The first criterion is very dangerous. Whatever the data tells you must make sense; if it does not, you will need to make a judgement call about the viability of the model. Failing that, you might want to reconsider the data that has been collected. Subjectivity is always present. Why do you use a regression model and not a decision tree? That is a subjective call. Perhaps you might use a performance measure, but then why not a neural network over regression? Too tough to understand? Again, another subjective call. As the modeling process goes on, it becomes apparent that subjectivity will creep into the model.

Do not introduce elements that are more in line with the business? No problem, except that the model then does not reflect reality! We always have to introduce elements that make the model representative of the business and that capture how the external environment influences it. What we should be careful of is introducing elements that reflect not the business but the whims and wiles of the business users.

Point three is a good choice. But can you define an expert? I believe this is very difficult, as there are just too many experts out there. Perhaps one should look at the work a person has done. How many models make one an expert? The more the merrier. Certified people may not always be good, although certification together with plenty of experience will most likely do.

The last one is a given. Who are we to deny it?

The above opinions are just random rants from a SASer!

Abuses of analytic: Poor Soul, Poor Analytic

In Uncategorized on June 17, 2011 at 2:18 am

Very often, people working in the field seem to see analytics as a panacea for every problem out there. However, this is definitely not the case, and it is rather interesting that the worst abuses occur in the places where it should work best.

A recent discussion with some students has turned up a very disturbing trend in the use of analytics. One poor soul suggested that we should use forecasting methods to predict where the opposition party members are and seek to eliminate them. I am sure that the leading party will agree that this is a good strategy, but it is one that will ultimately backfire on them.

Analytics carries some degree of uncertainty, which comes from the model being used to represent reality. This uncertainty is naturally controlled in environments free of human interference. However, humans do react to stimuli, and any model that predicts their behavior and in some way provokes that behavior will have artificially created a reaction.

This effect is most often seen in marketing campaigns, where the actual response rate is very low even though the predicted rate was supposed to be much higher. Thus the proper application of analytics is very important. I certainly hope this poor soul will not abuse analytics and cause unnecessary polarization of society through the naive use of a wonderful model.