Murphy Choy

Archive for September, 2011|Monthly archive page

Some issues with text analytics

In Uncategorized on September 30, 2011 at 7:09 am

With the rise of text analytics, it has come into widespread use in areas such as marketing, logistics, public relations and politics. However, with any widespread use, there are bound to be abuses. In today’s post, we will discuss some of these abuses and how they will eventually affect text analytics.

One of the most common uses of text analytics is content categorisation. This is often achieved through the use of archives of data, mainly documents showing similar characteristics or context. Most analytics software vendors market their ability to categorise textual information easily. However, that is a foolish claim. One of its key flaws is that it fails to recognise the difficulty of categorising information in a way that fits the context. At the same time, it underplays the importance of the corpus. This is a very dangerous game. Fully automated systems are generally not designed to use common sense. Artificial intelligence has improved remarkably over the past decades, but it still cannot use the complex learning models that human beings are capable of. The result is systems that are highly dependent on their built-in algorithms, which may lead to deviation from the truth. This is a key reason why a customised corpus is more useful than a standard one.
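
To make the point about the corpus concrete, here is a minimal sketch in Python. It is a toy word-overlap categoriser, not the method used by any particular product, and the two corpora, the category names and the sample sentence are all made up for illustration. It only shows that the same text can land in a different category depending on which corpus the system was given.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and return its alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())

def build_profiles(corpus):
    """Map each category to word counts taken from its example documents."""
    return {
        category: Counter(w for doc in docs for w in tokenize(doc))
        for category, docs in corpus.items()
    }

def categorise(text, profiles):
    """Pick the category whose profile overlaps most with the words in the text."""
    words = tokenize(text)
    scores = {cat: sum(profile[w] for w in words) for cat, profile in profiles.items()}
    return max(scores, key=scores.get)

# Hypothetical "standard" corpus: "apple" only ever appears as a fruit.
GENERIC = {
    "food": ["apple pie and apple juice recipes", "fresh fruit such as apple and pear"],
    "technology": ["new laptop and phone reviews", "software update for a phone"],
}

# Hypothetical customised corpus for an electronics retailer.
CUSTOM = {
    "food": ["fresh fruit and pie recipes"],
    "technology": ["apple laptop and apple phone reviews", "apple software update"],
}

text = "Customers complain about the latest apple update"
print(categorise(text, build_profiles(GENERIC)))
print(categorise(text, build_profiles(CUSTOM)))
```

With the generic corpus the complaint is filed under food, while the customised corpus files it under technology. The algorithm is identical in both runs; only the corpus changed.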

Another main issue with text analytics is its common association with sentiment analysis. At the same time, most users are not aware of some of the issues that plague sentiment analysis. The reason is the rather simplistic models that have been used to derive sentiment in most analyses. Currently, the models used are basically counts of positive and negative words in a statement. This results in situations where positive and negative emotions cancel each other out, which renders the analysis invalid. A more appropriate approach is to use lexical structures to understand the overall sentiment beneath each statement.
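
The cancellation problem is easy to reproduce. The sketch below is a toy word-count model in Python, with tiny made-up word lists standing in for a real sentiment lexicon. The second function is only a crude stand-in for the idea of looking at structure: it scores each clause separately by splitting on punctuation and the word "but", which is far simpler than a proper lexical analysis.

```python
import re

# Hypothetical toy lexicons, for illustration only.
POSITIVE = {"great", "good", "excellent", "friendly"}
NEGATIVE = {"terrible", "bad", "slow", "rude"}

def count_sentiment(text):
    """Return (positive count, negative count, net score) for a text."""
    words = re.findall(r"[a-z']+", text.lower())
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return pos, neg, pos - neg

def clause_sentiment(text):
    """Score each clause separately, splitting on punctuation and 'but'."""
    clauses = re.split(r"[,;.]|\bbut\b", text.lower())
    return [(c.strip(), count_sentiment(c)[2]) for c in clauses if c.strip()]

review = "The food was great but the service was terrible."
print(count_sentiment(review))   # whole-statement counts and net score
print(clause_sentiment(review))  # one score per clause
```

For this review the whole-statement count finds one positive and one negative word, so the net score is 0 and the statement looks neutral. Scored clause by clause, the same review gives +1 for the food and -1 for the service, which is much closer to what the customer actually said.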

In the next few posts, we will discuss text analytics further.