Know the immeasurable in data mining

Data mining offers several methods for analyzing data and drawing conclusions from it. The tools are a mixture of statistics, mathematics and reasoning, and they are very useful for reaching sound conclusions. For example, if you are in the business of selling a commodity to the youth through your website, then the best time to put your advertisement on the search engine is probably between 3 pm and 9 pm during the week. That is when most of the youth are likely to be at the computer searching for things after they return from school. This seems obvious, but such conclusions are drawn by looking at large datasets that record individuals' shopping behavior at different times of the day and then computing the best time to present the ad. Companies run analytics like this on large datasets to draw conclusions about the buying habits of shoppers, whether online or in the store.
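To make the "best time for the ad" idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the records are made-up numbers, and the names `records`, `conversion_by_hour` and `best_hour` are hypothetical. A real analysis would use far more data and would also check whether the differences are statistically meaningful.

```python
from collections import defaultdict

# Hypothetical records: (hour_of_day, ad_impressions, purchases).
# These numbers are invented for illustration, not real shopping data.
records = [
    (10, 1000, 8),
    (15, 1200, 30),
    (18, 1500, 45),
    (20, 1100, 22),
    (23, 900, 9),
    (18, 500, 15),
]

def conversion_by_hour(rows):
    """Sum impressions and purchases per hour, then return purchases per impression."""
    impressions = defaultdict(int)
    purchases = defaultdict(int)
    for hour, shown, bought in rows:
        impressions[hour] += shown
        purchases[hour] += bought
    return {h: purchases[h] / impressions[h] for h in impressions if impressions[h] > 0}

def best_hour(rows):
    """Return the hour with the highest purchase rate."""
    rates = conversion_by_hour(rows)
    return max(rates, key=rates.get)

rates = conversion_by_hour(records)
print(best_hour(records))
```

Note that the sketch can only rank hours by what was recorded. Anything that was never written into `records` cannot show up in the answer.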

However, several things that cannot be measured well still influence the marketplace and the behavior of shoppers. The same holds in other fields: in biology, for instance, there are factors that are not measurable, or are too complex to measure, but that do influence the conclusions.

It is important to understand the things that are being measured while paying attention to the things that are not being measured. For example, how does the lighting in the store appear on a bright day vs. a cloudy day? This is not something that is measured, but it does influence shopping behavior. Or consider how the checkout clerks know that not many people will show up on a particularly rainy or snowy day. The clerks know this, but it is probably not captured when head office mines sales data recorded by day, time and other factors.
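The rainy-day example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sales figures are invented, and the `rainy` field stands in for what the clerks know but the head-office data does not record. The helper `mean_sales` is a hypothetical name.

```python
from statistics import mean

# Hypothetical daily sales. "rainy" represents the clerks' knowledge,
# which is assumed to be missing from the head-office dataset.
days = [
    {"weekday": "Sat", "rainy": False, "sales": 120},
    {"weekday": "Sat", "rainy": True, "sales": 60},
    {"weekday": "Sat", "rainy": False, "sales": 130},
    {"weekday": "Sat", "rainy": True, "sales": 50},
]

def mean_sales(rows, keys):
    """Group rows by the given keys and return the mean sales per group."""
    groups = {}
    for row in rows:
        groups.setdefault(tuple(row[k] for k in keys), []).append(row["sales"])
    return {group: mean(values) for group, values in groups.items()}

# What head office sees: one average per weekday.
head_office_view = mean_sales(days, ["weekday"])
# What the clerks know: the weather splits that average in two.
clerk_view = mean_sales(days, ["weekday", "rainy"])

print(head_office_view)
print(clerk_view)
```

With only the weekday recorded, every Saturday looks like an average day that never actually happened. The spread in sales is still in the data, but its cause is not.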

To quote from Youngme Moon’s book “Different”: “If we only pay attention to things that we can measure, we will only pay attention to the things that are easily measurable.”

As data miners, we have to pay attention to the things that are mined, but pay special attention to the unknowns that influence the data. Without doing that, a data miner will miss much.

