Big Data is not always right – Fooled by analytics

By scientist, July 3, 2013

It is easy to be fooled by the way statistics interpret data. Sometimes the analysis of big data sets leads to conclusions that do not make sense. A correlation found in a big data analysis is also not the same thing as cause and effect: just because two things are correlated does not mean that one causes the other.

Take the example of Kaggle, which ran a contest in 2012 on the quality of used cars and the characteristics of those cars. A used car dealer supplied the data so that contestants could predict, from those characteristics, which cars were likely to have problems and which were not. A correlation analysis showed that cars painted orange were far less prone to defects, at about half the rate of other cars.

What has the color of a car got to do with its problems? Color has no causal connection to defects, and rightly so; this was just a chance event that the analysis pulled out. But once such a correlation between car defects and color has been found, the conclusions that can be drawn from it tend to get ridiculous:

• Paint your car orange to have fewer defects.
• Buy an orange car and it will last longer no matter how you treat it, so forget about the oil change.
• If you have an orange car, you do not need to maintain it.

These conclusions only get more tangled the more you rely on them. Even with the most sophisticated analysis, it is important to apply reason rather than believe everything that can be concluded from the data.
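To see how a chance event like the orange-car result can appear, here is a minimal simulation sketch. It does not use the Kaggle data: the color list, the number of cars per color, the 12% defect rate, and the function name `simulate_defect_rates` are all assumptions made up for illustration. Every color is given exactly the same true defect rate, so any difference in the observed rates is pure noise.

```python
import random


def simulate_defect_rates(colors, cars_per_color, true_defect_rate, seed=0):
    """Return the observed defect rate per color.

    Every color shares the same true defect rate, so differences
    between colors in the result come from chance alone.
    """
    rng = random.Random(seed)
    rates = {}
    for color in colors:
        # Each car is defective with the same probability, whatever its color.
        defects = sum(rng.random() < true_defect_rate for _ in range(cars_per_color))
        rates[color] = defects / cars_per_color
    return rates


# Hypothetical inputs, chosen only for illustration (not the contest data).
COLORS = ["white", "black", "silver", "red", "blue",
          "green", "grey", "gold", "brown", "orange"]

rates = simulate_defect_rates(COLORS, cars_per_color=60,
                              true_defect_rate=0.12, seed=2013)

best = min(rates, key=rates.get)
worst = max(rates, key=rates.get)

# Print colors from the lowest to the highest observed defect rate.
for color, rate in sorted(rates.items(), key=lambda kv: kv[1]):
    print(f"{color:>7}: {rate:.1%}")

print(f"'{best}' looks best and '{worst}' looks worst, "
      "yet every color has the same true defect rate.")
```

With only a few dozen cars per color, some color will usually look noticeably better than the rest. Change the seed and a different color tends to come out on top, which is a good hint that the pattern is chance and not a property of the paint.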

Data analysis and Big Data

