How Big Data Can Ruin True Statistics – Storagecraft
(This article was originally published at Access to Statistics, and syndicated at StatsBlogs.)
Big Data presents a lot of opportunities for information discovery. The world has begun creating billions of bytes of data, which can be analyzed and used for everything from marketing to scientific research. But as the saying often attributed to Voltaire goes, with great power comes great responsibility.
People have always been able to manipulate data and create true, yet absurd, statistics. But Big Data makes it even easier. According to a recent Wired article by Nassim Taleb, a professor of risk engineering at NYU, Big Data has brought cherry-picking to an industrial level. Although researchers and Big Data analysts can now understand and use information in new ways, it’s also easier for them to misuse it in new ways. With data sets this vast, statistical correlations can turn up simply because so many variables can be compared against one another, and not because the relationships are genuinely valid.
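To make the point concrete, here is a minimal Python sketch of how chance correlations pile up as a data set gains variables. It is not taken from the Wired article. The function names `pearson` and `count_spurious`, the sample size of 30 observations, and the 0.4 cutoff are all illustrative assumptions. Every variable is pure random noise, so any pair that crosses the cutoff is a coincidence by construction.

```python
import itertools
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def count_spurious(n_variables, n_observations=30, threshold=0.4, seed=0):
    """Count pairs of independent noise variables with |r| above threshold.

    Returns (number of pairs over the threshold, total number of pairs).
    All parameters are illustrative assumptions, not values from the article.
    """
    rng = random.Random(seed)
    # Each "variable" is a column of independent standard-normal noise.
    columns = [[rng.gauss(0, 1) for _ in range(n_observations)]
               for _ in range(n_variables)]
    pairs = list(itertools.combinations(columns, 2))
    hits = sum(1 for x, y in pairs if abs(pearson(x, y)) > threshold)
    return hits, len(pairs)


if __name__ == "__main__":
    for n_variables in (10, 50, 200):
        hits, pairs = count_spurious(n_variables)
        print(f"{n_variables:>4} variables, {pairs:>6} pairs: "
              f"{hits} exceed |r| > 0.4 by chance alone")
```

The number of pairs grows roughly with the square of the number of variables, so the count of "interesting" but meaningless correlations grows with it. An analyst who reports only the strongest pairs is cherry-picking, whether or not that is the intent.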