A Highly Recommended Article: What Data Can’t Do

Given the current emphasis on “big data,” it is important to recognize what data can and cannot do. David Brooks of the New York Times recently addressed this question in a very perceptive article. While acknowledging the benefits of gathering and analyzing huge amounts of data, one also needs to recognize its limitations. Brooks cites a number of things that he believes big data does poorly:

  • Data struggles with the social – “it’s foolish to swap the amazing machine in your skull for the crude machine on your desk.”
  • Data struggles with context – “Data analysis is pretty bad at narrative and emergent thinking, and it cannot match the explanatory suppleness of even a mediocre novel.”
  • Data creates bigger haystacks – “The haystack gets bigger, but the needle we are looking for is still buried deep inside.”
  • Big data has trouble with big problems – “For example, we have had huge debates over the best economic stimulus, with mountains of data, and as far as I know not a single major player in this debate has been persuaded by data to switch sides.”
  • Data obscures values – “Data is never raw; it is always structured according to somebody’s predispositions and values.”

Brooks makes a number of interesting points. I’m sure readers of this blog will have differing points of view!