It’s All Too Much: Data Existentialism

What to do? What to do? The data-fied world is just overwhelming. Within the span of one class discussion on how to define data, a seemingly infinite amount of data was generated. For a moment, everything became data to me: I started to think about the number of individual fibers in each separate strand of polyester yarn in the carpet; I wondered what the most frequently used word in that conversation would be; I thought about the changes in frequencies and amplification of the sounds made by the lights, projectors, computers, microphones, and HVAC; I considered the rough demographic breakdown of the institute participants and (of course) the breakdown of Mac versus PC users. Once I clambered out of this data rabbit hole, I fell into a deeper, messier one: what does all this data mean, and why do I need to know any of it? And, based on the next four hours of class and what I’ve read, it seems my momentary, existentialist data-crisis is typical of how art historians and humanists experience the use of data in their respective fields.

The most difficult piece of the data problem to overcome might be accepting that the use of data does not guarantee empiricism (and that’s good for those of us who like to work with nearly unanswerable questions!). We have been duped by data: the huge breadth of information and the “tidy” organizing structure of a data set give an illusion of precision and definitiveness. However, I know from my experience working on client-facing budgets for a wedding and event company that it is easy to make an Excel document say whatever you want it to say. Data is easily abstracted and fictionalized.

Any aversion to incorporating the analysis of large data sets into humanities research, for fear that it would take the wind out of a question, an object, or a subject, can be assuaged by the fact that the inverse is also true: data can be collected and combined to a point at which it becomes meaningless. The majority of the data collected about us is never put to any use, and if it is, there’s little guarantee that it leads to insight that tech companies, marketers, political parties, etc. can capitalize on. As an example: in the midst of a recent conversation with an insider at a prominent born-digital art website, a woman from the group asked if he (that is, the organization) could see how many times an image was shared, how many times a title was clicked, or how many times a user logged in. “Yes,” he said (and I’m paraphrasing here), “but at a certain point you realize that just because you have that information doesn’t mean it’s going to do you any good. We had to learn how to only assess the information that pertains to the company’s goals.” I’ve experienced the flutter of excitement over rows and rows of information in my own work, only to realize that the data pertaining to my goal was incomplete and would need to be assessed the old-fashioned way. All the leftover data just provided me with some fun “facts.” And to briefly take this point outside the art world, I’ll mention that the truths about data surplus come up regularly when I’m speaking to a certain crafty digital marketer that I know well.

My question (what to do?) is unanswered. I am no closer than I was a week ago to knowing what data I need to collect to answer my research questions. On the plus side, I feel liberated from certain confining assumptions about data and the uses of data analysis in my work, and I can rest comfortably knowing that data has been a latent and lurking presence within much art-historical scholarship. And as an added benefit, I am grateful to know that OpenRefine will be there for me when I need to clean and organize all the data I am bound to generate from dabbling in the quantified (but often qualitative) humanities.

What does it all mean…?

