Data Artifacts

The term “artifact” has at least two meanings. From a technical perspective, an artifact is an unintentional pattern in data, arising from processes of collection and management. From a cultural perspective, an artifact is a designed object, with a social and material history. At metaLAB, which is grounded in both technical and cultural methods, we are examining digital artifacts with both meanings in mind. Recently, we initiated a new project, entitled Data Artifacts, to develop visual methods of revealing the often-unacknowledged patterns in digital data that speak to the social and material history of its accumulation. Data is never raw; it always carries traces of human labor, intentions, and values. Data Artifacts is an inquiry into the deep history of digital collections.

Digital cultures, which devote vast resources to the harvesting and handling of data sets, can be understood in part through the particular ways in which they pattern data. Artists and designers with knowledge of computing are poised to uncover such data artifacts through visualization. Indeed, practitioners like Aaron Koblin, Ben Fry, Laura Kurgan, Carlo Ratti, Mark Hansen, and Ben Rubin are pioneering new forms of visual craft around digital data. These creative technologists leverage the traditions of art practice (visual composition, a hands-on maker culture, and a history of expressive precedents) to render data sets more accessible. Furthermore, their work exhibits an orientation towards the public domain, transitioning easily from research labs to museums and mass media. However, most formal approaches to visualization call for data to be filtered and standardized at the outset. In contrast, we focus on the heterogeneity inherent in human-made data. The messiness of data sets can tell us much about the history of their production.

We don’t have to look beyond our own university to see the mechanisms of data collection in motion. For example, we can learn from the artifacts that emerge in one of Harvard’s most commonly accessed digital resources: its open library data. Today, in 2012, there are over seventy libraries at Harvard, each with its own extensive collection. HOLLIS, the Harvard Online Library Information System, allows patrons to search for select volumes, but it does not afford panoramic views of the entire holdings or reveal macroscopic patterns in the acquisition, distribution, circulation, and citation of the university’s collections over time. The ambition of Data Artifacts is to develop new tools to contemplate such large-scale collection processes and enable richer discussions about their technical and cultural significance.