Intel/ISTC Workshop: New Forms of Data Work

On March 10, 2014, I hosted an Intel/ISTC workshop at the Intel Jones Farm campus on “New Forms of Data Work.” The event brought together researchers from multiple universities in the US and Europe and researchers from multiple divisions within Intel to discuss their research and generate conversation around emerging forms of data-centric work. Contact me for more information about the event.

Description:

While the social, cultural, ethical, and technical implications of big data science are an emerging topic of research, we lack empirically grounded studies of the individual practices, collaborative routines, and human assumptions that create mineable ‘big’ data resources. We know even less about the social and cultural implications of these emerging activities. Researchers have claimed that “data is the digital air in which we breathe” (Boyd & Crawford, 2011), but such statements overlook the fact that creating data resources often requires a large amount of situated, complex work. Currently, new forms of data work are emerging at all stages of the lifecycle of data, from infrastructure to entry to aggregation to computation. This work is both private and public, for free and for pay, voluntary and coerced. Such work includes, for example, laborious practices of culling, coding, and linking birth certificate data for large-scale research and quality analytics, monitoring and logging bodily data using personal sensors, “e-doctoring” patient data from a distance, and designing algorithms for music recommender systems. Additionally, individuals and firms are attempting to capitalize on the real and speculative potential of data resources by developing various kinds of data repositories, tools, and services. Ethnographic research on new forms of data work can illuminate the landscape of new developers and ecosystems for data creation at multiple levels, from the home to the workplace to large industry and public institutions. This workshop will provide an opportunity for Intel and Intel-affiliated researchers to present research on emerging forms of data work.

Program:

Theme 1: New forms of data work of the self and in personal life

Judith Gregory (UC Irvine)

Jamie Sherman (Intel)

Kathi Kitner (Intel)

 

Theme 2: New forms of organizational and occupational data work

Katie Pine (Intel)

Nancy Vuckovic (Intel)

Cory Knobel (UC Irvine)

Pernille Bjorn (IT University Copenhagen)

 

Theme 3: Developers and development of emerging data tools & services

Alex Campolo (New York University)

Dawn Nafus (Intel)

Nick Seaver (UC Irvine)

Matt Bietz (UC Irvine)

NSF Award

Exciting news! My research project on the micro-foundations of big data in healthcare organizations has been funded by the National Science Foundation. The project, “Creating a Data-Driven World: Situated Practices of Collecting, Curating, Manipulating, and Deploying Data in Healthcare,” will study the situated practices, human assumptions, and organizational routines that transform “little data” into mineable stores of “big data” harnessed for measures and metrics. Currently, our knowledge about the origins of big data and what goes into collecting, curating, manipulating, and deploying these huge information resources is limited. We know even less about the social and cultural implications of these activities, particularly in non-academic contexts. A growing group of scholars urges critical interrogation of the methods, analytical assumptions, and underlying biases of big data science. A nuanced understanding of the situated practices through which big datasets are assembled and manipulated is required before we can comprehend their social and political implications, particularly if we are to evaluate the quality of the scientific results based on the analysis and manipulation of such datasets.

The research will be carried out through a multi-sited ethnography of obstetrical data production in healthcare, an area where big data and associated metrics are both important and problematic. First, it will examine the situated practice and lived experience of creating the massive amounts of information that come to form the datasets. Second, it will trace how the results emerge through automated measures and algorithms and affect the very environments they are supposed to reflect. This research spans the lifecycle of data. It will investigate how information is collected by practitioners, clerks, and coders and transformed into local repositories of supposedly “clean” data to be manipulated by performance improvement specialists. It will then trace how information is transferred and refined further in a statewide data center and deployed by a major quality improvement organization. Finally, the research will follow the aggregated data back to the local hospitals themselves and assess how data visualizations and performance measures affect local decisions and hospital functioning.

The broader impacts of this project include both near- and long-term benefits. In the short term, this research will benefit the individuals and organizations struggling with questions about how to organize local resources to produce and deploy big data in service of management and performance improvement goals. In the long term, this research will generate foundational conceptual models that help to create design recommendations and practice guidelines regarding the social, ethical, and political implications of creating and using big data.

Collaborators:
Melissa Mazmanian
Mary Lowry
Chris Wolf