There has been a good deal of discussion of the relationship between the EPIC community and new practices of big data. Will the data scientists have the final word on what people value? Are we ethnographers effectively getting disrupted by cheaper and worse data? In a wider sense, what kind of a culture would we live in when stories of lived experience get increasingly sidestepped in favor of a newly re-empowered aggregate? Story would surely still matter, but the population of people in any position to tell stories with data would narrow drastically. This is not an inevitability, of course, and members of the EPIC community have written about reclaiming quantification in various ways (above, also contributions from Neal Patel and yours truly here).
It turns out we are not the only ones asking these larger questions. The Quantified Self community is too, albeit for different reasons. I began my research in quantified self, admittedly, because the name alone suggested some of my worst fears about what technology would enable. Indeed, the press coverage would lead you to believe that to be a self-tracker—which could involve wearing several wearables, or doing some microbiome testing alongside keeping track of food, say—is to rather doltishly follow pat advice provided by devices. Some even talk about a tech-reliant “infantilized self,” when people become incapable of making decisions without technology telling them what to do. As always happens in ethnography, once you actually talk to people you find out there is so much more to the story, and a perfectly good rationale for why people do what they do. Indeed, I now no longer self-track just for participant observation purposes, but for personal reasons too. This leads some to tease me that I’ve gone native (some=ken anderson), but going native helped me get through a recent injury that would have been much more difficult to deal with if I did not have that skill. I’ll live with the accusation happily.
Early in my research, a self-tracker and MD told me: “Clinical trials are great for figuring out the side effects of a drug for the first 500 people. The problem is, you could be number 501 and have a very different reaction.” For some in the QS community, data is about figuring out what is going on when the medical community does not consider your issue to be a real medical problem, or otherwise considers the bulk of what you need to do to be a matter of self-care, not clinical intervention. For others, it is a recognition that we are all, in a sense, number 501 in one way or another. Lifestyle advice proliferates faster than anyone can take it up. What “wellness” is to me is different than what it is to you. One size does not fit all, and so figuring out what new ‘healthy’ thing actually works for you, and what is realistic for you, is a very real problem.
How, in these circumstances, might a self-tracker interpret all the data she has collected? If the issue is what is right for you, then data interpretation can only meaningfully be done in context, not in aggregate. This raises more fundamental questions about representation. When data can be both individual, and potentially aggregated across many people, who does and does not have a say in what “the data” ultimately means? Even with an infinite N, there will be contested claims about who is an outlier, and what is and is not a real problem. If contested claims about what ultimately “counts” sounds eerily familiar to ethnographers, it might be because Quantified Self is facing a similar problem: contextual knowledge, which both ethnographers and self-trackers value, risks getting left by the wayside in institutional urges to aggregate.
Ethnographers often see quantification as a way of taking things out of context, and their role as one of putting the context back in. My research colleagues and I found in QS that numbers can also play an important role in grounding interpretation back into context, provided the interpreter and data generator are one and the same person. For example, when my own data showed that I consumed more calories on Mondays and Wednesdays—exactly the days when my partner taught later into the evening, and we decided to eat out—the temporal pattern created a link between calories and a lived life that could not be inferred from the data alone. Only I could know what “Mondays” meant. Of course one could “data-ify” this, and look for a way to track restaurant attendance, if one wanted. This solution is quite easy to spot when you already know what you are looking for, and harder when you don’t. As the late self-tracker Seth Roberts used to say, “you can’t falsify a hypothesis you don’t have.” People always know more than their data does, and are much better positioned to create those hypotheses than someone sitting behind a desk.
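To make the Monday and Wednesday example concrete, here is a minimal sketch of how a weekday pattern might be surfaced and then paired with context that only the self-tracker can supply. The numbers, the function name `mean_by_weekday`, and the `notes` dictionary are all hypothetical illustrations, not my actual data or the workings of any particular tool.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

def mean_by_weekday(entries):
    """Average a daily value (e.g. calories) for each weekday present."""
    buckets = defaultdict(list)
    for day, value in entries:
        # date.weekday() returns 0 for Monday through 6 for Sunday
        buckets[WEEKDAYS[day.weekday()]].append(value)
    return {name: mean(values) for name, values in buckets.items()}

# Hypothetical two weeks of logged calories (made-up values).
entries = [
    (date(2014, 3, 3), 2600), (date(2014, 3, 4), 2000),
    (date(2014, 3, 5), 2500), (date(2014, 3, 6), 2100),
    (date(2014, 3, 10), 2700), (date(2014, 3, 11), 1900),
    (date(2014, 3, 12), 2600), (date(2014, 3, 13), 2000),
]

# Context the data cannot supply: annotations written by the tracker.
notes = {
    "Monday": "partner teaches late, we eat out",
    "Wednesday": "partner teaches late, we eat out",
}

averages = mean_by_weekday(entries)
for name in WEEKDAYS:
    if name in averages:
        print(name, averages[name], notes.get(name, ""))
```

The grouping step can show that Mondays and Wednesdays stand out, but the explanation lives in `notes`, which has to be written by the person who lived those days.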
Through this work we realized that there were in fact plenty of people who were interested in getting beyond the canned, fixed representations of data provided by app makers, but were not necessarily interested in learning statistics or experimental science. As an anthropologist, I began to think about this lack of interest as also an insistence on re-valuing the situatedness of situated knowledge. That is, it is also a recognition that not all problems can be reduced to matters of scientific or positivist enquiry. At stake is a person’s biography, how a person thinks of themselves, and the problems they choose to solve. Self-trackers appropriate scientific practices for their own ends, but they also go beyond them.1 They turn to artistic and narrative practices as modalities of making sense of numbers, text, and visual forms, while comfortably tacking back and forth between these and science-inspired experimentation. More rarely, some do use the scientific aspects of what they’ve done to speak back to institutionalized research, but it is a speaking back with a difference, grounded in lived experience as much as the science.
I got together with a group of engineering colleagues to figure out what we could do to support people who are interested in asking questions of their own data, in order to help them find new ways of looking at it. We knew one simple way would not suffice, given what we know about the diversity of QS practices. There would have to be many different ways to slice and dice, and iterate until something jumped out. We took the approach that people could be trusted with some complexity (they lead very complex lives juggling and quantifying complex phenomena, after all) but that a data processing tool would have to explain its terms in non-specialist language. It would also have to pay attention to the cycles that most evoked context, which meant surfacing patterns in time as well as patterns in space. Finally, it would have to contain lots of opportunities for “human override” for when people know that their data is wrong, or needs to be contextualized through annotation or other means. Anthropologically, I grew quite excited about all this as an opportunity to explore, in material form, what it might look like to take up Latour’s call for visualization tools that surface matters of concern, not matters of fact. Data being processed is data neither raw nor cooked, and this seemed the moment to take that up. As this is a research project, and not product development, we were fortunate to be able to do so.
The resulting data processing tool, Data Sense, is now in beta. We welcome members of the EPIC community to explore it for themselves. I write about it, however, because it raises a number of questions for ethnographic praxis more generally, inside and outside of industry. What if, as the current academic fashion for Tarde would suggest, ethnography involved more numbers rather than fewer, but those numbers did a different sort of work? We might look at numbers not for their Durkheimian properties (a summation across a population) but as a living part of the relations we were studying. I’m not sure that our tool goes “full Tarde,” exactly, but it does provide scaffolding for people to assemble their own stories from numbers that also performed a role in those stories.
That scaffolding could be productive whether one is sitting alongside a researcher or not. Although we did not design it for this purpose, there is the possibility for ethnographers to use tools like this to jointly make sense of data with the people we study. That is, instead of the ethnographer’s role being that of a diffuse contextualizer of large datasets, we might be able to attempt, in data, the kind of representational equal footing that we also sought in text in the mid-1980s. It is, after all, “their” data, and “they” will have something to say about it as much as “we” do. Having a boundary object in data (whether textual, numerical or visual) could support our more traditional role of giving voice to the things people care about, particularly when others are not necessarily ready to hear them, while smoothing the translation between human lifeworlds and the world of data.
Finally, tools like this might also open up a broader conversation about data “cleaning.” Would we not re-value data “janitor work” if that data referred to ourselves, and we used our situational knowledge to throw out data that was patently false, and to reinstate missing data in ways that reflected our personal circumstances? One person’s “janitor work” is another’s curation of biography. I cannot help but wonder how much contextual knowledge is sanitized away by those more comfortable making guesses about what others intended.
As our world becomes ever more data rich, ethnographers could become allies in voicing the kinds of data stories that do need to be told, even when—or especially when—they don’t quite fit the rest of the bell curve.
1. This observation comes from Dana Greenfield. See Greenfield, Dana. (forthcoming). “Deep Data: Notes on the N of 1.” In Nafus, Dawn (ed) (forthcoming). Biosensing in Everyday Life: Getting a Sense of Things through Everyday Forms of Measurement. Cambridge, MA: MIT Press.