Advancing the Value of Ethnography

Big Data Anxieties: From Squeaky Dolphin to Normcore


Good afternoon, everyone. It is a complete pleasure to be here. Can I say a very special thank you to Timothy and to Ken for inviting me to join you here this year. It’s been really interesting to hear so many fantastic talks. My name is Kate Crawford. As you heard, I work at Microsoft Research and at MIT and at NYU, and so I’m one of those multi-affiliation people. What I research are spaces where data, algorithms, and people intersect. Over my research life, that has included things like looking at how flagging systems work on platforms, through to how we understand and interpret crisis data, through to high frequency trading algorithms. What I’ve been doing for, say, the last four or five years is looking at the politics and ethics of big data, specifically looking at how we might consider the epistemologies and ethical systems that are emerging with this new set of technological practices. But today I actually want to talk about something different. I want to talk about the affects of big data; in other words, how big data makes us feel. I want to try and answer this question:

What will the lived reality of big data feel like?

To try and answer this question, I’m going to do something a little bit unusual and so I hope you’ll bear with me. I’m going to try and bring together two very different documents. They were both released this year, and both had big cultural impacts but for very different reasons. One is from the British intelligence agency, GCHQ. The other is from the trend forecasters and marketers, K-Hole. By bringing these two documents together, what I’m hoping to do is generate some friction, and hopefully a little bit of tension around what might be the emotional effects of surveillance. I’m also going to talk a little bit about what role we might play as researchers, as ethnographers. By “we” I’m specifically also talking about those of us who collect data, be that interview data or social, mobile, and geolocative data, but that comes later.

First of all, let’s go to Squeaky Dolphin. We first found out about it back in January of this year. It is a truly Thomas Pynchon-esque type of code name which was devised by GCHQ to name their secret program, which was effectively extracting views of Facebook and YouTube and connecting that to unique usernames. Now, of course, that’s just one of many mass data collection programs that we found out about thanks to Edward Snowden. But what’s really interesting is this particular Squeaky Dolphin PowerPoint deck, which I think is telling us something quite different. It’s outlining essentially a very expansionist program to bring big data together with the more traditional approaches of the social and humanistic sciences.

Of course, GCHQ has a very spook type of name for it. It’s called the Human Science Operations Cell. It’s all about supplementing big data analysis with broader sociocultural tools from fields like, as you can see, psychology, anthropology, sociology, political science, biology, history, and economics.

You can see as the slides roll on, that the disciplinary fields proliferate from 7 up to 22, depending on whether or not you count the fact that ethnography appears twice up in the left-hand corner, which I think is very interesting. Also interesting to note is that deception and magic are prominently appearing there, which is definitely a talk for another time.

But I think that this doubling of ethnography is actually kind of fascinating. If this PowerPoint deck was a poker game, I’d call it a tell, because this profusion of techniques from the social and humanistic sciences is a clue about what is missing from big data. I want to suggest that in some ways this deck that I’m going to share with you, is in fact an extraordinary testament to anxiety. It’s a tale of anxiety that has been told in a lot of different ways, from network diagrams and social graphs, through to cartoons about the Arab Spring through to, of course, LOLCats. Yes, indeed, GCHQ I Can Has Cheezburger.

What’s so interesting about this deck, of course, is that you can almost use it as a type of PowerPoint karaoke. Has anybody here done PowerPoint karaoke? Yes, one, two—maybe three. For those of you who haven’t, it is a fantastic game to play after a couple of beers. You put up a set of completely unfamiliar slides from a totally different discipline and you have to try and adlib a talk. Let me recommend that the Squeaky Dolphin deck is going to be ideal for parties. Yes, it’s really good fun. I’m going to let them keep rolling behind me.

What we do know about GCHQ and the NSA is that they are the old gods of big data. Despite their enormous budgets and huge technical infrastructure and some would say almost extra-legal power, all of that data for them right now is not enough. They are reaching for these other disciplines and methodologies in order to make sense of the data that they have. They are urgently looking for types of context, what people were thinking and feeling, in order to understand the data. Now, some have referred to big data as representing the end of theory. I think what’s interesting about decks like these is that they show us that big data is a theory. It’s effectively an emerging Weltanschauung, a world view that is in itself still very emergent and changing. While this concept of big data can feel industrialized and concretized, it is very much in its infancy. Effectively, they’re still working out how to do it best.

What does big data even mean?

Well, obviously this term has been hyped beyond any sensible and precise meaning. Tom Boellstorff I think puts it very well when he says, “Even though there is no unitary phenomenon of big data, the impact of big data is nonetheless real and worthy of sustained attention.” And Lev Manovich back in 2010 tried to define big data. He used the idea of any dataset that used to be crunched on a supercomputer, but is now used on a desktop computer and the cloud, which if you think about it is a very shifting kind of definition.

What counts as big? Is it something that’s not small now or not small in five years, or in 50 years?

Obviously, this idea of big data is very historically, and I think in many ways, spatially contingent. In that sense, I think what I’m most interested in is how we think about big data as being basically sociotechnical, a set of critiques, techniques and practices that bring together developments in technology and analytics, as well as political, economic and social practices. And then I would also suggest mythology. There is a powerful mythology around big data at the moment. It basically spins its own ideas: that somehow, prima facie, more data is better, and that social data can give us direct insight into how people really think and feel.

Now, this kind of mythology is, of course, old news, as Donna Haraway wrote way back in 1983: “Myth and tool have always mutually constituted each other.” She describes what she saw as a sort of common move in the technological sciences of translating the world into a problem of coding where differences in the world are, and I quote, “submitted to disassembly, reassembly, investment and exchange, an informatics of domination.” I tend to think that Haraway’s work just nails this kind of common move. She reminds me that these tools of data gathering and analysis, from the dashboard to the scraper, are also agents that are shaping our understanding of the social world.

Bruno Latour, of course, put it differently. He said, “Change the instruments, and you change the entire social theory that goes along with them.” In this sense, the turn to big data is a social and political turn, and I’d argue that we’re just at the very beginning of starting to see its scope. But what I want to suggest today is that this lived reality of big data is ultimately suffused with a type of surveillant anxiety; the fear that all the data we’re shedding every day is somehow too revealing of our intimate selves, and that it may also at the same time misrepresent us.

Sianne Ngai used this lovely idea of anxiety being an always forward-looking emotion; that it is somehow like a fluorescent light in a dark room, at once showing too much and somehow not enough. But I think that surveillant anxiety is always a conjoined twin. The anxiety of those surveilled is deeply connected to the anxiety of the surveillers, but it’s very hard for us to see the anxiety of the surveillers. I mean, it’s generally tied up in classified documents, or mentioned in coded language in front of Senate committees, which is in some ways why Edward Snowden’s documents are so remarkable. It’s our first real insight into the kinds of obscured concerns of intelligence agencies.

Now, obviously there is this enormous structural power asymmetry between the surveillers and the surveilled, but not even those with the greatest power are free from being haunted by particular types of data anxiety: that no matter how much data they have, it’s always incomplete, and that the sheer volume can simply overwhelm critical signals in a fog of possible correlations, or as this slide suggests, that they’ll get stuck in a hollow point where people and moments and things can just escape. This is why I think intelligence agencies are now grasping for the tools of small data as an attempt to mask over the big data black holes. From the Boston bombings through to Malaysia Airlines Flight 370, we know that big data black holes exist. A plane full of passengers can indeed simply go missing.

I think that the bigger the data gets, the more small things can be overlooked. This is right now generating real problems for intelligence agencies. This comes directly from one of their more recent documents that was released. It’s a pretty direct admission. They’re effectively saying that all of this data is coming with enormous technical and ethical difficulties. I particularly like the way that “norms” is here in scare quotes, where the data is clearly outpacing the kinds of norms which have been established in terms of how you use it.

But this dark message has been delivered to us in that most prosaic of forms, the PowerPoint deck. Now, PowerPoint has been the corporate workhorse that has been the great conveyer of all of the Snowden leaks. I think that this one is my all-time favorite. It’s like the urtext of PowerPoint. Over time data goes up. How many times have you seen that? I think that everybody in this room has seen this slide in some form or another, but what is interesting is that PowerPoint’s affordances and limitations have shaped our very understanding of NSA’s practices. There is also something kind of modest and almost apologetic about the NSA slides. They’re often rushed. They’re never laid out by graphic designers, but by government employees on a deadline trying to explain why all of this data that they have is simply not enough. They’re trying to do that with clipart and drop shadows. That’s a tough task. I think that in this sense they’re also in some ways documents of workplace anxiety and inadequacy. I think for me the most significant deck, and certainly the most significant deck for my argument, is the next one because of its most resonant and uncertain question:

What can we tell?

Fittingly, half of the slide is blacked out, presumably for national security purposes, but perhaps it was always a black empty space. You can’t quite tell, because in many ways this is the most difficult question. I think that for me it helps us jump to the other side of the dialectic in the anxiety of what it is to be surveilled where the question becomes:

What can they tell about me?

So to see the consumer side of surveillant anxiety, we can turn to K-Hole. They are trend forecasters, and many of them come from marketing. They’re based here in NYC, and their work tends to straddle this gray zone between art and advertising. So where the NSA PowerPoints were sort of designed in a hurry, K-Hole’s slides are all about the kind of sly knowingness. In that sense, there is always a wink and a nudge. Its high street design is effectively reaching a type of in-joke perfectionism. So then even though these glossy PDFs can tend to look a little bit like spec work for a lifestyle marketing client, what I love about them is that they always have this kind of cultural criticism like a concealed weapon, if you really sit and look at them very closely. Let me give you an example.

When they released this report called Youth Mode, they launched the term normcore into the world. They wrote this about it, and I quote: “Having mastered difference, the truly cool attempt to master sameness.” They’re effectively suggesting that fitting in with the mainstream is the ultimate camouflage. Normcore is not about irony. It’s not about looking normal, but it’s about trying to find freedom by looking like everybody else, or in their terms, “in being nothing special.”

Their PDF depicts a smartphone user scrolling through emoji, themselves, if you think about it, a type of normcore system of emotion, a taxonomy of feelings that are laid out for you that you could just pick off of a grid menu. While terms like normcore can sound like the “lexicon of dickheads,” which is the term that Huw Lemmey (whom I’m quite fond of) has used, I think it’s actually part of something much smarter, which is K-Hole’s broader account of the complexities of life in an ever-present consumer market. In other words, they use this concept of normcore to gesture towards something which is much more ambiguous and interesting. I think that it captures this moment where mass surveillance meets mass consumerism. It reflects this dispersed anxiety of a population that wishes nothing more than to shed its own subjectivity.

Normcore, as it was invented by K-Hole, is this kind of eerily accurate model of cultural anxiety. For that very reason, I think it’s worth actually taking seriously as an idea for this particular time that we’re living in. But then something very strange happened. It was adopted within less than a week by the fashion press. Suddenly, they took this idea without its kind of sociological bite and now it’s being called something like the look of nothing, or so-called dress normal. An article in New York magazine by Fiona Duncan noted that art kids are now dressing in really boring clothes, and I quote, “I could no longer tell if my fellow SoHo pedestrians were art kids or middle-aged middle American tourists.” So cool is apparently now about nonironic t-shirts, New Balance sneakers, Nike caps, and really high-waisted Jerry Seinfeld jeans. Where K-Hole’s account of normcore was about adaptability and strategic misinterpretation, the fashion press made it more about how to champion generic looks and make high style out of dork wear.

I think that this rapid rise of the term normcore is in itself an indication that the cultural idea of disappearing is becoming cool at the very moment where it’s becoming almost impossible, and so blending in gives you enormous power in 2014. That’s particularly true if we think about the sorts of places where you might stand out; for example, if you are Rahinah Ibrahim, who was a PhD student at Stanford and was stuck on the no-fly list for 10 years because of a mistake, or if you are Robert McDaniel, a 22-year-old who was on a predictive heat list that the Chicago Police had produced, which meant that he would get people turning up at his door, knocking and saying, “We believe you might be somebody who could be involved in a crime,” even though he had at that stage committed absolutely no crimes at all.

In these contexts, who wants to stand out in a dataset?

I think here I tend to remember the other New York group that was strongly recommending dressing like a tourist. That was Occupy Wall Street. Indeed, Occupy Wall Street hosted civilians workshops, and this was all about dressing like tourists so that the police wouldn’t come and hassle you. This was actually an extremely effective, and I think very clever, tactic. So while Occupy Wall Street protestors were dressing like tourists to try and avoid a specific threat, normcore is meant to be much more dispersed and continuous: it is about being essentially permanently inconspicuous. So what was a temporary tactic has now become an ongoing strategy for K-Hole. What used to be about confusing the NYPD at the barricades has now become a matter of confusing both the fashion press as well as the sociotechnical matrices of always being seen, Instagrammed and tracked.

But being able to blend in or pass is a very exclusive form of privilege. In the words of Cat Smith who researches issues around disability, “the look of nothing is never going to be available to those who are marked as other. If your skin is the wrong color, if you have a strange name, or if you’re in a wheelchair.”

Who gets to be normcore? Who gets to be just one of the mass? Just one anonymized data point among millions?

Here I think Gap, in its most recent campaign that came out last week, is suggesting that Anjelica Huston is our model of normcoreness, or perhaps, in a more classic formulation, someone who has nothing to hide.

I think that there is a really rich history to answer these kinds of questions, and it’s coming from the sorts of things that I find most useful. They’re coming from ethnography. Here I’m thinking of the work of people like Virginia Eubanks, who did a long study of how particularly low-income communities shifted towards electronic benefit cards, where initially this was seen to be a very powerful step. You could go to the grocery store and you could buy food holding a piece of plastic like everybody else, which gives you a sense of respect. It gives you a sense of autonomy and that you’re fitting in. Of course, what these cards were doing was also connecting them to this extremely rigid and tight surveillance network, which goes hand-in-hand with those very communities also being exposed to higher forms of police and camera surveillance.

I think that in some ways, while these data tracking techniques are now being pushed out to include much more of the population, the greatest impact is still being felt by marginalized populations. In reality, while normcore can trumpet the value of being invisible, the only really invisible thing is the infrastructure of the data collection itself. I think that this invisibility matters, because as the artist who created Shadows of the Drone, James Bridle once said, “Those who cannot perceive the network cannot act effectively within it.”

I’m going to give you a recent example. This comes from Ukraine. This is back on January 21, when they had a massive street protest, and in fact, the protest was against laws restricting street protests. It’s possibly recursive given what happened next, which is that everybody at the protest received this message. For those of you who can’t read it, it says, “Dear Subscriber, you are registered as a participant in a mass disturbance.”

I think that in some ways this represents the real politics of big data. Let’s be clear. This is incredibly cheap to do. There is a fantastic paper actually that came out last year by Ashkan Soltani and Kevin Bankston that looked at the cost of mass surveillance techniques vs. actually having cops on a beat. What they showed was that it costs 6.5¢ an hour to monitor a person electronically (and I think now, in the most recent data, it’s actually less than that, about half), compared with around $275 an hour for in-person surveillance. That means that there is a really strong current of economic efficiency underlying this desire for big data to model the social world, and I think that’s true for both state and corporate actors. I think that it also has the potential to create a kind of anxiety that influences the idea of what democratic participation looks like.

After all, would you attend a protest if you knew that it was going to put you in these kinds of positions? Where you knew that your number was now being tracked? What about if you left your phone at home? Would you go to a music concert, for example?

This is a story that came from a year ago in Boston, a little bit closer to home. There was a big outdoor festival called Boston Calling. Thousands of people go. The Shins were playing, the Dirty Projectors were playing. All of the kind of so hot right now indie bands. It was packed. But what nobody knew is that that was the time that the city of Boston decided to run a pilot of a particular new surveillance system. What they were going to do is capture the faces of every single person who went to that concert and run particular forms of facial recognition against them, particularly looking at issues like skin color; whether they wore glasses; their possible age; and whether they were balding, which for some reason was very important, which is interesting. Nonetheless, nobody who went to this concert, including the promoters, had any idea that this was happening. No one signed a form. There were no signs. This was just a test that was being conducted by the city of Boston.

Again, I think that this starts to raise difficult questions about what it is to be in public space, when you’re living in these highly instrumented cities. We’re starting now to see the technology sector begin to respond to these kinds of anxieties. Initially, they have done so with a whole tranche of different apps. For example, Snapchat, Whisper and Secret. These apps offer you the potential to cloak who you are and to essentially keep your messages (unintelligible). Now, that’s an illusion, of course, to imagine that this constitutes real invisibility, but it does offer this way of participating without standing out. I find it particularly interesting that for Secret, the biggest uptake has been with venture capitalists and startup men in Silicon Valley. The reason why, of course, is that you can share secrets. You can say things about your fabulous lifestyle, but you can also drop a few tips about a company they might list, or throw a bit of shade against a competitor. In some ways, these are becoming spaces where the privileged want to blend in, while still being able to lash out.

If we take these twinned anxieties of the surveillers and the surveilled, and we push them through to their natural extension, what we reach is a kind of epistemological limit. On the one hand, there is the fear that there can never be enough data, and on the other the fear that you are standing out in the data. They reinforce each other in a kind of feedback loop that tightens with every turn of the ratchet. I think in this way, some people are now saying that anxiety is becoming the spirit of our age. Here I’m thinking of the British labor rights group Plan C, who released a fantastic manifesto called We Are All Very Anxious. What they do is they look at various stages of capitalism. They say that each one has a dominant affect. They argue that the one we’re currently living in is all about anxiety.

The thing about a dominant affect is that it functions as a kind of open secret. Everybody knows about it, but nobody really talks about it. Instead we see it bubbling up as kind of inside jokes or in a particular kind of artwork, or in coded expressions like this one. This is a quantified toilet. This debuted at the CHI conference earlier this year, and it offered to capture all of the urine of the conference attendees for analysis and to turn it into a data stream, I kid you not. Yes, it gets better! They then put up signs at all the toilets so that people knew that the urine was being collected. And if you then went to the website on the sign, you would see this. This is a live urinalysis of everybody at the conference updated by the second. It detects any recent drug or alcohol use, as well as the pregnancy status of all attendees. Now, this of course, is a prank. But it was convincing enough, and the technology is pretty much here and so it can actually be done. People were convinced. I’m presuming that those people were the ones who actually just held on until they went back to their hotels, sort of assuming a kind of model of data retention here that’s very different to the one that’s generally referred to. I think that it’s in those kinds of jokes and these kinds of provocative interventions that we can start thinking about the kinds of ethical challenges that we face. And by we, I mean us in this room.

What kinds of tools do we use to capture data? What kinds of questions should we ask about those tools?

I think that these are the kinds of heuristics that help us consider our role as essentially people who listen, whether we listen via traditional ethnographic techniques, qualitative methods, or in using forms of social data. I think that it raises the question of how we are listening, and whether or not we are actually eavesdropping. There is this long history in ethnography of thinking about power relations. I think that while they’re far from perfect, things like the consent form stand as a particular kind of document of how we conceptualize the agency of participants in research processes.

What happens when we’re using data from other less direct sources? Whether it’s through data traces, social media or through a sensor in a café or a department store?

This is where I think the ethics of listening becomes extremely complex, but to me really interesting. It’s these kinds of questions that haunt ethnographers and social scientists every time that we set up cameras and sound recording gear in public, but also when we scrape Twitter and Facebook data, or use Amazon Mechanical Turk workers in India to answer our questions and surveys.

Who gets to be the listeners? Who gets to choose when they no longer want to be a participant? What is the ethical frame that we bring when we use these kinds of sources?

Right now as people are thinking of effectively more ways to blend in—be it through normcore dressing or hardcore encryption—we are seeing the development of these increasingly intrusive forms of data collection. While I think that there is clearly considerable potential in this data, we are just at the very beginning of thinking about how it can and should be used. I tend to think that it’s an ideal historical moment for us to be having this kind of cross-industry, cross-disciplinary conversation about what data ethics is. In many ways, despite the anxieties produced by mass data collection I tend to be very hopeful about this. In particular, I see the greatest potential for collaboration between data scientists and ethnographers, propelled by exactly the kinds of conversations that I’ve been hearing in this room today. What I’m going to do now is throw the floor over to you in the hopes that we can start to answer the big question.

What will the ethics of listening be in these new data-heavy worlds?

Kate Crawford is a Principal Researcher at Microsoft Research, a Visiting Professor at the MIT Center for Civic Media, a Senior Fellow at NYU’s Information Law Institute, and an Associate Professor at the University of New South Wales. @katecrawford