Artificial intelligence (AI) has made huge strides recently in areas like natural language processing and computer-generated images – every other week seems to bring another breathtaking headline. Engineers, developers, and policymakers in the AI community are more seriously grappling with the fundamental risks that AI poses to society, like perpetuating unfair biases, putting privacy and security at risk, harming mental health, or automating tasks that provide livelihoods for people. As people flock to the fields of ‘responsible AI,’ ‘AI ethics,’ and ‘AI governance’ that are all about shaping AI towards what is helpful for humanity, it is time we ask: where are the ethnographers and applied anthropologists?
Many are doing ground-breaking work in AI and reporting back to the EPIC community (see here, here, here, also here for just a few examples). But we are still too often narrowly identified with the role of delivering insights for UX and product design. If we want to influence the AI revolution that is sweeping the business world, it’s time we do more work with a new set of stakeholders, and with new types of outputs.
The discipline that studies people should have a substantial impact on the practice of making machines mimic (or supersede) the ways humans think and perceive the world. But this requires a few crucial shifts:
- First, that we strive for greater literacy in how AI systems work.
- Second, that we move beyond critique of, fear of, or disengagement from AI, and build new partnerships so we can get involved earlier in the AI development process – not just fixing the problems or testing usability.
- Third, that we learn to translate ethnographic data and findings into outputs that are legible and relevant to a machine learning context: data labels, labeling protocols, proxies, success criteria, to name a few.
Here are three concrete ways we can get started shaping AI systems as a force for good:
1. Help engineers build the small datasets that machine learning algorithms are trained on
A few years ago, ‘big data’ was all the rage, data was ‘the new oil,’ and the general belief seemed to be that the ‘solution’ to better AI was simply bigger datasets. Many of us argued for the value of thick data, and for quality over quantity. Well, now ‘small data’ is in again.
The advent of synthetic data means that for many use cases, engineers can start from a small, high-quality dataset and, with relative ease, generate a vast and much more diverse set of training data. More broadly, there’s a growing movement of data-centric AI, with leading voices in the AI community arguing that the algorithms we build are only as good as the datasets we have, and that it’s often more efficient to have one very high-quality small dataset than to have tons of questionable-quality data.
But how will engineers get those high-quality small datasets? There is an opportunity for ethnographers to help the developers of AI algorithms define, collect, and evaluate the highest quality small datasets available, based on a deep understanding of the factors that are really at play in anything that the algorithm has been built to do – the kinds of factors that are usually hard to detect unless you’ve spent time embedded in the context you’re trying to understand, and the kinds of factors that if you miss or get wrong, the whole endeavor fails.
For instance, data scientists work to detect and deter credit card fraud by poring over the traces of fraud left in credit card transactional data. But they lack a first-hand understanding of what it is like to be a credit card fraudster. Good ethnography, leaning on the sociology of deviance, can help build datasets that draw from better sources and minimize bias and noise (such as the right pairing of real-estate data and traditional transactional data).
We can weigh in on success metrics too – what does it mean for a machine learning algorithm to ‘get it right’ from the perspective of the humans who will feel the direct impact of what the algorithm does? To build high-quality, small datasets that are useful to engineers, ethnographers ought to not only make theoretical sense of the situations people are in, but also capture, down to the smallest details, what happens. There is perhaps no higher quality dataset than the videos, photos, and text captured during ethnographic research in which the ethnographer has deep knowledge of the context.
2. Help policymakers and leaders set the standard for responsible AI
There are many biases and unintended consequences that AI developers and policymakers already know they should look out for and correct, and policies or regulations in the works to ensure those corrections are made. But what about unknown biases and unknown consequences?
There cannot be responsible AI without a deep social science understanding of the role and impact of AI in society. Ethnography is about uncovering and making sense of biases, hidden structures, dynamics, and systems of influence. All too often, well-meaning AI ethics boards, policy think tanks, or governing bodies are made up primarily of people who are experts at combatting already-known sources of bias, inequity, or harm (such as the many well-documented ways in which race or gender biases in underlying datasets perpetuate real-world inequities), but who may not be equipped to do the difficult work of identifying less obvious sources of bias or harm from the ground up.
Take synthetic data in particular: it holds the promise of greater anonymity and privacy protection in datasets, and the ability to detect and correct for biases in the original datasets, by creating more, synthetically, of what’s under-indexed. But it also has the potential to perpetuate a snapshot of the world that is inaccurate or stagnant. If we think back to the rise of big data, when large-scale datasets became readily available, suddenly many business leaders and decision-makers turned to these as the ultimate source of truth. Ethnographers experienced this firsthand, having to explain the limits of big data in solving certain kinds of problems. Well, now synthetic data promises to be even more readily available, relatively cheap, and able to replicate even the kinds of data that otherwise would be too expensive or dangerous to obtain. Because it is comparatively cheap and easy, it’s likely to become the new default, off-the-shelf tool for decision-making, used even in situations where people know it probably isn’t appropriate. There’s an opportunity for applied ethnographers (and social scientists more broadly) to not only make sure synthetic data is built from the right high-quality real-world datasets, but also to help leaders set up ethical guardrails and clear business cases for when to use synthetic data, and when to avoid it.
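For readers who want to see what ‘creating more of what’s under-indexed’ can look like in practice, here is a minimal Python sketch. It is an illustration only, not how any particular synthetic data product works: the function name `oversample_minority`, the field names, and the toy records are all hypothetical, and the technique is a simple interpolation between pairs of real records, loosely inspired by oversampling methods such as SMOTE.

```python
import random
from collections import Counter


def oversample_minority(records, group_key, numeric_keys, seed=0):
    """Add synthetic rows until every group is as large as the largest one.

    Each synthetic row is a random blend of two real rows from the same
    group, so its values always lie between values that were observed.
    """
    rng = random.Random(seed)

    # Bucket the real rows by group (hypothetical field, e.g. "group").
    by_group = {}
    for row in records:
        by_group.setdefault(row[group_key], []).append(row)

    target = max(len(rows) for rows in by_group.values())
    augmented = list(records)  # leave the caller's list untouched

    for group, rows in by_group.items():
        for _ in range(target - len(rows)):
            a, b = rng.choice(rows), rng.choice(rows)
            t = rng.random()  # blend weight in [0, 1)
            synthetic = {group_key: group, "synthetic": True}
            for key in numeric_keys:
                synthetic[key] = a[key] + t * (b[key] - a[key])
            augmented.append(synthetic)

    return augmented


# Toy data (invented for illustration): rural respondents are under-indexed.
records = [
    {"group": "urban", "age": 25, "monthly_spend": 300},
    {"group": "urban", "age": 31, "monthly_spend": 450},
    {"group": "urban", "age": 42, "monthly_spend": 520},
    {"group": "urban", "age": 37, "monthly_spend": 610},
    {"group": "urban", "age": 29, "monthly_spend": 380},
    {"group": "urban", "age": 55, "monthly_spend": 700},
    {"group": "rural", "age": 48, "monthly_spend": 210},
    {"group": "rural", "age": 63, "monthly_spend": 160},
]

balanced = oversample_minority(records, "group", ["age", "monthly_spend"])
print(Counter(row["group"] for row in balanced))
```

Note what the sketch cannot do: every synthetic value falls between values that were already observed, so the balanced dataset can only echo the snapshot it was built from. If the two real rural records are unrepresentative, the synthetic ones will be too, which is why the quality of the original small dataset matters so much.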
3. With designers and product developers, go beyond features and functionality, and start talking about the relationship between people and algorithms
Social scientists and ethnographers study relationships – whether between people, communities, or people and the things and places with which they interact. The relationship we should now be studying, and helping to shape in our work, is the one between people, specifically the ‘end users’ of AI-powered technology or software, and the algorithms that fuel that technology.
This is especially the case with content-recommendation engines, the kind that suggest what to watch or do or buy online, based on what we’ve liked and done before. Increasingly, people are aware of how these algorithms shape their preferences and tastes. But people also shape the algorithm – it learns based on our actions. And people are becoming increasingly aware of this too – changing what they like, how they use hashtags, how they search – in order to influence the content they encounter.
There is a relationship forming between user and algorithm, and all too often this relationship is described as harmful and toxic. In a study we conducted with QAnon formers (people who described themselves as having felt a sense of belonging to, or belief in, aspects of the QAnon phenomenon), people knew they should stop watching the sensational and horrible stream of content that was recommended to them in their social feeds – by algorithms and ‘peers’ they had made. They described a feeling that the algorithm was creating a negativity spiral for them that was addictive and hard to break free from.
Ethnographers can help people to build better relationships with the algorithms they encounter and interact with daily (or help designers and product developers to do so). We can study the relationships between people and algorithms with the same ethnographic toolkit used to study other forms of relationships. We can uncover and advocate for the tools, features, and design principles of a healthy human-algorithm relationship, or even help to one day build algorithms that are in the service of the user rather than just of Big Tech. We can demonstrate why platform developers and policymakers should care, because we can find evidence in the field of the ways in which the broken relationship is leading people to disengage online, or to feel worse when they leave the platform than when they logged in.
Ethnographers are good at making careful, considered, complex observations about the world, and we can turn those observations into impact. Whether it’s building towards a greener planet or a more equitable society, we are attuned to the biggest challenges today, and strive to solve them in human-centered ways.
One of our key focuses now as a group of practitioners needs to be on shaping the development of artificial intelligence. When we begin to see the similarities between ethnography and AI development – the value in small high-quality datasets, the uncovering of biases and unintended consequences, the importance of the role of relationships – we begin to see a way forward for the community of ethnographers in shaping a future in which AI is more helpful and attuned to humanity.