data science

Tutorial: Doing Ethnography of Data Science & Algorithmic Systems

Instructor: IAN LOWRIE. Description: Approx. 1 hr 43 min. This video presents the lecture portion of a half-day tutorial. Case studies and a bibliography are provided for your use. Instructor Ian Lowrie describes the organizational and technological aspects of modern data pipelines, framing data science ethnographically as a knowledge practice and data scientists as a particular kind of expert. He also explores methodological approaches to studying data work in real-world contexts. Participants learned to: think ethnographically about data work as a knowledge practice; develop methodological strategies for studying data work; chart the organizational and technological components of data infrastructure; interpret the mindset, jargon, and practical orientations of their data scientist and developer colleagues; and understand how algorithmic systems and data analytics impact organizational structures, work practices, and business models. In the second half of the tutorial, participants worked collaboratively to develop a pitch for...

How Is Evidence Created, Used & Abused? EPIC2018 Opening Remarks

by DAWN NAFUS (Intel), EPIC2018 Co-chair We chose Evidence as the EPIC2018 theme in part to explore this question of why some things constitute evidence and not others. There are lots of factors we could point to, but since I’m standing next to a data scientist the first one I’ll talk about is digitization. Digitization changes how people live, and it creates forms of evidence about people’s lives that we need to reckon with methodologically. Many of us are in the thick of organizations that handle some complicated datasets, traces of people and their environments, and so on. We’ve got to figure out how to engage with them, and I think that means we need new approaches if we are going to meaningfully intervene. The toolbox of user experience is only going to get us so far. So we’re going to need some friends, particularly those data scientists who are, like us, committed to the idea that datasets ought to be moored in some kind of social reality, and that they can’t just be built based on what’s expedient at the time. While...

Just Add Water: Lessons Learned from Mixing Data Science and Design Research Methods to Improve Customer Service

OVETTA SAMPSON IDEO Chicago and DePaul University Case Study—This case study provides an inside look at what occurs when methods from the data science and ethnographic fields are mixed to solve perennial customer service problems within the call center and cruise industries. The paper details how this particular blend of ethnographic practitioners with a data scientist resulted in changes to design approaches, debunking myths about qualitative and quantitative research methods being at odds and altering team member perspectives about the value of both. The project also led to the creation of innovative blended design research and data science methods to discover and leverage the right customer data to the benefit of both the customer and the call center agents who serve them. This paper offers insight into the untold value design teams can unlock when data scientists and ethnographers work together to solve a problem. The result was a design solution that gives a top-performing company an edge to grow even better by leveraging the millions...

Designing for Interactions with Automated Vehicles: Ethnography at the Boundary of Quantitative-Data-Driven Disciplines

MARKUS ROTHMÜLLER School of Architecture, Design and Planning, Aalborg University Copenhagen, Denmark and Shift Insights & Innovation Consulting PERNILLE HOLM RASMUSSEN School of Architecture, Design and Planning, Aalborg University Copenhagen, Denmark SIGNE ALEXANDRA VENDELBO-LARSEN School of Architecture, Design and Planning, Aalborg University Copenhagen, Denmark Case Study—This case study presents ethnographic work in the midst of two fields of technological innovation: automated vehicles (AV) and virtual reality (VR). It showcases the work of three MSc. Techno-Anthropology students and their collaboration with the EU H2020 project ‘interACT’, sharing the goal to develop external human-machine interfaces (e-HMI) for AVs to cooperate with human road users in urban traffic in the future. The authors reflect on their collaboration with human factor researchers, data scientists, engineers, experimental researchers, VR-developers and HMI-designers, and on experienced challenges between the paradigms of qualitative and quantitative...

Human-Centered Data Science: A New Paradigm for Industrial IoT

MATTHEW YAPCHAIAN Uptake Few professions appear more at odds, at least on the surface, than ethnography and data science. The first deals in qualitative “truths,” gleaned by human researchers, typically based on careful, deep observation of only a small number of human subjects. The latter deals in quantitative “truths,” mined through computer-executed algorithms, based on vast swaths of anonymous data points. To the ethnographer, “truth” involves an understanding of how and why things are truly the way they are. To the data scientist, “truth” is more about designing algorithms that make guesses that are empirically correct a good portion of the time. Data science-driven products, like those that Uptake builds, are most powerful and functional when they leverage the core strengths of both data science and ethnographic insights: what we call Human-Centered Data Science. I will argue that data science, including the collection and manipulation of data, is a practice that is in many ways as human-centered and subjective...

Human Sensemaking in the Smart City: A Research Approach Merging Big and Thick Data

ANNELIEN SMETS imec-SMIT, Vrije Universiteit Brussel BRAM LIEVENS imec-SMIT, Vrije Universiteit Brussel This paper aims to contribute to the debate on the integration of ethnography and data science by providing a concrete research tool to deploy this integration. We start from our own experiences with user research in a data-rich environment, the smart city, and work towards a research tool that leverages ethnographic praxis with data science opportunities. We discuss the different key components of the system, how they work together and how they allow for human sensemaking....

Contextual Analytics: Towards a Practical Integration of Human and Data Science Approaches in the Development of Algorithms

MILLIE P. ARORA MIKKEL KRENCHEL JACOB MCAULIFFE ReD Associates POORNIMA RAMASWAMY Cognizant As algorithms play an increasingly important role in the lives of people and corporations, finding more effective, ethical, and empathetic ways of developing them has become an industry imperative. Ethnography, and the contextual understanding derived from it, has the potential to fundamentally change the way that data science is done. Reciprocally, engaging with data science can help ethnographers focus their efforts, build stronger and more precise insights, and ultimately have greater impact once their work is incorporated into the algorithms that increasingly power our society. In practice, building contextually-informed algorithms requires collaboration between human science and data science teams who are willing to extend their frame of reference beyond their core skill areas. This paper aims to first address the features of ethnography and data science that make collaboration between the two more valuable than the sum of their respective...

Who and What Drives Algorithm Development: Ethnographic Study of AI Start-up Organizational Formation

RODNEY SAPPINGTON Founder, CEO, Acesio Inc. LAIMA SERKSNYTE Head of Behavioral and Organizational Research, Acesio Inc. The focus of this paper is to investigate deep learning algorithm development in an early stage start-up in which edges of knowledge formation and organizational formation were unsettled and contested. We use a debate by anthropologists Clifford Geertz and Claude Levi-Strauss to examine these contested computational forms of knowledge through a contemporary lens. We set out to explore these epistemological edges as they shift over time and as they have real practical implications in how expertise and people are valued as useful or non-useful, integrated or rejected by the practice of deep learning algorithm R&D. We discuss the nuances of epistemic silences and acknowledgments of domain knowledge and universalizing machine learning knowledge in an organization that was rapidly attempting to develop algorithms for diagnostic insights. We conclude with reflections on how an AI-Inflected Ethnography perspective...

ReHumanizing Hospital Satisfaction Data: Text Analysis, the Lifeworld, and Contesting Stakeholders’ Beliefs in Evidence

JULIA WIGNALL Seattle Children's Hospital DWIGHT BARRY Seattle Children's Hospital Case Study—Declining clinician engagement, increasing rates of burnout, and stagnant patient and family experience scores have led hospital leadership at Seattle Children's Hospital to submit requests to a data scientist and an anthropologist to identify key themes of survey comments and provide recommendations to improve experience and satisfaction. This study explored ways of understanding satisfaction as well as analytic approaches to textual data, and found that various modes of evidence, while seemingly ideal to leaders, are hard pressed to meet their expectations. Examining satisfaction survey comments via text mining, content analysis, and ethnographic investigation uncovered several specific challenges to stakeholder requests for actionable insights. Despite its hype, text mining struggled to identify actionable themes, accurate sentiment, or group distinctions that are readily identified by both content analysis and end users, while more...

Acting on Analytics: Accuracy, Precision, Interpretation, and Performativity

JEANETTE BLOMBERG IBM Research ALY MEGAHED IBM Research RAY STRONG IBM Research Case Study—We report on a two-year project focused on the design and development of data analytics to support the cloud services division of a global IT company. While the business press proclaims the potential for enterprise analytics to transform organizations and make them ‘smarter’ and more efficient, little has been written about the actual practices involved in turning data into ‘actionable’ insights. We describe our experiences doing data analytics within a large global enterprise and reflect on the practices of acquiring and cleansing data, developing analytic tools and choosing appropriate algorithms, aligning analytics with the demands of the work and constraints on organizational actors, and embedding new analytic tools within the enterprise. The project we report on was initiated by three researchers: a mathematician, an operations researcher, and an anthropologist well-versed in practice-based technology design, in collaboration...

How Modes of Myth-Making Affect the Particulars of DS/ML Adoption in Industry

EMANUEL MOSS CUNY Graduate Center / Data & Society FRIEDERIKE SCHÜÜR Cloudera Fast Forward Labs The successes of technology companies that rely on data to drive their business hint at the potential of data science and machine learning (DS/ML) to reshape the corporate world. However, despite the headway made by a few notable titans (e.g., Google, Amazon, Apple) and upstarts, the advances that are advertised around DS/ML have yet to be realized on a broader basis. The authors examine the tension between the spectacular image of DS/ML and the realities of applying the latest DS/ML techniques to solve industry problems. The authors discern two distinct ways, or modes, of thinking about DS/ML woven into current marketing and hype. One mode focuses on the spectacular capabilities of DS/ML. It expresses itself through one-off, easy-to-grasp marketable projects, such as DeepMind’s AlphaGo (Zero). The other mode focuses on DS/ML’s potential to transform industry. Hampered by an emphasis on tremendous but as yet unrealized...

The Stakes of Uncertainty: Developing and Integrating Machine Learning in Clinical Care

MADELEINE CLARE ELISH Data & Society Research Institute The widespread deployment of machine learning tools within healthcare is on the horizon. However, the hype around “AI” tends to divert attention toward the spectacular, and away from the more mundane and ground-level aspects of new technologies that shape technological adoption and integration. This paper examines the development of a machine learning-driven sepsis risk detection tool in a hospital Emergency Department in order to interrogate the contingent and deeply contextual ways in which AI technologies are likely to be adopted in healthcare. In particular, the paper brings into focus the epistemological implications of introducing a machine learning-driven tool into a clinical setting by analyzing shifting categories of trust, evidence, and authority. The paper further explores the conditions of certainty in the disciplinary contexts of data science and ethnography, and offers a potential reframing of the work of doing data science and machine learning as “computational...

Humans Can Be Cranky and Data Is Naive: Using Subjective Evidence to Drive Automated Decisions at Airbnb

STEPHANIE CARTER Airbnb RICHARD DEAR Airbnb Case Study—How can we build fairness into automated systems, and what evidence is needed to do so? Recently, Airbnb grappled with this question while brainstorming ways to re-envision how hosts review guests who stay with them. Reviews are key to how Airbnb builds trust between strangers. In 2018 we started to think about new ways to leverage host reviews for decision making at scale, such as identifying exceptional guests for a potential loyalty program or notifying guests who need to be warned about poor behavior. The challenge is that the evidence available for automated decisions, the star ratings and reviews left by hosts, is inherently subjective and sensitive to the cross-cultural contexts in which it was created. This case study explores how the collaboration between research and data science revealed that the underlying constraint for Airbnb to leverage subjective evidence is a fundamental difference between ‘public’ and ‘private’ feedback. The outcome of this integrated,...

Below the Surface of the Data Lake: An Ethnographic Case Study on the Detrimental Effect of Big Data Path Dependency at a Theme Park

JACOB WACHMANN ReD Associates ANDREAS JUNI ReD Associates DAVE BAIOCCHI ReD Associates WILLIAM WELSER IV ReD Associates Case Study—This case study details how a team of anthropologists and a team of data scientists sought to help a Middle Eastern theme park make use of its big data platform to measure ‘the good customer experience’. Ethnographic research within the theme park revealed that visitors yearned to bond with the other members of their group, as they rarely got the chance during their busy everyday lives back home. However, trying to build a measurement of how the theme park delivered on bonding – through the development of a ‘bonding index’ – turned out to be unfeasible, because the big data platform focused on capturing operational data. The decision to focus on operational data had unintentionally created a path dependency that made the big data setup unfit for answering some of the theme park’s most fundamental questions. This is a problem ReD Associates has observed across clients and to solve it this...

Revitalising Openness at Mozilla: A Mixed Method Research Approach

RINA TAMBO JENSEN Mozilla Case Study—This is a case about how Mozilla, the open source browser company, set out to reconnect with ‘collaborating in the open’ to regain its competitive advantage. This case describes how a multi-disciplinary research team used ethnographic, market, and data analysis to articulate and clarify the problem, and build a strategy towards revitalizing Openness at Mozilla. It aims to prove that the subsequent change could only have been accomplished by a mixed method research approach. Importantly, it also shows how the team used data to prove the distribution of findings, coupled with ethnography to shine light on the why and how of those findings. The case study will do this by discussing the key insights and how these fueled recommendations and subsequent change in the organisation. The project presented many problems: from convincing stakeholders of the need to fully explore the problem, to connecting widely different research methods and gleaning insights that built strongly on all strands...