Advancing the Value of Ethnography

Video Utterances: Expressing and Sustaining Ethnographic Meaning through the Product Development Process




In this paper we discuss the use of short, specific videos to communicate ethnographic data throughout the product development process. Ethnographic videos of this nature provide complex information in short “utterances” (zero to three minutes) that researchers use to effectively convey local meaning to other participants in the process. Video utterances can be used to create opportunities for participation in product ideation, recognize key features and identify problems during product testing. With proper scaffolding, the video utterances are an effective means of contextual representation proving to be quick, direct and influential with product development teams. Using video of this kind impacts the product as the local, ethnographic meaning is sustained throughout development.


Ethnographic practice is concerned with the understanding of local meaning juxtaposed against one or more theoretical constructs. In an industrial setting, ethnographic practice is often concerned with the understanding of local meaning against the theoretical construct of corporations. However, in a corporate setting, when the output is not a theoretical discussion, but rather a product or service, ethnographic practice must also consider the communication and maintenance of meaning within the corporate construct and with the entire product development team throughout the product development cycle. In theory, this “maintenance work” appears straightforward; however, in practice, this work is deceptively difficult due to the collective bias of homogeneous “in-groups.” When bias and self-interest align with local meaning, the problem is much smaller and therefore less salient. However, when bias and self-interest are at odds with or do not comprehend the local meaning of target markets, a benign, unintentional dissociation occurs and the meaning intended to be imbued in the product or service becomes lost. The result can be a product that misses its mark and essentially wastes time, resources and effort, along with the original ethnographic work that catalyzed quality product development at the start.

The propensity for dissociation of meaning occurs when a corporation has capabilities that could be of service to a group of people, yet the corporation is not comprised of that constituency. One example, used as the illustrative vehicle for this paper, is the design of computers for school children. Consider the following: Computing, in the general sense, can have meritorious benefits to children’s individual and collective educational experience and yet it is not widely prevalent in schools worldwide (especially in emerging markets). This is in large part because computers have not been designed for schools and therefore the technology offerings do not add value in classroom settings. Appropriate design for schools requires a deep and local understanding of what computing might mean in classrooms. However, groups of engineers and designers cannot collectively (or even for the most part, individually) understand this meaning on an experiential level for schools all around the world. Yet a few ethnographers can, due to the nature of their practice, training and skills. The question, then, is how do ethnographers convey, represent and sustain the local meanings of education as it is transmogrified in the design process? Ideally they will be able to prevent the dissociation of meaning referred to above.

We have been experimenting with short, differential uses of video-based data that make explicit the meaning of classroom practice as it is now or as it might be imagined with the new product. These “video utterances” attempt to make the local meaning – either direct or transformed – visible, significant and persistent with different constituencies at different times in the development process. These video utterances, coupled with oral and written accounts, create, convey and sustain ethnographic-based meaning with teams in a geographically distributed work unit at a large corporation.


Video is a natural choice for researchers to convey meaning to stakeholders in development. Video has been found to portray rich and accurate accounts of the field, inspire a diversity of perspectives, give real and concrete examples of users and capture testing sessions and focus groups (Buur, Binder and Brandt 2000; Brun-Cottan and Wall 1995). Commonly, video is either a document of primary data (usually in long unedited segments) or edited pieces that catalyze the dialogue between research, design and engineering (usually through longer documentaries or illustrative clips) (Faulkner 2007; Raijmakers, Gaver and Bishay 2006).

As video sound bites or “YouTube”-style shorts become more pervasive in daily life, video of this kind is also becoming increasingly prevalent in industry. Sunderland and Denny note that video “Ethno-bites” are in high demand due to their dramatic impact and quick implications. The attraction to using “movable, extractable video clips” may be at the expense of losing analytical frameworks and the ethnographic context. However, videos of any length, and ethnographic texts in general, are subject to the “uncertainty and instability of the interpretive meaning that the viewer bring to viewing” (Sunderland and Denny 2007: 266).

In a general sense, the successful construction of meaning from field videos is “dependent on the participation of actors, recorders, editors and viewers” (Buur, Binder and Brandt 2000: 1). A number of works have shown that video used in industry is highly interpretive, due in part to the richness of the data it contains. Thus, although it is a very powerful tool for representation, video presentation requires diligence to ensure that a development team is able to assess the right meanings in an efficient time frame. In co-viewing and engaging in dialogue, stakeholders can generate meanings together that are balanced as well as novel (Buur and Soendergaard 2000). Teams discussing video must take care not to create too many meanings, or ones that miss the mark. The researcher can also place the data in proper context and edit down complex raw footage to align with the time constraints of busy industry teams. The researcher must be vigilant about the representational burden of what to present and what not to present, and which edits will carry the themes from the research (Faulkner 2007).

As researchers, we have experienced the dangers of data losing its contextual meaning and confounding the research findings. However, in our experience, longer videos that retain more context work against the goals of quick-paced product development. Long pieces demand time that developers may not have and can distract when time-sensitive decisions need answers. As ethnographers, we want to retain the ethnographic texture, even when we are a part of a larger production schedule. We employ short, contained, often annotated clips to make convincing design arguments with concrete examples from the field. The short, pithy format, packed with punch and just a little pizzazz, can effectively convey layers of information superimposed in time, and the contextual relief for the data gathered.

Our main objective with video is to show an interpretation of user experience, and to use utterances to provide compelling evidence as key product decisions are made. In Latour’s work on visualization and cognition, he considers that the importance of visual representation lies in its effectiveness in argumentation. One who cannot represent through visualizations “loses the encounter. His [sic] fact does not hold” (1990: 16). In industry, it is vital that the facts from the field become relevant and meaningful for the stakeholders and decision-makers. The user experience must be sustained as evidence in debates over product definition; otherwise it loses to financial, legal, engineering or marketing pressures.

Videos have many advantageous qualities for argumentation. They are mobile, but also preserve their character across locations, platforms and browsers. Videos can also be flexibly modified in scale, are cheap to reproduce and spread (using intranets and the Internet). They can be integrated with other kinds of ethnographic texts and industry reports (Pea, Lindgren and Rosen 2006). Most importantly, for argumentation among disparate stakeholders, video can move quickly among groups, languages and geographies.

We do not rely solely on field video to make compelling arguments, but also on videos of reenactments and acting. Learning from performative ethnography (also known as Informances (Burns 1994) or Focus Troupes (Salvador and Howells 1998)), we use video to capture the recreations of ethnographic data. Informances have been used to explore design issues and generate new knowledge, allowing teams to go beyond simply calling upon research data to make decisions (Burns 1994; Nencel and Pels 1991). Our videos often contain elements of reenactments and dramatizations of meanings and (re)interpretations of designs based on the original ethnographic work. Thus, video has been able to capture the field, but it has also been able to record our many interpretations of meaning.

As video utterances are shared throughout the development group, they can make an evocative and convincing argument in a short time with minimum misinterpretation. We are finding that the video utterance, for both its argumentative and representational elements, can be the most effective ethnographic “text” in product development.


Raw video

Early in the development process, it is important for a work group to gain a deeply felt understanding of the audience for whom they are designing. As mentioned earlier in our education example, it is not feasible for everyone to visit schools. But it is feasible for the ethnographer to bring “the field”, with associated meaning and context (albeit more limited), to the engineer.

In our research, we found that classrooms are dynamic places. To illustrate some key values to the engineers, like noise level, we cut small clips of raw footage from classrooms. We then instructed the engineers to use a strategic eye to look at bodily movement and the volume level during a lesson. Within one handwriting lesson, students from China exhibited a binary contrast: when called upon by the teacher, the class together gave loud, animated responses, but at all other times their bodies were still and silent. By comparison, in more or less the same handwriting lesson in the United States, the students moved about in their chairs and the classroom had a constant mix of voices.


Classroom Noise: These two classrooms appear to be very similar. The students in China (left) and U.S. (right) are both practicing handwriting in the air before they use pencils and paper. However, when playing these two videos together, the developers take note of differences in volume and student self-discipline between the two classrooms.

In these clips, we wanted to illustrate just one element of the culture of schools worldwide. The videos conveyed important animation and audio that still pictures could not. Additionally, these one-minute videos were more descriptive and more complete than textual narratives. Just telling this story takes time, and lacks an immersion in the dynamic feel of the classroom.

In showing this clip, we risked viewers misinterpreting one important aspect of students in China and labeling them quieter and more disciplined than other students. Here, our videos needed additional scaffolding. The “silent clip” from China was complemented by research data of these students outside class. While the Chinese students were quiet and focused during lessons, the moment the class was over they began vibrant conversations. Outside of class, but still at school, students were extremely social and interactive with their peers.

The development team was able to conjecture that the laptop needed to serve in two distinct but important situations: first, it should serve as a classroom learning device performing in highly structured class time, and second, it should serve as a social device supporting exploration and collaboration with other students. Some physical features on the laptop needed to effectively support these activities. In a silent classroom, whether a typical Chinese classroom or a test-taking situation in the US, thirty laptops each making the inherent noise of a computer will overwhelm both the room and the single voice of the teacher. This video changed the perception of laptop noise and enabled design around quieter devices.

Noise level in the classroom is just one of many key observations that were able to come about because of the raw video footage obtained from classrooms. A readily noticeable activity in classrooms with laptops is the process of charging and storing the devices in a cabinet. To convey the need for the laptops to accommodate frequent storage and retrieval, we used video clips with the engineering teams. Here, pictures or narratives would not have conveyed the real time process of moving the machines to the cart and plugging each of them in individually. Moreover, we were able to simultaneously showcase the burden on the teacher and the misuse of class time by other students who were not involved in the lengthy process. It is interesting to note that we often use video to illustrate how much effort users put into tasks that should be easy to do. When we show these clips, the viewers become exasperated with the length of time that the process is taking, often much sooner than the teacher or student will.


Returning the laptops to the cart: Students in the classroom are left to their own devices while a few students and the teacher attempt to plug in and store each laptop.

Engineers were able to see that this process created new, transitional moments that add undesirable inefficiencies to an otherwise tightly scripted classroom environment. In order to support this frequent practice, certain features needed to be considered and designed appropriately. The integrated handle allowed students to carry the laptops effectively, while the location of the charging port, the length of the cord and the charging light indicator also came under review. This video was able to make salient the nuances of this common practice. The solution thus required several small, individual design changes, which together avoid the introduction of needless inefficiency in the classroom.


As the development process proceeds, decisions move from strategies and concepts to execution and details. The team must balance the demands of local meaning against dynamically unfolding business and engineering realities. Discussions about specific features and capabilities can be influenced by the meaning found in video utterances because they are comprised of the field research data. We have been able to use video to create new utterances that shape and convey the intended experiences for the product. Video from the field is purposed and edited, and sometimes recreated, to convey new meaning in a relevant context.

Research introducing technology prototypes yielded key user experience issues with the technology planned to support handwriting and touch usages. While we tested the technology with many students and teachers, the summary of data was not convincing until simple video presentation made the same points. In the first video, we juxtaposed two handwriting experiences: one, where the student struggled to write legibly, and the other, where the teacher was allowed to rest the side of her hand on the screen. The short video led the team to agree that a quality screen must recognize the pen tip but not the side of the hand. In the second video, we illustrated what the experience should be like, based on the positive experiences that we saw in the field. The videos permitted management and the engineering teams to immediately understand the proposed context of use and build a personal experience with touch screen usage which they could draw on in development. The videos secured handwriting technology as a key feature, leading the engineering and research team to work in partnership to evaluate several potential solutions.


Handwriting Utterances

Left: Video compares the quality of real-life handwriting experiences.

Right: Researcher demonstrates palm cancellation.

Another important feature of our product is designing for the inherent roughness of young student use. Surviving a drop off a desk or a toss on the floor in a backpack are values that were uncovered during research. The engineering standards for product drop testing were at odds with what it meant to be rugged in the classroom. An industry drop test outlines a unit falling from specific distances, at various angles, against specific flooring. The industry standard test is necessary, but from a user experience perspective, we needed an additional level of durability tied to enabling new usages, letting kids feel free to behave the way they do with other objects.

While in the field, researchers were only able to capture limited instances of rough use on video, but interviews with teachers, administrators and parents revealed instances and values associated with the importance of ruggedness in the classroom. The meaning that was generated from our field work needed to be captured and conveyed to the other team members, in a way that was as dramatic as the stories from the field. Since actual drops are rare but could be catastrophic, plausible reenactments based on stories from field research were helpful in communication. In this case, we used both researcher reenactments and student reenactments to show the laptop falling off chairs, flung out of hands, swung around in circles and catapulted off binders. Showing plausible situations in the field made a noticeable change in the value the development team placed on ruggedness, making the case that the device must exceed industry standards for notebook PCs.


Dropping Performances

Left: Researchers drop the laptop from adult height

Right: Elementary school student reenacts dropping scenarios

Text, Arrows, Circles, and Other Annotations

In the handwriting example above, you may notice the use of arrows, dual pictures and text to help illustrate the experience. Earlier in this paper we also referred to the burden placed on the researcher when deciding which edits represent the themes of the research. In some ways the multitude of editing choices can complicate the meaning of a video from the field, but for shorter pieces some editing techniques can provide important scaffolding. In our practice, we use small instances of video, edited with text overlays, arrows and other identifiers to indicate issues in the contextual relief that may be lost when cutting down the footage.

In one specific instance, we used a video to demonstrate the aspects of poor wireless networking in the classroom. In the first video, students had trouble accessing the wireless network. It showed students raising the computers over their heads, or walking all the way across the room, to get closer to the wireless access point. This video showed clearly how undesirable this situation was in the middle of a class. Arrows helped to identify which behaviors to look at, while text helped to explain what the students were doing and why. The development team could clearly see that a device that was meant for learning quickly became a vehicle for distraction.

Our second video highlights the networking problems that occurred when all students tried to print documents at once. Most printers built for office settings spool interspersed printing jobs. In a classroom, it is common to find all students printing at the end of a lesson. The printer stumbles over the many simultaneous requests and, in this case, stops working at a single request. This video follows the teacher as he tries to trace down the machine that is responsible for the printer jam. Watching this video in its entirety makes a substantial impact on the viewer, who can feel the frustration grow over fifteen minutes of real time. But we knew that this clip was much too long to engage the engineers. To illustrate the elapsed time, we used text panels, and considered superimposing the time code. These panels explained actions that might otherwise have been obscured by the time lapse where the video was not shown. Once the panels were included, the clip already produced the intended frustrating effect for the viewer, and adding the time code would have taken extra effort. Therefore it was left out.


Wireless Problem: Students try to connect to the wireless signal by moving their computers directly in front of the WAP.


Printer Problem: Text helps to explain the classroom activities when the networked printer overloads. The viewer experiences minutes of wasted class time on technical problems.

It was not only the poor performance, but the exact nature of the poor performance that mattered to the engineers. As the development process enters the testing stage, researchers can be present to see and understand problems and challenges in situ, but again, the engineers often do not have this luxury. Framed video helps to bring simple and poignant evidence of a problem to the engineering teams. To get the video out of inboxes, and the problem prioritized, we take into account that engineers often do not have time to consider videos in their daily work. We found that shorter videos and simple annotations increase the number of viewers and the impact of our research. Cases like this qualified the strategic focus around networking solutions. The videos encouraged a hardware and software shift for the following device iterations and the most current networking technology, such as 802.11n wireless LAN, was considered.

We have seen evidence that the meaning made in the field persists when it is captured by video. It is common to hear designers and engineers refer back to their experience with a topic (even in abstraction) that was captured on video and make locally relevant decisions. While the video is just a representation, the context and conversation around the video make it effective in embodying meaning, and also translating that meaning from the school to the corporation.


As we work with short video, we consider a question posed by Sunderland and Denny: “Will those videos that ‘close down debate’ become the standard over those longer pieces that ‘maintain the animation, dynamics of the lived experience’?” (2007: 268).

With a demanding product development schedule, the simple answer is yes. We hope that ethnographic video would simultaneously exist in longer formats that retain an ethnographic prose meant for inspiration and interpretation, at the same time that video utterances provide clear implications. The reason for this is simple: in the office the user experience can get lost along the product road map. Using video at every stage in development can be an incredibly powerful way to contextualize human behavior in the minds of the engineers, designers and management. In doing this, we realized that to maintain the ethnographic impact, the videos become shorter stories.

The nature of ethnographic work can influence decisions about strategies and specific product features. However, research exists as only one of many factors that influence decisions. Often in our work, hardware, software, marketing, legal, finance, design, and research gather together to define the product and each sways design in favor of particular high priority issues. Finance pushes for price; legal steers clear of litigation; and hardware and software ground us in the realities of available technology. In these debates, we have used our video utterances to bring the field to life and put the user in the forefront of the stakeholders’ minds. Powerful, short videos can help create an appropriate balance between conflicting demands. The clips are constant bridges back to the original intentions behind the product strategy. At many critical stages, video has helped ‘win the room’ of stakeholders and make critical product decisions with the user experience in mind.


We have used short video utterances to create new possibilities in our products, identify key features and capabilities, and illustrate issues found during research. By using video, instead of relying only on pictures and words, we found that the communication of key issues was clearer, arguments were settled more quickly, and team members were able to agree on what the local meaning meant for the product. Most importantly, using video with appropriate scaffolding kept the ethnographic work relevant at many stages of the product development, and for different stakeholders. In our past work, as in other studies, the communication of local meaning was restricted to ideation sessions at the beginning of the development, or to document testing at the end. Now, using video utterances, we were able to consistently address the issues as engineers or designers encountered them throughout the process. An utterance can convey local meaning in an appropriate and consumable way because of its short and self-contained nature.

Cataloguing and preparing video for this purpose takes time and resources. Therefore, we do not convert all the video from the field into a database of utterances. We cannot always predict what issues will be hotly debated, or what nuances from the field will be most important. Instead, we closely follow product team development and extract video from which relevant discussion can emerge, opportunities for new features are exposed, or patterns of behaviors can be addressed. We contain video in short clips, edit as needed, and discuss with appropriate members of the team. We found that by putting time into creating meaningful video clips, the development team ultimately saves time by making better decisions earlier and not having to revisit decisions as often. It has been our experience that team members communicate the meaning held in the video, and, in the end, develop a product that is better suited for the intended value propositions.

We contend that in corporate settings, video is fast becoming the “new text,” a mechanism for creating, conveying and sustaining meaning across constituencies and over time. While there remain significant methodological and practical issues around the time and effort required to produce and (re)edit video, we’re finding the tradeoff for communication and contextualization to be very promising.


In our future work, we hope to increase our use of video utterances and further document the impact it has on our team and the products. To increase our use we will be looking at new ways to efficiently create and disseminate video. With the popularity and ease of YouTube, we see a great potential for video utterances to be distributed in a similar fashion over local intranets. We will be looking into best practices for encouraging independent consumption of video material. It is our hope that we will be able to make video available to team members so that they can view, discuss, and refer to that local meaning with ease.

Meg Cramer is a user experience researcher at Intel, specializing in uses of video in research. She has training in sociology and radio/television/film studies from Northwestern University.

Mayank Sharma works in Intel’s People and Practices Research Group.

Tony Salvador directs product definition and development research for the Emerging Markets Platforms Group (EMPG) at Intel. Before being demoted to management, he was a Research Scientist and co-founder of Intel’s People & Practices Group.

Russell Beauregard is currently leading a user experience quality assessment program at Intel. He received his M.S. and Ph.D. degrees in Human Factors and Industrial Psychology from Wright State University, where he also taught research methods courses.


Brun-Cottan, F. and Wall, P.
1995 Using video to re-present the user. Communications of the ACM, 38(5): 61-71.

Burns, C., Dishman, E., Verplank, W., and Lassiter, B.
1994 Actors, Hairdos & Videotape – Informance Design. CHI’94 Conference Companion, ACM. Boston.

Buur, J., and Soendergaard, A.
2000 Video card game: an augmented environment for user centred design discussions. In Proceedings of DARE 2000 on Designing augmented reality environments, Elsinore, Denmark, 63-69.

Buur, J., Binder, T., and Brandt, E.
2000 Taking video beyond “hard data” in user centred design. Proceedings of PDC CPSR, New York.

Faulkner, Susan
2007 Real Reality TV: Using Documentary-Style Video to Place Real People at Center of the Design Process. Intel Technology Journal: Designing Technology with People in Mind. 11-21.

Latour, B.
1990 Drawing things together. Representation in Scientific Practice, MIT Press.

Nencel, Lorraine and Pels, Peter (editors)
1991 Constructing Knowledge: Authority and Critique in Social Science. London: Sage.

Pea, R., Lindgren, R., and Rosen, J.
2006 Computer-Supported Collaborative Video Analysis. Proceedings of the 7th International Conference on Learning Sciences. Indiana: International Society of Learning Sciences.

Raijmakers, B., Gaver, W.W., and Bishay, J.
2006 Designing Documentaries: Inspiring Design Research through Documentary Films. Proceedings of DIS 2006, University Park, PA: ACM Press.

Salvador, T. & Howells, K.
1998 Focus Troupe: Using drama to create common context for new product concept end-user evaluations. Proceedings of CHI ’98 Conference on Computer Human Interaction.

Sunderland, P.L., and Denny, R.M.
2007 Doing Anthropology in Consumer Research. Left Coast Press, Walnut Creek.