Advancing the Value of Ethnography

You too can collect big data!


EPIC2014 Workshop by Anna Avrekh, Kathy Baxter, & Bob Evans

At the EPIC 2013 Keynote, Tricia Wang observed that, if you are not working with “Big Data,” the implication is that your data are “small.” Although the number of data points or participants may not be in the millions or even thousands, the data we gather are actually far richer. As our community knows, web analytics or logs can tell us WHAT people are doing but never WHY. We may attempt to infer the WHY from what we see, but unless we ask our users why they are doing something that we have recorded (with or without their knowledge), we can never know for sure.

Later in the conference, I hosted a Salon on “Big Data” with discussants Jens Riegelsberger (Google) and Todd Cherkasky (SapientNitro). The interest in the salon far exceeded the space available. One key theme that emerged was the participants’ desire to learn how to incorporate “Big Data” into their work. Few of them had the means to pull logs and do deep statistical analysis on them. This makes it extremely difficult to pair the WHAT that others are collecting with the WHY that they themselves are observing. I realized the community might be interested in a methodology we have been using for the last three years at Google called “Experience Sampling Methodology” (ESM), which combines the best of both worlds in a scalable manner. With the help of a mobile app called the “Personal Analytics Companion” (PACO), created by Bob Evans, we have been able to conduct large-scale ESM studies that have the richness of diary studies, the frequency of measurement and context of field studies, and the scale of small online experiments.

In our workshop, we discussed the background of ESM research; issues of validity, reliability, and bias; best practices for conducting a large-scale ESM study; and how to analyze the data. We finished with a hands-on exercise where participants built an experiment themselves.

OK, so what is an ESM study anyway? ESM asks participants at random points throughout their day about their experiences in the moment. Using a tool like PACO, participants are pinged randomly on their Android phone or iPhone 5-8 times per day to record what they are doing at that moment and describe their experience (e.g., satisfaction, where they are, what tools they are using). They can also share photos with us if they feel it will help us understand what they are reporting. If participants give us permission, we can even connect the logs from their phone with the qualitative data they are reporting. Responding to each ping takes participants a few seconds to a couple of minutes, making this a very lightweight data collection method. It is not as intrusive as following a participant around all day for several days, and we know from past research that participants acclimate to the pings after only a couple of days.
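For readers who like to see the mechanics, here is a minimal sketch of how random pings for one day could be generated. It is not PACO's actual scheduling code; the function name `schedule_pings`, the 9:00 to 21:00 window in the usage example, and the 30-minute minimum gap are all assumptions made for illustration. Only the 5-8 pings per day comes from the description above.

```python
import random
from datetime import datetime, timedelta


def schedule_pings(day_start, day_end, min_pings=5, max_pings=8,
                   min_gap_minutes=30, rng=None):
    """Return a sorted list of random ping times within [day_start, day_end].

    Hypothetical helper for illustration only: the minimum gap between
    pings is an assumption, not something PACO is known to enforce.
    """
    rng = rng or random.Random()
    n = rng.randint(min_pings, max_pings)  # 5-8 pings for the day

    window = int((day_end - day_start).total_seconds() // 60)
    slack = window - (n - 1) * min_gap_minutes
    if slack < 0:
        raise ValueError("window too short for the requested pings and gap")

    # Draw n sorted offsets inside the slack, then add back i * gap so that
    # consecutive pings are always at least min_gap_minutes apart.
    offsets = sorted(rng.randint(0, slack) for _ in range(n))
    return [day_start + timedelta(minutes=off + i * min_gap_minutes)
            for i, off in enumerate(offsets)]


if __name__ == "__main__":
    start = datetime(2014, 9, 8, 9, 0)   # assumed start of the waking window
    end = datetime(2014, 9, 8, 21, 0)    # assumed end of the waking window
    for ping in schedule_pings(start, end):
        print(ping.strftime("%H:%M"))
```

Spacing the pings out is a design choice in this sketch: purely uniform draws can land two pings minutes apart, which participants tend to find annoying, so the gap is added back after sampling rather than by rejecting and redrawing.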

At the end of each day, we provide participants with a form that shows everything they reported that day, including any photos. We then ask some additional questions to further understand their experience (e.g., Did you complete your task?  What are all the ways you looked for that information today?).

By doing this for 5-7 days, we can get a detailed look into the lives of hundreds, if not thousands, of participants. In the last ESM study the Search research team conducted, we collected data from 1,200 participants across 47 US states over a three-month period. We measured 186 variables, resulting in 4.756 million cells! It is important to note (if it wasn’t clear already) that all of this is done with the participants’ consent. They can see a dashboard of their own data and download it, which participants have told us is fascinating because it makes them aware of behaviors or patterns they hadn’t been cognizant of before.

If you’d like to try out a study for yourself as a participant, you can download PACO for free from Google Play or the Apple App Store. Search for experiments in the app and you will find many public experiments running at any given time. If you’d like to use PACO for your own research, you can: PACO is open source, so if you’re a developer, you can have a blast customizing the tool for your needs!



Kathy Baxter, Google