When researchers asked hundreds of people to watch other people shake boxes, it took just seconds for almost all of them to figure out what the shaking was for.
The deceptively simple work by Johns Hopkins University perception researchers is the first to demonstrate that people can tell what others are trying to learn just by watching their actions. Published today in the journal Proceedings of the National Academy of Sciences, the study reveals a key yet neglected aspect of human cognition, and one with implications for artificial intelligence.
“Just by looking at how someone’s body is moving, you can tell what they are trying to learn about their environment,” said author Chaz Firestone, an assistant professor of psychological and brain sciences who investigates how vision and thought interact. “We do this all the time, but there has been very little research on it.”
Recognizing another person’s actions is something we do every day, whether it’s guessing which way someone is headed or figuring out what object they’re reaching for. These are known as “pragmatic actions.” Numerous studies have shown people can quickly and accurately identify these actions just by watching them. The new Johns Hopkins work investigates a different kind of behavior: “epistemic actions,” which are performed when someone is trying to learn something.
For instance, someone might put their foot in a swimming pool because they’re going for a swim, or they might put their foot in a pool to test the water. Though the actions are similar, there are differences, and the Johns Hopkins team surmised observers would be able to detect another person’s “epistemic goals” just by watching them.
Across several experiments, researchers asked a total of 500 participants to watch two videos in which someone picks up a box full of objects and shakes it around. One shows someone shaking a box to figure out the number of objects inside it. The other shows someone shaking a box to figure out the shape of the objects inside. Almost every participant knew who was shaking for the number and who was shaking for shape.
“What is surprising to me is how intuitive this is,” said lead author Sholei Croom, a Johns Hopkins graduate student. “People really can suss out what others are trying to figure out, which shows how we can make these judgments even though what we’re looking at is very noisy and changes from person to person.”
Added Firestone, “When you think about all the mental calculations someone must make to understand what someone else is trying to learn, it’s a remarkably complicated process. But our findings show it’s something people do easily.”
The findings could also inform the development of artificial intelligence systems designed to interact with humans. A commercial robot assistant, for example, could look at a customer and guess what they’re looking for.
“It’s one thing to know where someone is headed or what product they are reaching for,” Firestone said. “But it’s another thing to infer whether someone is lost or what kind of information they are seeking.”
In the future, the team would like to investigate whether people can distinguish someone’s epistemic intent from their pragmatic intent, that is, what they are up to when they dip their foot in the pool. They’re also interested in when these observational skills emerge in human development and whether it’s possible to build computational models that detail exactly how observed physical actions reveal epistemic intent.
The Johns Hopkins team also included Hanbei Zhou, a sophomore studying neuroscience.