
Human Activity Computing from Inside-out and Outside-in Visual Data

Presenter: Hyun Soo Park
April 10, 2018
Abstract
We are witnessing a revolution in artificial intelligence that, fueled by large-scale data, deeply permeates our lives. However, such AI systems still cannot observe and interpret underlying mental states such as intent, emotion, and attention, while nearly any three-year-old can effortlessly read the meaning of a simple nod, eye contact, or a pointed finger. What makes the three-year-old, and the rest of us, so different from these AI systems? My conjecture is that the ability to discern microscopic behavioral signals is key, and in this talk, I will present two ways to measure such human activities: leveraging first-person and third-person visual data. Human perception is best captured in first-person videos because they naturally follow the visual attention of their wearers. This provides a detailed description of physical and social interactions with surrounding objects, people, and scenes. I will show that it is possible to uncover the underlying states that govern these interactions, e.g., control force and joint attention. Human body signals are better measured by third-person videos, which convey the global context of the interactions, e.g., the spatial relation of objects to face, body, and finger movements. My team has been developing a computational model that reconstructs human activities in 3D at unprecedented resolution by leveraging a large number of third-person videos. In the end, I will argue that these two visual measurements are complementary, which will produce a powerful tool for analyzing human behavior.

Biography

Hyun Soo Park is an Assistant Professor in the Department of Computer Science and Engineering at the University of Minnesota. He is interested in understanding human visual sensorimotor behaviors from visual data. Prior to joining UMN, he was a Postdoctoral Fellow in the GRASP Lab at the University of Pennsylvania. He earned his Ph.D. from Carnegie Mellon University.