Towards Scaling Video Understanding: The quantity of video data is vast, yet our capabilities for visual recognition and understanding in videos lags significantly behind that for images. In this talk, I will first discuss some of the challenges of scale in labeling, modeling, and inference behind this gap. I will then present some of our recent work towards addressing these challenges, in particular using reinforcement learning-based formulations to tackle efficient inference in videos and learning classifiers from noisy web search results. Finally, I will conclude with discussion on future promising directions towards scaling video understanding.
Session Summary
Towards Scaling Video Understanding
MLconf 2017 Seattle
Serena Yeung
Stanford
PhD Student
Learn more »