Deep learning for structured image understanding

Dr. Jonathan Huang (Google AI)
Abstract: Over the last few years, we have made huge leaps in our ability to automatically understand images in a structured way. I will discuss some of these recent advances in the context of core structured computer vision tasks, including object detection, segmentation, and pose estimation. Part I is an introductory tutorial on modern deep learning methods, focusing primarily on object detection but also touching on semantic and instance segmentation as well as pose estimation. I will assume a basic familiarity with neural networks and will cover recent convolutional architectures, loss functions, and evaluation metrics, as well as practical tips and tricks for making these models "work" well. In Part II, I will give a tour of recent research projects that push the envelope in this field (mostly by my colleagues and me at Google). Topics will include (1) efficient inference, (2) weakly supervised learning, and (3) temporal models for video.