Much of the world's data is in the form of visual media. To extract meaningful information from multimedia and deliver innovative products such as Google Photos, Google builds machine-learning systems designed to enable computer perception of visual input, and pursues image and video analysis techniques focused on image/scene reconstruction and understanding.
This week, Boston hosts the 2015 Conference on Computer Vision and Pattern Recognition (CVPR 2015), the premier annual computer vision event, comprising the main CVPR conference and several co-located workshops and short courses. As a leader in computer vision research, Google will have a strong presence at CVPR 2015, with many Googlers presenting publications and hosting workshops and tutorials on topics including image/video annotation and enhancement, 3D analysis and processing, development of semantic similarity measures for visual objects, synthesis of meaningful composites for visualization/browsing of large image/video collections, and more.
Learn more about some of our research in the list below (Googlers highlighted in blue). If you are attending CVPR this year, we hope you'll stop by our booth and chat with our researchers about the projects and opportunities at Google that go into solving interesting problems for hundreds of millions of people. Members of the Jump team will also have a prototype of the camera on display and will be showing videos produced using the Jump system on Google Cardboard.
Tutorials:
Applied Deep Learning for Computer Vision with Torch
Koray Kavukcuoglu, Ronan Collobert, Soumith Chintala
DIY Deep Learning: a Hands-On Tutorial with Caffe
Evan Shelhamer, Jeff Donahue, Yangqing Jia, Jonathan Long, Ross Girshick
ImageNet Large Scale Visual Recognition Challenge Tutorial
Olga Russakovsky, Jonathan Krause, Karen Simonyan, Yangqing Jia, Jia Deng, Alex Berg, Fei-Fei Li
Fast Image Processing With Halide
Jonathan Ragan-Kelley, Andrew Adams, Frédo Durand
Open Source Structure-from-Motion
Matt Leotta, Sameer Agarwal, Frank Dellaert, Pierre Moulon, Vincent Rabaud
Oral Sessions:
Modeling Local and Global Deformations in Deep Learning: Epitomic Convolution, Multiple Instance Learning, and Sliding Window Detection
George Papandreou, Iasonas Kokkinos, Pierre-André Savalle
Going Deeper with Convolutions
Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, Andrew Rabinovich
DynamicFusion: Reconstruction and Tracking of Non-Rigid Scenes in Real-Time
Richard A. Newcombe, Dieter Fox, Steven M. Seitz
Show and Tell: A Neural Image Caption Generator
Oriol Vinyals, Alexander Toshev, Samy Bengio, Dumitru Erhan
Long-Term Recurrent Convolutional Networks for Visual Recognition and Description
Jeffrey Donahue, Lisa Anne Hendricks, Sergio Guadarrama, Marcus Rohrbach, Subhashini Venugopalan, Kate Saenko, Trevor Darrell
Visual Vibrometry: Estimating Material Properties from Small Motion in Video
Abe Davis, Katherine L. Bouman, Justin G. Chen, Michael Rubinstein, Frédo Durand, William T. Freeman
Fast Bilateral-Space Stereo for Synthetic Defocus
Jonathan T. Barron, Andrew Adams, YiChang Shih, Carlos Hernández
Poster Sessions:
Learning Semantic Relationships for Better Action Retrieval in Images
Vignesh Ramanathan, Congcong Li, Jia Deng, Wei Han, Zhen Li, Kunlong Gu, Yang Song, Samy Bengio, Charles Rosenberg, Li Fei-Fei
FaceNet: A Unified Embedding for Face Recognition and Clustering
Florian Schroff, Dmitry Kalenichenko, James Philbin
A Mixed Bag of Emotions: Model, Predict, and Transfer Emotion Distributions
Kuan-Chuan Peng, Tsuhan Chen, Amir Sadovnik, Andrew C. Gallagher
Best-Buddies Similarity for Robust Template Matching
Tali Dekel, Shaul Oron, Michael Rubinstein, Shai Avidan, William T. Freeman
Articulated Motion Discovery Using Pairs of Trajectories
Luca Del Pero, Susanna Ricco, Rahul Sukthankar, Vittorio Ferrari
Reflection Removal Using Ghosting Cues
YiChang Shih, Dilip Krishnan, Frédo Durand, William T. Freeman
P3.5P: Pose Estimation with Unknown Focal Length
Changchang Wu
MatchNet: Unifying Feature and Metric Learning for Patch-Based Matching
Xufeng Han, Thomas Leung, Yangqing Jia, Rahul Sukthankar, Alexander C. Berg
Inferring 3D Layout of Building Facades from a Single Image
Jiyan Pan, Martial Hebert, Takeo Kanade
The Aperture Problem for Refractive Motion
Tianfan Xue, Hossein Mobahi, Frédo Durand, William T. Freeman
Video Magnification in Presence of Large Motions
Mohamed Elgharib, Mohamed Hefeeda, Frédo Durand, William T. Freeman
Robust Video Segment Proposals with Painless Occlusion Handling
Zhengyang Wu, Fuxin Li, Rahul Sukthankar, James M. Rehg
Ontological Supervision for Fine Grained Classification of Street View Storefronts
Yair Movshovitz-Attias, Qian Yu, Martin C. Stumpe, Vinay Shet, Sacha Arnoud, Liron Yatziv
VIP: Finding Important People in Images
Clint Solomon Mathialagan, Andrew C. Gallagher, Dhruv Batra
Fusing Subcategory Probabilities for Texture Classification
Yang Song, Weidong Cai, Qing Li, Fan Zhang
Beyond Short Snippets: Deep Networks for Video Classification
Joe Yue-Hei Ng, Matthew Hausknecht, Sudheendra Vijayanarasimhan, Oriol Vinyals, Rajat Monga, George Toderici
Workshops:
THUMOS Challenge 2015
Program Organizers include: Alexander Gorban, Rahul Sukthankar
DeepVision: Deep Learning in Computer Vision 2015
Invited Speaker: Rahul Sukthankar
Large Scale Visual Commerce (LSVisCom)
Panelist: Luc Vincent
Large-Scale Video Search and Mining (LSVSM)
Invited Speaker and Panelist: Rahul Sukthankar
Program Committee includes: Apostol Natsev
Vision meets Cognition: Functionality, Physics, Intentionality and Causality
Program Organizers include: Peter Battaglia
Big Data Meets Computer Vision: 3rd International Workshop on Large Scale Visual Recognition and Retrieval (BigVision 2015)
Program Organizers include: Samy Bengio
Speakers include: Christian Szegedy, "Scalable approaches for large scale vision"
Observing and Understanding Hands in Action (Hands 2015)
Program Committee includes: Murphy Stein
Fine-Grained Visual Categorization (FGVC3)
Program Organizers include: Anelia Angelova
Large-scale Scene Understanding Challenge (LSUN)
Winners of the Scene Classification Challenge: Julian Ibarz, Christian Szegedy and Vincent Vanhoucke
Winners of the Caption Generation Challenge: Oriol Vinyals, Alexander Toshev, Samy Bengio, and Dumitru Erhan
Looking from above: when Earth observation meets vision (EARTHVISION)
Technical Committee includes: Andreas Wendel
Computer Vision in Vehicle Technology: Assisted Driving, Exploration Rovers, Aerial and Underwater Vehicles
Invited Speaker: Andreas Wendel
Program Committee includes: Andreas Wendel
Women in Computer Vision (WiCV)
Invited Speaker: Mei Han
ChaLearn Looking at People (sponsor)
Fine-Grained Visual Categorization (FGVC3) (sponsor)