Shape and Viewpoint without Keypoints

Shubham Goel
Angjoo Kanazawa
Jitendra Malik
University of California, Berkeley
In ECCV, 2020


Given an image collection of an object category, such as birds, we propose a computational framework that takes a single image of an object and predicts its 3D shape, viewpoint and texture, without using any 3D shape, viewpoint or keypoint supervision during training. On the right, we show the input image and the results obtained by our method, rendered from multiple views.

We present a learning framework that learns to recover the 3D shape, pose and texture from a single image, trained on an image collection without any ground-truth 3D shape, multi-view, camera viewpoint or keypoint supervision. We approach this highly under-constrained problem in an "analysis by synthesis" framework, where the goal is to predict the likely shape, texture and camera viewpoint that could produce the image, using various learned category-specific priors. Our particular contribution in this paper is a representation of the distribution over cameras, which we call "camera-multiplex". Instead of picking a point estimate, we maintain a set of camera hypotheses that are optimized during training to best explain the image given the current shape and texture. We call our approach Unsupervised Category-Specific Mesh Reconstruction (U-CMR), and present qualitative and quantitative results on CUB, Pascal 3D and new web-scraped datasets. We obtain state-of-the-art camera prediction results and show that we can learn to predict diverse shapes and textures across objects using an image collection without any keypoint annotations or 3D ground truth.
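As a rough illustration of the camera-multiplex idea, the toy sketch below keeps several camera hypotheses for one image, refines each against a reconstruction loss, and then weights them by how well they explain the observation. Everything in it is an assumption made for illustration and is not taken from the U-CMR code: the camera is reduced to a single azimuth angle, the "renderer" is an orthographic projection of a point set, a chamfer distance stands in for the mask and texture losses, a simple local search stands in for gradient-based optimization, and the names `refine_multiplex` and `multiplex_weights` as well as the softmin weighting and its temperature are hypothetical.

```python
import numpy as np

def rotate_y(points, azimuth):
    """Rotate Nx3 points about the vertical (y) axis by `azimuth` radians."""
    c, s = np.cos(azimuth), np.sin(azimuth)
    R = np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])
    return points @ R.T

def project(points, azimuth):
    """Toy orthographic 'renderer': rotate, then drop the depth coordinate."""
    return rotate_y(points, azimuth)[:, :2]

def chamfer(a, b):
    """Symmetric chamfer distance between 2D point sets (no correspondences)."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return d.min(axis=1).mean() + d.min(axis=0).mean()

def refine_multiplex(shape, observed, azimuths, iters=30, step=0.2):
    """Refine every camera hypothesis independently with a shrinking local search.

    A candidate is accepted only if it lowers the loss, so each hypothesis's
    loss never increases.
    """
    azimuths = np.array(azimuths, dtype=float)
    losses = np.array([chamfer(project(shape, a), observed) for a in azimuths])
    for _ in range(iters):
        for k in range(len(azimuths)):
            for cand in (azimuths[k] - step, azimuths[k] + step):
                cand_loss = chamfer(project(shape, cand), observed)
                if cand_loss < losses[k]:
                    azimuths[k], losses[k] = cand, cand_loss
        step *= 0.7  # shrink the search radius over time
    return azimuths, losses

def multiplex_weights(losses, temperature=0.1):
    """Softmin over losses (assumed weighting): lower loss means higher weight."""
    z = -(losses - losses.min()) / temperature
    w = np.exp(z)
    return w / w.sum()

# Toy usage: one "image" produced by an unknown camera, eight hypotheses.
rng = np.random.default_rng(0)
shape = rng.normal(size=(30, 3))      # stand-in for the current 3D shape
observed = project(shape, 1.0)        # observation from the unknown azimuth
init = np.linspace(0.0, 2.0 * np.pi, 8, endpoint=False)
refined, losses = refine_multiplex(shape, observed, init)
weights = multiplex_weights(losses)
best = int(np.argmin(losses))
print(f"best azimuth: {refined[best]:.3f}, weight: {weights[best]:.3f}")
```

In this toy setting, hypotheses that start far from the true azimuth may stall in poor local minima, while one that starts nearby can settle close to it, which is the motivation for maintaining a set of hypotheses rather than a single point estimate.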


Goel, Kanazawa, Malik. Shape and Viewpoint without Keypoints. In ECCV, 2020.

[pdf] [arXiv] [bibtex]


Short Video

Long Video



  Now available on [GitHub]


We thank Jasmine Collins for scraping the Zappos shoes dataset and members of the BAIR community for helpful discussions. This work was supported in part by eBay, Stanford MURI and the DARPA MCS program.