Learning Scene Illumination by Pairwise Photos from Rear and Front Mobile Cameras
Dachuan Cheng
State Key Laboratory of Computer Science, Institute of Software, University of Chinese Academy of Sciences
Jian Shi
Institute of Automation, University of Chinese Academy of Sciences
Yanyun Chen
State Key Laboratory of Computer Science, Institute of Software, University of Chinese Academy of Sciences
Xiaoming Deng
Beijing Key Laboratory of Human Computer Interactions, Institute of Software, Chinese Academy of Sciences
Xiaopeng Zhang
Institute of Automation, University of Chinese Academy of Sciences
Abstract
Illumination estimation is an essential problem in computer vision, graphics, and augmented reality. In this paper, we propose a learning-based method to recover low-frequency scene illumination, represented by spherical harmonic (SH) functions, from paired photos taken by the rear and front cameras of a mobile device. An end-to-end deep convolutional neural network (CNN) is designed to process the two images, which observe the scene from symmetric views, and to predict the SH coefficients. We introduce a novel Render Loss that improves the rendering quality of the predicted illumination. A high-quality high dynamic range (HDR) panoramic image dataset was built for training and evaluation. Experiments show that our model produces visually and quantitatively superior results compared to state-of-the-art methods. Moreover, our method is practical for mobile-based applications.
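For readers who want a concrete picture of a render-based loss on SH lighting, the sketch below shows one way such a loss can be computed: a diffuse sphere is shaded with the Ramamoorthi-Hanrahan irradiance formula under both the predicted and the ground-truth second-order SH coefficients, and the squared difference between the two renderings is averaged. This is a minimal NumPy illustration under our own assumptions (function names, a 64x64 sphere, plain L2), not the paper's implementation.

```python
import numpy as np

# Ramamoorthi-Hanrahan constants for irradiance from second-order SH.
C = np.array([0.429043, 0.511664, 0.743125, 0.886227, 0.247708])

def irradiance(sh, n):
    """Irradiance at surface normal n = (x, y, z) from the 9 SH
    coefficients of one color channel (Ramamoorthi & Hanrahan, 2001)."""
    x, y, z = n
    L00, L1m1, L10, L11, L2m2, L2m1, L20, L21, L22 = sh
    return (C[3] * L00
            + 2.0 * C[1] * (L1m1 * y + L10 * z + L11 * x)
            + C[2] * L20 * z * z - C[4] * L20
            + C[0] * L22 * (x * x - y * y)
            + 2.0 * C[0] * (L2m2 * x * y + L2m1 * y * z + L21 * x * z))

def render_sphere(sh_rgb, res=64):
    """Render a diffuse unit sphere lit by 9x3 SH coefficients."""
    img = np.zeros((res, res, 3))
    u, v = np.meshgrid(np.linspace(-1, 1, res), np.linspace(-1, 1, res))
    mask = u ** 2 + v ** 2 <= 1.0                       # sphere silhouette
    w = np.sqrt(np.clip(1.0 - u ** 2 - v ** 2, 0.0, None))  # z toward viewer
    for c in range(3):
        img[..., c] = np.where(mask, irradiance(sh_rgb[:, c], (u, v, w)), 0.0)
    return img

def render_loss(sh_pred, sh_gt):
    """Mean squared difference between spheres rendered with the
    predicted and the ground-truth SH coefficients."""
    return np.mean((render_sphere(sh_pred) - render_sphere(sh_gt)) ** 2)
```

Because the rendering is linear, and hence differentiable, in the SH coefficients, the same construction can be reproduced inside a deep learning framework and back-propagated through during training.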