Roominoes: Generating Novel 3D Floor Plans From Existing 3D Rooms
Abstract
Realistic 3D indoor scene datasets have enabled significant recent progress in computer vision, scene understanding, autonomous navigation, and 3D reconstruction. But the scale, diversity, and customizability of existing datasets are limited, and it is time-consuming and expensive to scan and annotate more. Fortunately, combinatorics is on our side: there are enough individual rooms in existing 3D scene datasets, if only there were a way to recombine them into new layouts. In this paper, we propose the task of generating novel 3D floor plans from existing 3D rooms. We identify three sub-tasks of this problem: generation of a 2D layout, retrieval of compatible 3D rooms, and deformation of 3D rooms to fit the layout. We then discuss different strategies for solving the problem, and design two representative pipelines: one uses available 2D floor plans to guide selection and deformation of 3D rooms; the other learns to retrieve a set of compatible 3D rooms and combine them into novel layouts. We design a set of metrics that evaluate the generated results with respect to each of the three sub-tasks and show that different methods trade off performance on these sub-tasks. Finally, we survey downstream tasks that benefit from generated 3D scenes and discuss strategies for selecting the methods most appropriate for the demands of these tasks.
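To make the three sub-tasks concrete, the following is a minimal toy sketch and not the pipelines proposed in the paper. It assumes the 2D layout is already given as axis-aligned rectangles, retrieves the database room whose footprint needs the least non-uniform stretching, and deforms it by simple per-axis scaling, which stands in for a real deformation method. All names (`Rect`, `distortion`, `retrieve`, `deform`, the `scan_*` rooms) are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned footprint: lower-left corner (x, y) plus width and depth."""
    x: float
    y: float
    w: float
    d: float


def distortion(room: Rect, slot: Rect) -> float:
    """Non-uniformity of the per-axis scaling that fits room into slot (0 = uniform)."""
    sx, sy = slot.w / room.w, slot.d / room.d
    return max(sx, sy) / min(sx, sy) - 1.0


def retrieve(slot: Rect, database: dict[str, Rect]) -> str:
    """Retrieval: pick the database room that needs the least non-uniform stretching."""
    return min(database, key=lambda name: distortion(database[name], slot))


def deform(points, room: Rect, slot: Rect):
    """Deformation: map room-space (x, y) points into the slot by per-axis scaling."""
    sx, sy = slot.w / room.w, slot.d / room.d
    return [(slot.x + (px - room.x) * sx, slot.y + (py - room.y) * sy)
            for px, py in points]


# Hypothetical inputs: a given 2D layout (sub-task 1 assumed solved) and a room database.
layout = {"bedroom": Rect(0, 0, 4, 3), "kitchen": Rect(4, 0, 2, 3)}
database = {
    "scan_a": Rect(0, 0, 8, 6),
    "scan_b": Rect(0, 0, 3, 5),
    "scan_c": Rect(0, 0, 2, 2),
}

for label, slot in layout.items():
    name = retrieve(slot, database)
    room = database[name]
    corners = [(room.x, room.y), (room.x + room.w, room.y + room.d)]
    print(label, name, deform(corners, room, slot))
```

In this toy setting the trade-off mentioned in the abstract is visible directly: a retrieval that tolerates higher `distortion` widens the pool of usable rooms but demands more aggressive deformation.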
Supporting Information
Filename | Size | Description |
---|---|---|
cgf14357-sup-0001-S1.mp4 | 74.6 KB | Supporting Information |
cgf14357-sup-0002-S1.mp4 | 87.7 KB | Supporting Information |
cgf14357-sup-0003-S1.mp4 | 83.9 KB | Supporting Information |
cgf14357-sup-0004-S1.mp4 | 45.3 MB | Supporting Information |
cgf14357-sup-0005-S1.pdf | 3.8 MB | Supporting Information |