Self-Supervised Learning of Part Mobility from Point Cloud Sequence
Yahao Shi
State Key Laboratory of Virtual Reality Technology and Systems, School of Computer Science and Engineering, Beihang University, Beijing, China
Xinyu Cao
State Key Laboratory of Virtual Reality Technology and Systems, School of Computer Science and Engineering, Beihang University, Beijing, China
Bin Zhou (Corresponding Author)
State Key Laboratory of Virtual Reality Technology and Systems, School of Computer Science and Engineering, Beihang University, Beijing, China

Abstract
Part mobility analysis is a key aspect of achieving a functional understanding of 3D objects, and it is natural to obtain part mobility from the continuous part motion of 3D objects. In this study, we introduce a self-supervised method for segmenting motion parts and predicting their motion attributes from a point cloud sequence representing a dynamic object. To fully exploit the spatiotemporal information in the sequence, we generate trajectories by using correlations among successive frames instead of directly processing the point clouds. We propose a novel neural network architecture, called PointRNN, to learn feature representations of trajectories along with their part rigid motions. We evaluate our method on various tasks, including motion part segmentation, motion axis prediction, and motion range estimation. The results demonstrate that our method outperforms previous techniques on both synthetic and real datasets. Moreover, our method generalizes to new and unseen objects, and it requires no prior knowledge of shape structure, shape category, or shape orientation. To the best of our knowledge, this is the first deep-learning study to extract part mobility from the point cloud sequence of a dynamic object.
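The abstract mentions generating per-point trajectories from correlations among successive frames of the sequence. As a rough illustration of that idea only (the paper's actual correlation scheme is not specified here), the sketch below chains trajectories by nearest-neighbor correspondence between consecutive frames; the function name and the simple squared-distance matching are assumptions for illustration, not the authors' method.

```python
import numpy as np

def build_trajectories(frames):
    """Chain a 3D trajectory for each point of the first frame across a
    point cloud sequence, by nearest-neighbor matching between
    successive frames. `frames` is a list of (N, 3) arrays.

    Illustrative sketch only; a robust method would use richer
    spatiotemporal correlations than plain nearest neighbors.
    """
    trajs = [frames[0]]          # each trajectory starts at a point of frame 0
    current = frames[0]
    for nxt in frames[1:]:
        # squared distances between every tracked point and every point of
        # the next frame, via broadcasting: shape (N, N)
        d2 = ((current[:, None, :] - nxt[None, :, :]) ** 2).sum(-1)
        # follow each tracked point to its nearest neighbor in the next frame
        current = nxt[d2.argmin(axis=1)]
        trajs.append(current)
    # stack to (N, T, 3): one trajectory per point of the first frame
    return np.stack(trajs, axis=1)
```

Such trajectories, rather than the raw point clouds, would then be the input from which per-trajectory features and part rigid motions are learned.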