In this paper, we present a matrix assembly technique for arbitrary polynomial order finite element simulations on simplex meshes for graphics processing units (GPU). Compared to the current state of the art in GPU-based matrix assembly, we avoid the need for an intermediate sparse matrix and perform assembly directly into the final, GPU-optimized data structure. Thereby, we avoid the resulting 180% to 600% memory overhead, depending on polynomial order, and associated allocation time, while simplifying the assembly code and using a more compact mesh representation. We compare our method with existing algorithms and demonstrate significant speedups.

Supporting Information

References

Anzt H., Dongarra J., Flegar G., Quintana-Ortí E. S.: Batched Gauss-Jordan elimination for block-Jacobi preconditioner generation on GPUs. In Proceedings of the 8th International Workshop on Programming Models and Applications for Multicores and Manycores (2017), PMAM'17, pp. 1–10. doi:10.1145/3026937.3026940. 6
Google Scholar
Bender J., Deul C.: Adaptive cloth simulation using corotational finite elements. Computers & Graphics 37, 7 (2013), 820–829. doi:10.1016/j.cag.2013.04.008. 2
10.1016/j.cag.2013.04.008
Web of Science® Google Scholar
Babuška I., Guo B. Q.: The h, p and h-p version of the finite element method: Basis theory and applications. Advances in Engineering Software 15, 3–4 (1992), 159–174. doi:10.1016/0965-9978(92)90097-Y. 10
10.1016/0965‐9978(92)90097‐Y
Web of Science® Google Scholar
Bell N., Hoberock J.: Thrust 1.8.1, Mar. 2015. URL: https://thrust.github.io/. 5, 6
Google Scholar
Brooks Jr. F. P.: The Mythical Man-month, anniversary ed. Addison-Wesley Longman Publishing Co., Inc., 1995. ISBN 0-201-83595-9. 2
Google Scholar
CGNS Steering Committee: CGNS standard interface data structures 3.3 (rev 2), 2018. Retrieved 2018-06-08. URL: https://cgns.github.io/CGNS_docs_current/sids/sids.pdf. 4
Google Scholar
Cecka C., Lew A. J., Darve E.: Assembly of finite element methods on graphics processors. International Journal for Numerical Methods in Engineering 85, 5 (2010), 640–669. doi:10.1002/nme.2989. 3
10.1002/nme.2989
Web of Science® Google Scholar
Davis T.: SuiteSparse: A suite of sparse matrix packages, 2018. Retrieved 2018-06-08. URL: http://faculty.cse.tamu.edu/davis/suitesparse.html. 8
Google Scholar
Di Pietro D. A., Ern A.: Mathematical Aspects of Discontinuous Galerkin Methods. Springer Berlin Heidelberg, 2012. doi: 10.1007/978-3-642-22980-0. 4
10.1007/978‐3‐642‐22980‐0
Google Scholar
Guo X., Lange M., Gorman G., Mitchell L., Weiland M.: Developing a scalable hybrid MPI/OpenMP unstructured finite element model. Computers & Fluids 110 (2015), 227–234. doi: 10.1016/j.compfluid.2014.09.007. 2
10.1016/j.compfluid.2014.09.007
Web of Science® Google Scholar
Grinspun E.: The Basis Refinement Method. PhD thesis, California Institute of Technology, 2003. URL: https://thesis.library.caltech.edu/2324/. 10
Google Scholar
Hatcher A.: Algebraic Topology. Cambridge University Pr., 2002. ISBN 0-521-79540-0. 2, 3
Google Scholar
Hu Y., Zhou Q., Gao X., Jacobson A., Zorin D., Panozzo D.: Tetrahedral meshing in the wild. ACM Transactions on Graphics 37, 4 (2018), 60:1–60:14. doi:10.1145/3197517.3201353. 2
10.1145/3197517.3201353
Web of Science® Google Scholar
Irving G., Schroeder C., Fedkiw R.: Volume conserving finite element simulations of deformable models. ACM Transactions on Graphics 26, 3 (2007), 13:1–13:6. doi:10.1145/1276377.1276394. 2
10.1145/1276377.1276394
Web of Science® Google Scholar
Koschier D., Lipponer S., Bender J.: Adaptive tetrahedral meshes for brittle fracture simulation. In Proceedings of the 2014 ACM SIGGRAPH/Eurographics Symposium on Computer Animation (2014), SCA ‘14, pp. 57–66. 2
Google Scholar
Komatitsch D., Michéa D., Erlebacher G.: Porting a high-order finite-element earthquake modeling application to NVIDIA graphics cards using CUDA. Journal of Parallel and Distributed Computing 69, 5 (2009), 451–460. doi:10.1016/j.jpdc.2009.01.006. 2
10.1016/j.jpdc.2009.01.006
Web of Science® Google Scholar
Kranz J.: Methodik und Richtlinien für die Konstruktion von laseradditiv gefertigten Leichtbaustrukturen. Light Engineering für die Praxis. Springer Vieweg, 2017. doi:10.1007/978-3-662-55339-8. 1
10.1007/978‐3‐662‐55339‐8
Google Scholar
Liu Y., Jiao S., Wu W., De S.: GPU accelerated fast FEM deformation simulation. In 2008 IEEE Asia Pacific Conference on Circuits and Systems (APCCAS) (2008), pp. 606–609. doi:10.1109/APCCAS.2008.4746096. 1
Google Scholar
Liu W., Vinter B.: A framework for general sparse matrixmatrix multiplication on GPUs and heterogeneous processors. Journal of Parallel and Distributed Computing 85 (2015), 47–61. doi:10.1016/j.jpdc.2015.06.010. 3, 7, 10
10.1016/j.jpdc.2015.06.010
Web of Science® Google Scholar
Mueller-Roemer J. S., Altenhofen C., Stork A.: Ternary sparse matrix representation for volumetric mesh subdivision and processing on GPUs. Computer Graphics Forum 36, 5 (2017), 59–69. doi:10.1111/cgf.13245. 2, 5
10.1111/cgf.13245
Web of Science® Google Scholar
Merill D.: CUB 1.8.0, Feb. 2018. URL: https://nvlabs.github.io/cub/. 5, 6
Google Scholar
Müller M., Gross M.: Interactive virtual materials. In Proceedings of Graphics Interface 2004 (2004), GI ‘04, pp. 239–246. 2
Google Scholar
NVIDIA Corporation: CUDA C programming guide, May 2018. Version 9.2. URL: http://docs.nvidia.com/cuda/pdf/CUDA_C_Programming_Guide.pdf. 5
Google Scholar
Reguly I. Z., Giles M. B.: Finite element algorithms and data structures on graphical processing units. International Journal of Parallel Programming 43, 2 (2015), 203–239. doi:10.1007/s10766-013-0301-6. 3
10.1007/s10766‐013‐0301‐6
Web of Science® Google Scholar
Thébault L., Petit E., Dinh Q.: Scalable and efficient implementation of 3D unstructured meshes computation: A case study on matrix assembly. In Proceedings of the 20th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (2015), PPoPP 2015, pp. 120–129. doi:10.1145/2688500.2688517. 3
Google Scholar
Vázquez F., Ortega G., Fernández J. J., Garzón E. M.: Improving the performance of the sparse matrix vector product with GPUs. In Proceedings of the 2010 10th IEEE International Conference on Computer and Information Technology (Bradford, United Kingdom, 2010), CIT ‘10, pp. 1146–1151. doi:10.1109/CIT.2010.208. 6
Google Scholar
Weber D., Bender J., Schnoes M., Stork A., Fellner D.: Efficient GPU data structures and methods to solve sparse linear systems in dynamics applications. Comput. Graphics Forum 32, 1 (2013), 16–26. doi:10.1111/j.1467-8659.2012.03227.x. 2, 5, 10
10.1111/j.1467-8659.2012.03227.x
PubMed Web of Science® Google Scholar
Willberg C., Duczek S., Vivar Perez J. M., Schmicker D., Gabbert U.: Comparison of different higher order finite element schemes for the simulation of Lamb waves. Computer Methods in Applied Mechanics and Engineering 241–244 (2012), 246–261. doi:10.1016/j.cma.2012.06.011. 3
10.1016/j.cma.2012.06.011
Web of Science® Google Scholar
Weber D., Kalbe T., Stork A., Fellner D., Goesele M.: Interactive deformable models with quadratic bases in Bernstein-Bézier-form. The Visual Computer 27, 6 (2011), 473–483. doi:10.1007/s00371-011-0579-6. 2, 3
10.1007/s00371‐011‐0579‐6
Web of Science® Google Scholar
Weber D., Mueller-Roemer J. S., Altenhofen C., Stork A., Fellner D.: Deformation simulation using cubic finite elements and efficient p-multigrid methods. Computers & Graphics 53, 2 (2015), 185–195. doi:10.1016/j.cag.2015.06.010. 3, 7
10.1016/j.cag.2015.06.010
Web of Science® Google Scholar
Zhu B., Lee M., Quigley E., Fedkiw R.: Codimensional non-Newtonian fluids. ACM Transactions on Graphics 34, 4 (2015), 115:1–115:9. doi:10.1145/2766981. 10
10.1145/2766981
Web of Science® Google Scholar
Zhu B., Quigley E., Cong M., Solomon J., Fedkiw R.: Codimensional surface tension flow on simplicial complexes. ACM Transactions on Graphics 33, 4 (2014), 111:1–111:11. doi:10.1145/2601097.2601201. 10
10.1145/2601097.2601201
Web of Science® Google Scholar
Zayer R., Steinberger M., Seidel H.: A GPU-adapted structure for unstructured grids. Computer Graphics Forum 36, 2 (2017), 495–507. doi:10.1111/cgf.13144. 2, 3, 5, 7, 10
10.1111/cgf.13144
Web of Science® Google Scholar
Zayer R., Steinberger M., Seidel H.: Sparse matrix assembly on the GPU through multiplication patterns. In 2017 IEEE High Performance Extreme Computing Conference (HPEC) (2017), pp. 1–8. doi:10.1109/HPEC.2017.8091057. 3, 7, 8, 9, 10
Google Scholar
Zienkiewicz O. C., Taylor R. L.: Finite Element Method: Volume 1, fifth ed. Butterworth-Heinemann, 2000. ISBN 0-750-65049-4. 3
10.1002/(SICI)1097-0207(20000110/30)47:1/3<9::AID-NME793>3.0.CO;2-P
Web of Science® Google Scholar

Citing Literature

Volume37, Issue7

October 2018

Pages 443-454

GPU-based Polynomial Finite Element Matrix Assembly for Simplex Meshes

Abstract

Supporting Information

References

Citing Literature

References

Information

About Wiley Online Library

Help & Support

Opportunities

Connect with Wiley

GPU-based Polynomial Finite Element Matrix Assembly for Simplex Meshes

Abstract

Supporting Information

References

Citing Literature

References

Related

Information