Parallelization of group-based skyline computation for multi-core processors
Corresponding Author
Haoyang Zhu
College of Computer, National University of Defense Technology, Changsha, Hunan, China
Correspondence
Haoyang Zhu, College of Computer, National University of Defense Technology, Changsha, Hunan, China.
Email: [email protected]
Search for more papers by this authorPeidong Zhu
College of Computer, National University of Defense Technology, Changsha, Hunan, China
Search for more papers by this authorXiaoyong Li
College of Computer, National University of Defense Technology, Changsha, Hunan, China
Academy of Ocean Science and Engineering, National University of Defense Technology, Changsha, Hunan, China
Search for more papers by this authorQiang Liu
College of Computer, National University of Defense Technology, Changsha, Hunan, China
Search for more papers by this authorPeng Xun
College of Computer, National University of Defense Technology, Changsha, Hunan, China
Search for more papers by this authorCorresponding Author
Haoyang Zhu
College of Computer, National University of Defense Technology, Changsha, Hunan, China
Correspondence
Haoyang Zhu, College of Computer, National University of Defense Technology, Changsha, Hunan, China.
Email: [email protected]
Search for more papers by this authorPeidong Zhu
College of Computer, National University of Defense Technology, Changsha, Hunan, China
Search for more papers by this authorXiaoyong Li
College of Computer, National University of Defense Technology, Changsha, Hunan, China
Academy of Ocean Science and Engineering, National University of Defense Technology, Changsha, Hunan, China
Search for more papers by this authorQiang Liu
College of Computer, National University of Defense Technology, Changsha, Hunan, China
Search for more papers by this authorPeng Xun
College of Computer, National University of Defense Technology, Changsha, Hunan, China
Search for more papers by this authorSummary
Skyline computation is particularly useful in multi-criteria decision-making applications. However, it is inadequate to answer queries that need to analyze not only individual points but also groups of points. Compared to the traditional skyline computation, computing group-based skyline is much more complicated and expensive. This computational challenge promotes us to use modern computing platforms to accelerate the computation. In this paper, we introduce a novel multi-core algorithm to compute group-based skyline. We first compute the skyline layers of a data set in parallel, which are a critical intermediate result. In the algorithm, we maintain an efficiently updatable data structure for the shared global skyline layers, which is used to minimize dominance tests and maintain high throughput. Then we design an efficient parallel algorithm to find group-based skyline based on the skyline layers. Extensive experimental results on real and synthetic data sets show that our algorithms achieve 10-fold speedup with 16 parallel threads over state-of-the-art sequential algorithms on challenging workloads.
REFERENCES
- 1Börzsönyi S, Kossmann D, Stocker K. The skyline operator. In: Proceedings of the 17th International Conference on Data Engineering; April 2–6, 2001; Heidelberg, Germany: 421-430.
- 2Im H, Park S. Group skyline computation. Inf Sci. 2012; 188: 151-169.
- 3Li C, Zhang N, Hassan N, Rajasekaran S, Das G. On skyline groups. In: 21st ACM International Conference on Information and Knowledge Management, CIKM'12; October 29–November 02, 2012; Maui, Hi, USA: 2119-2123.
- 4Chung YC, Su IF, Lee C. Efficient computation of combinatorial skyline queries. Inf Syst. 2013; 38(3): 369-387.
- 5Liu J, Xiong L, Pei J, Luo J, Zhang H. Finding Pareto optimal groups: group-based skyline. PVLDB. 2015; 8(13): 2086-2097.
- 6Zhang N, Li C, Hassan N, Rajasekaran S, Das G. On skyline groups. IEEE Trans Knowl Data Eng. 2014; 26(4): 942-956.
- 7Bøgh KS, Assent I, Magnani M. Efficient gpu-based skyline computation. In: Proceedings of the Ninth International Workshop on Data Management on New Hardware, Damon 1013; June 24, 2013; New York, NY, USA: 5.
- 8Choi W, Liu L, Yu B. Multi-criteria decision making with skyline computation. In: IEEE 13th International Conference on Information Reuse & Integration, IRI 2012; August 8, 2012; Las Vegas, NV, USA: 316-323.
- 9Mullesgaard K, Pederseny JL, Lu H, Zhou Y. Efficient skyline computation in MapReduce. In: Proceedings of the 17th International Conference on Extending Database Technology, EDBT 2014; March 24–28, 2014; Athens, Greece: 37-48.
- 10Park Y, Min JK, Shim K. Parallel computation of skyline and reverse skyline queries using MapReduce. PVLDB. 2013; 6(14): 2002-2013.
- 11Zhang B, Zhou S, Guan J. Adapting skyline computation to the MapReduce framework: algorithms and experiments. In: Database Systems for Adanced Applications - 16th International Conference, DASFAA 2011, International Workshops: GDB, SIM3, FlashDB, SNSMW, DaMEN, DQIS. Proceedings; April 22–25, 2011; Hong Kong, China: 403-414.
- 12Li X, Wang Y, Li X, Wang X, Yu J. GDPS: an efficient approach for skyline queries over distributed uncertain data. Big Data Res. 2014; 1: 23-36.
10.1016/j.bdr.2014.07.003 Google Scholar
- 13Li X, Wang Y, Li X, Wang Y. Parallel skyline queries over uncertain data streams in cloud computing environments. IJWGS. 2014; 10(1): 24-53.
- 14Chester S, Sidlauskas D, Assent I, Bøgh KS. Scalable parallelization of skyline computation for multi-core processors. In: 31st IEEE International Conference on Data Engineering, ICDE 2015; April 13–17, 2015; Seoul, South Korea: 1083-1094.
- 15Park S, Kim T, Park J, Kim J, Im H. Parallel skyline computation on multicore architectures. In: Proceedings of the 25th International Conference on Data Engineering, ICDE 2009; March 29–April 2, 2009; Shanghai, China: 760-771.
- 16Liknes S, Vlachou A, Doulkeridis C, Nørvåg K. Apskyline: improved skyline computation for multicore architectures. In: Database Systems for Advanced Applications - 19th International Conference, DASFAA 2014. Proceedings, Part I; April 21–24, 2014; Bali, Indonesia: 312-326.
- 17Cho S, Lee J, Hwang S, Han H, Lee S. Vskyline: vectorization for efficient skyline computation. SIGMOD Record. 2010; 39(2): 19-26.
- 18Kossmann D, Ramsak F, Rost S. Shooting stars in the sky: an online algorithm for skyline queries. In: VLDB 2002, Proceedings of 28th International Conference on Very Large Data Bases; August 20–23, 2002; Hong Kong, China: 275-286.
- 19Lee KCK, Zheng B, Li H, Lee W. Approaching the skyline in Z order. In: Proceedings of the 33rd International Conference on Very Large Data Bases; September 23–27, 2007; University of Vienna, Austria: 279-290.
- 20Kirkpatrick DG, Seidel R. Output-size sensitive algorithms for finding maximal vectors. In: Proceedings of the First Annual Symposium on Computational Geometry; June 5–7, 1985; Baltimore, Maryland, USA: 89-96.
- 21Liu J, Xiong L, Xu X. Faster output-sensitive skyline computation algorithm. Inf Process Lett. 2014; 114(12): 710-713.
- 22Endres M, Roocks P, Kießling W. Scalagon: an efficient skyline algorithm for all seasons. In: Database Systems for Advanced Applications - 20th International Conference, DASFAA 2015, Proceedings, Part II; April 20–23, 2015; Hanoi, Vietnam: 292-308.
- 23Goncalves M, Vidal M. Reaching the top of the skyline: an efficient indexed algorithm for top-k skyline queries. In: Database and Expert Systems Applications, 20th International Conference, DEXA 2009 Proceedings; August 31–September 4, 2009; Linz, Austria: 471-485.
- 24Brando C, Goncalves M, González V. Evaluating top-k skyline queries over relational databases. In: Database and Expert Systems Applications, 18th International Conference, DEXA 2007, Proceedings; September 3–7, 2007; Regensburg, Germany: 254-263.
- 25Preisinger T, Endres M. Looking for the best, but not too many of them: multi-level and top-k skylines. Int J Adv Softw. 2015; 8(3): 467-480.
- 26Endres M. Behind the skyline. In: DBKDA 2015: The Seventh International Conference on Advances in Databases, Knowledge, and Data Applications, Rome, Italy; 2015: 141-146.
- 27Lee J, You G, Hwang S. Personalized top-k skyline queries in high-dimensional space. Inf Syst. 2009; 34(1): 45-61.
- 28Chomicki J, Godfrey P, Gryz J, Liang D. Skyline with presorting. In: Proceedings of the 19th International Conference on Data Engineering; March 5–8, 2003; Bangalore, India: 717-719.
- 29Bartolini I, Ciaccia P, Patella M. Efficient sort-based skyline evaluation. ACM Trans Database Syst. 2008; 33(4): 31:1-31:49.
- 30Zhang S, Mamoulis N, Cheung DW. Scalable skyline computation using object-based space partitioning. In: Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD 2009; June 29–July 2, 2009; Providence, Rhode Island, USA: 483-494.
- 31Lee J, Hwang S. Scalable skyline computation using a balanced pivot selection technique. Inf Syst. 2014; 39: 1-21.
- 32Lee J, Hwang S. Bskytree: scalable skyline computation using a balanced pivot selection. In: EDBT 2010, 13th International Conference on Extending Database Technology, Proceedings; March 22–26, 2010; Lausanne, Switzerland: 195-206.
- 33Pei J, Jiang B, Lin X, Yuan Y. Probabilistic skylines on uncertain data. In: Proceedings of the 33rd International Conference on Very Large Data Bases; September 23–27, 2007; University of Vienna, Austria: 15-26.
- 34Lu H, Jensen CS, Zhang Z. Flexible and efficient resolution of skyline query size constraints. IEEE Trans Knowl Data Eng. 2011; 23(7): 991-1005.
- 35Yuan Y, Lin X, Liu Q, Wang W, Yu JX, Zhang Q. Efficient computation of the skyline cube. In: Proceedings of the 31st International Conference on Very Large Data Bases; August 30–September 2, 2005; Trondheim, Norway: 241-252.
- 36Zhang Q, Ye P, Lin X, Zhang Y. Skyline probability over uncertain preferences. In: Joint 2013 EDBT/ICDT Conferences, EDBT '13 Proceedings; March 18–22, 2013; Genoa, Italy: 395-405.
- 37Zhu H, Zhu P, Li X, Liu Q. Top-k skyline groups queries. In: Proceedings of the 20th International Conference on Extending Database Technology, EDBT 2017; March 21–24, 2017; Venice, Italy: 442-445.
- 38Papadias D, Tao Y, Fu G, Seeger B. Progressive skyline computation in database systems. ACM Trans Database Syst. 2005; 30(1): 41-82.