Volume 29, Issue 18 e4194
RESEARCH ARTICLE

GPU processing of theta-joins

Christos Bellas

Department of Informatics, Aristotle University of Thessaloniki, 541 24 Greece

Anastasios Gounaris

Corresponding Author

Department of Informatics, Aristotle University of Thessaloniki, 541 24 Greece

Correspondence

Anastasios Gounaris, Department of Informatics, Aristotle University of Thessaloniki, 541 24, Greece.

Email: [email protected]

First published: 27 June 2017
Citations: 4

Summary

The GPGPU paradigm has recently been employed to accelerate the processing of large volumes of data by exploiting the massive parallelism of modern GPUs. To date, several techniques have been proposed for implementing simple select, aggregate, and equality join operations on GPUs. In this paper, we study the efficient implementation of theta-join queries between two relations using the CUDA framework. Theta-joins are notoriously slow and can thus benefit from massively parallel execution. However, their GPU-based implementation differs significantly from that of hash- and sort-based equality joins and needs to be carefully crafted. The implementation is driven by two main objectives. The first is high parallelization efficiency through data reuse, that is, minimizing accesses to the slow global memory. The second is the most efficient use of the available memory, given that, in general, it cannot hold the entire input and result. We propose a methodology for processing theta-joins on a GPU that exploits the heterogeneous nature of GPGPU while addressing memory limitations. Furthermore, we provide a series of implementation optimizations that yield performance improvements of an order of magnitude.
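As a rough illustration of the two objectives named in the summary, the sketch below shows a blocked (tiled) nested-loop theta-join in plain Python. It is not the authors' CUDA implementation: the function names, the tile sizes, and the use of integer tuples are all assumptions made for the example. Each tile of the first relation is loaded once and reused against every tile of the second, which loosely mirrors data reuse from fast memory, and results are emitted one bounded chunk at a time, which loosely mirrors working within a memory budget that cannot hold the full output.

```python
from typing import Callable, Iterator, List, Sequence, Tuple

Pair = Tuple[int, int]


def tiled_theta_join(
    r: Sequence[int],
    s: Sequence[int],
    theta: Callable[[int, int], bool],
    tile_r: int = 4,  # hypothetical tile size for relation R
    tile_s: int = 4,  # hypothetical tile size for relation S
) -> Iterator[List[Pair]]:
    """Yield matching pairs one (R tile, S tile) block at a time.

    Each yielded chunk holds at most tile_r * tile_s pairs, so the
    consumer never needs room for the entire join result at once.
    """
    for i in range(0, len(r), tile_r):
        # Load one R tile and reuse it against every S tile.
        r_tile = r[i:i + tile_r]
        for j in range(0, len(s), tile_s):
            s_tile = s[j:j + tile_s]
            # Evaluate the arbitrary predicate on every pair in the block.
            yield [(a, b) for a in r_tile for b in s_tile if theta(a, b)]


def theta_join(
    r: Sequence[int],
    s: Sequence[int],
    theta: Callable[[int, int], bool],
    tile_r: int = 4,
    tile_s: int = 4,
) -> List[Pair]:
    """Convenience wrapper that concatenates all chunks (small inputs only)."""
    out: List[Pair] = []
    for chunk in tiled_theta_join(r, s, theta, tile_r, tile_s):
        out.extend(chunk)
    return out
```

For example, `theta_join([1, 5, 3], [2, 4], lambda a, b: a < b)` evaluates the inequality predicate `a < b` over all pairs. On a GPU, the per-block work would instead be spread across threads, with tiles staged in on-chip memory; that mapping is outside the scope of this sketch.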
