Robust and efficient large-large table outer joins on distributed infrastructures

Authors: 

Long Cheng, Spyros Kotoulas, Tomas Ward, Georgios Theodoropoulos

Publication Type: 
Refereed Conference Meeting Proceeding
Abstract: 
Outer joins are ubiquitous in many workloads but are sensitive to load-balancing problems. Current approaches mitigate such problems caused by data skew by using (partial) replication. However, contemporary replication-based approaches (1) introduce overhead, since they usually result in redundant data movement, (2) are sensitive to parameter tuning and value of data skew and (3) typically require that one side is small. In this paper, we propose a novel parallel algorithm, Redistribution and Efficient Query with Counters (REQC), aimed at robustness in terms of size of join sides, variation in skew and parameter tuning. Experimental results demonstrate that our algorithm is faster, more robust and less demanding in terms of network bandwidth, compared to the state-of-the-art.
Conference Name: 
Euro-Par 2014 Parallel Processing
Proceedings: 
Euro-Par 2014 Parallel Processing
Digital Object Identifier (DOI): 
10.1007/978-3-319-09873-9_22
Publication Date: 
25/08/2014
Conference Location: 
Portugal
Research Group: 
Institution: 
NUIM
Open access repository: 
No