Skew-tolerant Key Distribution for Load Balancing in MapReduce
2012
SUMMARY MapReduce is a parallel processing framework for large-scale data. In the reduce phase, MapReduce uses a hash scheme to distribute data across cluster nodes so that records sharing the same key reach the same node. However, this approach is fragile under skewed data distributions. In this paper, we propose a skew-tolerant key distribution method for MapReduce. The proposed method assigns keys to cluster nodes so as to balance their workloads. We implemented the proposed method on Hadoop and evaluated its performance experimentally against the conventional method.
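To make the contrast concrete, the following is a minimal sketch, not the method proposed in the paper, whose details are not given in this summary. It compares conventional hash partitioning with a simple greedy heuristic that assigns the heaviest keys first to the currently least-loaded reducer. The function names, the use of CRC32 as the hash, and the per-key record counts are all assumptions chosen for illustration.

```python
import zlib


def hash_assign(key_counts, num_reducers):
    """Conventional scheme: reducer index = hash(key) mod number of reducers.

    CRC32 is used here only to get a deterministic hash (an assumption).
    """
    return {
        key: zlib.crc32(key.encode("utf-8")) % num_reducers
        for key in key_counts
    }


def greedy_assign(key_counts, num_reducers):
    """Illustrative balancing heuristic (an assumption, not the paper's method).

    Keys are visited from heaviest to lightest, and each one is placed on the
    reducer with the smallest load so far.
    """
    loads = [0] * num_reducers
    assignment = {}
    # Sort by descending count; break ties by key for determinism.
    for key, count in sorted(key_counts.items(), key=lambda kv: (-kv[1], kv[0])):
        target = min(range(num_reducers), key=loads.__getitem__)
        assignment[key] = target
        loads[target] += count
    return assignment


def reducer_loads(key_counts, assignment, num_reducers):
    """Total number of records each reducer receives under an assignment."""
    loads = [0] * num_reducers
    for key, count in key_counts.items():
        loads[assignment[key]] += count
    return loads


# Hypothetical per-key record counts with a skewed distribution.
key_counts = {"a": 500, "b": 400, "c": 300, "d": 300, "e": 250, "f": 250}
num_reducers = 3

hash_loads = reducer_loads(key_counts, hash_assign(key_counts, num_reducers), num_reducers)
greedy_loads = reducer_loads(key_counts, greedy_assign(key_counts, num_reducers), num_reducers)

print("hash loads:  ", hash_loads)
print("greedy loads:", greedy_loads)
```

The sketch assumes per-key record counts are known before keys are assigned. In a real MapReduce job these would have to be gathered or estimated, for example by sampling map output, and how the paper obtains them is not stated here.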