Effect of skew on join performance in parallel architectures
M. Seetha Lakshmi, Philip S. Yu
DPDS 1987
Semijoins have traditionally been relied upon to reduce the communication cost of distributed query processing. However, judiciously applying join operations as reducers can reduce the communication cost further. The approach of using join operations, in addition to semijoins, as reducers in distributed query processing is therefore explored. It is first shown that the problem of determining a sequence of join operations for a query graph can be transformed into that of finding a set of cuts of that graph, where a cut of a graph is a partition of its nodes. Based on this mapping, an efficient heuristic algorithm is developed to determine an effective sequence of join reducers for a query. The algorithm, which uses divide-and-conquer, is shown to have polynomial time complexity. Examples are given to illustrate the results.
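The terms used in the abstract can be illustrated with a minimal sketch. It is not the heuristic algorithm from the paper. It only shows, under simplifying assumptions, what a semijoin reducer, a join, and a cut of a query graph look like on toy in-memory data. The function names (`semijoin`, `join`, `cut_edges`), the sample relations, and the relation labels `R1`, `R2`, `R3` are all hypothetical.

```python
def semijoin(r, s, attr):
    """Keep the rows of r whose attr value also appears in s (r is 'reduced' by s)."""
    keys = {row[attr] for row in s}
    return [row for row in r if row[attr] in keys]


def join(r, s, attr):
    """Join r and s on a single shared attribute using a hash index on s."""
    index = {}
    for row in s:
        index.setdefault(row[attr], []).append(row)
    return [{**a, **b} for a in r for b in index.get(a[attr], [])]


def cut_edges(edges, part):
    """Return the query-graph edges that cross the cut (part, all other nodes)."""
    part = set(part)
    return [(u, v) for u, v in edges if (u in part) != (v in part)]


# Toy relations (hypothetical data, for illustration only).
orders = [
    {"cust": 1, "item": "a"},
    {"cust": 2, "item": "b"},
    {"cust": 3, "item": "c"},
]
customers = [{"cust": 1, "city": "X"}, {"cust": 3, "city": "Y"}]

# Semijoin as a reducer: only matching order rows would need to be shipped.
reduced = semijoin(orders, customers, "cust")

# Join as a reducer: the result itself may be small enough to ship instead.
joined = join(orders, customers, "cust")

# A cut of a three-relation query graph that separates R1 from R2 and R3.
crossing = cut_edges([("R1", "R2"), ("R2", "R3"), ("R1", "R3")], {"R1"})
```

Here `reduced` drops the order row with no matching customer, and `crossing` lists the join edges that span the two sides of the partition. How such cuts would be chosen to build a sequence of join reducers is the subject of the paper and is not attempted here.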
F.J. Budinsky, M.A. Finnie, et al.
IBM Systems Journal
M. Seetha Lakshmi, Philip S. Yu
ICDE 1989
Haixun Wang, Chang-Shing Perng, et al.
CSB 2002