One-dependent cycles and passage times in stochastic Petri nets
P.J. Haas, G.S. Shedler
PNPM 1995
We compare empirically the cost of estimating the selectivity of a star join using the sampling-based t_cross procedure to the cost of computing the join and obtaining the exact answer. The relative cost of sampling can be excessive when a join attribute value exhibits 'heterogeneous skew.' To alleviate this problem, we propose Algorithm TCM, a modified version of t_cross that incorporates 'augmented frequent value' (AFV) statistics. We provide a sampling-based method for estimating AFV statistics that does not require indexes on attribute values, requires only one pass through each relation, and uses an amount of memory much smaller than the size of a relation. Our experiments show that the use of estimated AFV statistics can reduce the relative cost of sampling by orders of magnitude. We also show that the use of estimated AFV statistics can reduce the relative error of the classical System R selectivity formula.