Managing healthcare data hippocratically
Rakesh Agrawal, Ameet Kini, et al.
SIGMOD 2004
We consider the problem of parallelizing high-dimensional proximity joins. We present a parallel multidimensional join algorithm based on the epsilon-kdB tree and compare it with the more common approach of space partitioning. An evaluation of the algorithms on an IBM SP2 shared-nothing multiprocessor is presented using both synthetic and real-life datasets. We also examine the effectiveness of the algorithms in the context of a specific data-mining problem, that of finding similar time series. The empirical results show that our algorithm exhibits good performance and scalability, as well as an ability to handle data skew.
Rakesh Agrawal, Christopher Johnson, et al.
ICDE 2006
Rakesh Agrawal
PKDD 2004
John C. Shafer, Rakesh Agrawal
Computer Networks