Robert G. Farrell, Catalina M. Danis, et al.
RecSys 2012
We propose a model, LPRAM, for parallel random access machines with local memory that captures both the communication and computational requirements in parallel computation. For this model, we present several interesting results, including the following:. Two n × n matrices can be multiplied in 0(n3/p) computation time and 0(n2/p 2 3) communication steps using p processors (for p = 0(n3/log 3 2 n)). Furthermore, these bounds are optimal for arithmetic on semirings (using +, × only). It is shown that any algorithm that uses comparisons only and that sorts n words requires Ω(n log n/(p log(n/p))) communication steps for 1 < p < n. We also provide an algorithm that sorts n words and uses (-)(n log n/p) computation time and (-)(n log n/p log(n/p))) communication steps. These bounds also apply for computing an n-point FFT graph. It is shown that computing any binary tree τ with n nodes and height h requires Ω(n/p + log n + √h) communication steps, and can always be computed in 0(n/p + min(√n, h)) steps. We also present a simple linear-time algorithm that generates a schedule for computing τ in at most 2Dopt(τ) steps, where Dopt(τ) represents the minimum communication delay for computing τ. It is also shown that various problems that are expressed as DAGs exhibit a communication-delay/computation-time trade-off. © 1990.
Robert G. Farrell, Catalina M. Danis, et al.
RecSys 2012
Yvonne Anne Pignolet, Stefan Schmid, et al.
Discrete Mathematics and Theoretical Computer Science
A. Gupta, R. Gross, et al.
SPIE Advances in Semiconductors and Superconductors 1990
Sabine Deligne, Ellen Eide, et al.
INTERSPEECH - Eurospeech 2001