Jian Lou, Ashish Jagmohan, et al.
ICME 2007
Emerging multi-core processors can accelerate medical imaging applications by exploiting the parallelism available in their algorithms. We have implemented a mutual-information-based 3D linear registration algorithm on the Cell Broadband Engine™ (CBE) processor. By exploiting the highly parallel architecture and its high memory bandwidth, our implementation on two CBE processors can register a pair of 256×256×30 3D images in one second. This is significantly faster than a conventional implementation on a traditional microprocessor, and even faster than a previously reported custom-hardware implementation. In addition to parallelizing the code for multiple cores and organizing the data structures to reduce memory traffic, it is also critical to optimize the code for the SIMD pipeline structure. We note that code optimization for the SIMD pipeline alone yields a 4.2× to 8.7× acceleration for the computation of small kernels. Further, SIMD optimization alone yields a 4.5× end-to-end application speedup. ©2007 IEEE.
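For readers unfamiliar with the similarity measure named in the abstract, the following is a minimal, unoptimized sketch of mutual information computed from a joint intensity histogram. It is not the authors' Cell implementation: the function name `mutual_information`, the `bins` and `max_value` parameters, and the flat-list image representation are all assumptions made for illustration, and no SIMD or multi-core parallelization is attempted.

```python
import math
from collections import Counter


def mutual_information(a, b, bins=8, max_value=256):
    """Estimate mutual information (in bits) between two images.

    `a` and `b` are flat sequences of voxel intensities in [0, max_value).
    Hypothetical helper for illustration only; names and defaults are assumed.
    """
    if len(a) != len(b):
        raise ValueError("images must contain the same number of voxels")
    if len(a) == 0:
        raise ValueError("images must not be empty")

    width = max_value / bins  # intensity range covered by one histogram bin

    def to_bin(value):
        # Quantize an intensity to a bin index, clamping the top edge.
        return min(int(value / width), bins - 1)

    # Joint histogram: how often each (bin_a, bin_b) pair co-occurs.
    joint = Counter((to_bin(x), to_bin(y)) for x, y in zip(a, b))

    # Marginal histograms obtained by summing the joint histogram.
    hist_a = Counter()
    hist_b = Counter()
    for (i, j), count in joint.items():
        hist_a[i] += count
        hist_b[j] += count

    n = len(a)
    mi = 0.0
    for (i, j), count in joint.items():
        # p_ij * log2(p_ij / (p_i * p_j)), written with raw counts.
        mi += (count / n) * math.log2(count * n / (hist_a[i] * hist_b[j]))
    return mi


if __name__ == "__main__":
    fixed = [0, 64, 128, 192] * 4
    print(mutual_information(fixed, fixed, bins=4))
```

In a registration loop, a score of this kind would typically be re-evaluated for each candidate transform and the optimizer would keep the transform that maximizes it. The histogram accumulation and the final summation are the sort of small kernels where SIMD and multi-core optimization would plausibly apply, though this sketch makes no claim about how the paper structures them.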
Hangu Yeo, Yu Hen Hu
IEEE TCSVT
Hangu Yeo, Catherine Crawford
Big Data 2015
Liu Yan, Yang Jun, et al.
ICME 2007