Case studies in hardware XPath acceleration
Dorit Nuzman, David Maze, et al.
SYSTOR 2011
Most implementations of the Single Instruction Multiple Data (SIMD) model available today require that data elements be packed contiguously in vector registers. Operations on disjoint vector elements are not supported directly and require explicit data reorganization. Computations on non-contiguous, and especially interleaved, data appear in important applications, which can benefit greatly from SIMD instructions once the data is reorganized properly. Vectorizing such computations efficiently is therefore a significant challenge for both programmers and vectorizing compilers. We demonstrate an automatic compilation scheme that supports effective vectorization in the presence of interleaved data with constant strides that are powers of 2, facilitating data reorganization. We show how our vectorization scheme applies to dominant SIMD architectures, and present experimental results on a wide range of key kernels, showing execution-time speedups of up to 3.7 for interleaving levels (strides) as high as 8. Copyright © 2006 ACM.
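To make the notion of interleaved data concrete, the following is a minimal conceptual sketch in Python, not the compilation scheme described in the abstract. It assumes a stride-2 layout of complex numbers stored as [re0, im0, re1, im1, ...]; the function names (norm_sq_scalar, deinterleave, norm_sq_reorganized) are hypothetical and chosen only for illustration. The first function walks the interleaved array directly, while the second first reorganizes the data into contiguous streams, which is the form that packed element-wise SIMD operations expect.

```python
STRIDE = 2  # assumed layout: [re0, im0, re1, im1, ...]


def norm_sq_scalar(data):
    """Scalar reference: |z|^2 for each complex value, read with stride 2."""
    return [data[i] * data[i] + data[i + 1] * data[i + 1]
            for i in range(0, len(data), STRIDE)]


def deinterleave(data, stride):
    """Split interleaved data into `stride` contiguous streams.

    The stride must be a power of 2, mirroring the restriction
    stated in the abstract.
    """
    if stride <= 0 or stride & (stride - 1):
        raise ValueError("stride must be a positive power of 2")
    if len(data) % stride:
        raise ValueError("data length must be a multiple of the stride")
    # Stream k holds elements k, k + stride, k + 2 * stride, ...
    return [data[k::stride] for k in range(stride)]


def norm_sq_reorganized(data):
    """Same result as the scalar version, computed on contiguous streams.

    After reorganization, each step is a plain element-wise operation
    over packed data, which is what a SIMD unit can execute directly.
    """
    re, im = deinterleave(data, STRIDE)
    return [r * r + i * i for r, i in zip(re, im)]
```

Both functions return the same values for the same input. The sketch only illustrates why reorganization is needed before element-wise operations apply; how a compiler generates the reorganization for a given SIMD instruction set is the subject of the work summarized above.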