Erich P. Stuntebeck, John S. Davis II, et al.
HotMobile 2008
We describe a representation of instruction-level parallelism which does not require checking dependencies at run-time, and which is suitable for processor implementations with varying issue-width. In this approach, a program is represented as a sequence of tree-instructions, each containing multiple primitive operations. These tree-instructions are executable either in one or multiple cycles, do not require a specific processor organization, and are generated assuming an implementation capable of large instruction-level parallelism. During instruction cache reloading/accessing, tree-instructions are decomposed into subtrees which fit the actual resources available in an implementation. The resulting subtrees require a simple instruction-dispatch mechanism, as in the case of statically scheduled processors. The representation makes practical the use of the same parallelized code in implementations with different issue-width; simulation results indicate that the instruction-level parallelism achievable with this approach degrades less than 10% with respect to code compiled for each specific implementation.
Erich P. Stuntebeck, John S. Davis II, et al.
HotMobile 2008
Pradip Bose
VTS 1998
Raymond Wu, Jie Lu
ITA Conference 2007
Ehud Altman, Kenneth R. Brown, et al.
PRX Quantum