Collecting address traces from parallel computers
Craig B. Stunkel, Bob Janssens, et al.
HICSS 1991
Very Long Instruction Word (VLIW) architectures can enhance performance by exploiting fine-grained instruction-level parallelism. In this paper, we describe a compiler-assisted multiple instruction word retry scheme for VLIW architectures. A read buffer is used to resolve the more frequent on-path hazards, while the compiler resolves the remaining branch hazards. Performance evaluation is described for 11 benchmark programs based on the IBM VLIW research compiler, Chameleon. Experimental results indicate that, for a VLIW machine with P functional units to roll back N instruction words, a read buffer of 2NP entries with the compiler assist can be an effective approach in producing low-overhead runtime performance and small code growth, for P = 4, 8, 12, and 16 and N ≤ 3.
Kun-Lung Wu, Shyh-Kwei Chen, et al.
CIKM 2004
Kun-Lung Wu, Shyh-Kwei Chen, et al.
SUTC 2006
Michail Vlachos, Kun-Lung Wu, et al.
Data Mining and Knowledge Discovery