Ralph Bellofatto, Paul G. Crumley, et al.
IPDPS 2006
Memory fences inhibit the reordering of memory accesses in modern microprocessors; fences are useful to implement synchronization and strong shared memory semantics in multi-threaded programs. A naive implementation of memory fences can result in a significant performance penalty for processors with deep pipelines supporting multiple concurrent memory accesses. The paper compares three techniques to reduce the impact of memory fences: (1) Read-speculation allows reads that follow a fence to be issued while the fence is being processed; (2) Write-ahead additionally allows writes following a fence to proceed early; (3) Selective fences distinguish between memory accesses to thread-local and shared memory and enforce ordering only among accesses to shared memory. We evaluate and compare the effectiveness of these techniques with a simulator derived from the Pentium 4 architecture. We report data for a storage model that uses memory fences to enforce the memory semantics at monitor boundaries. © 2006 IEEE.
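To illustrate what a fence orders at the software level, here is a minimal C++ sketch of fence-based message passing. It is not taken from the paper, which studies hardware techniques for making fences cheaper. The names `payload`, `ready`, `producer`, `consumer`, and `run_message_passing` are hypothetical and chosen only for this example.

```cpp
#include <atomic>
#include <thread>

// Hypothetical names for illustration; none of them come from the paper.
int payload = 0;                  // plain (non-atomic) shared data
std::atomic<bool> ready{false};   // flag used to publish the payload

// Writer: store the data, then fence, then set the flag.
void producer() {
    payload = 42;
    // Release fence: the write to payload may not be reordered after
    // the flag store that follows.
    std::atomic_thread_fence(std::memory_order_release);
    ready.store(true, std::memory_order_relaxed);
}

// Reader: wait for the flag, then fence, then read the data.
int consumer() {
    while (!ready.load(std::memory_order_relaxed)) {
        std::this_thread::yield();
    }
    // Acquire fence: the read of payload may not be reordered before
    // the flag load above.
    std::atomic_thread_fence(std::memory_order_acquire);
    return payload;
}

// Runs one producer and one consumer and returns the value observed.
int run_message_passing() {
    payload = 0;
    ready.store(false, std::memory_order_relaxed);
    int seen = -1;
    std::thread reader([&seen] { seen = consumer(); });
    std::thread writer(producer);
    writer.join();
    reader.join();
    return seen;
}
```

Without the two fences, the accesses to `payload` and `ready` could be reordered and the reader might observe a stale `payload`. In this sketch only `payload` and `ready` are shared. Variables local to each thread, such as the return value of `consumer`, need no ordering, which is the kind of distinction the selective-fence technique in the abstract relies on.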
Ganesh Bikshandi, Guo Jia, et al.
PPoPP 2006
Sören Bleikertz, Carsten Vogel, et al.
ACSAC 2015
Christoph Von Praun, Harold W. Cain, et al.
ISCA 2006