Eric A. Joseph
AVS 2023
To achieve system-level benefits, compute-in-memory (CIM) tiles need to be integrated into heterogeneous architectures alongside general-purpose and application-specific digital compute cores, together with a high-bandwidth, reconfigurable on-chip routing fabric that can deliver the right vectors to the right locations for just-in-time DNN compute. In the first part of my talk, I will review some of IBM’s work in developing weight-stationary analog compute cores, with a focus on the design choices and optimizations for high tile efficiency. I will then provide a brief introduction to heterogeneous architectures for CIM systems, followed by architectural studies of DNNs that identify the auxiliary operations that bottleneck performance. Finally, I will highlight the challenge of achieving true weight-stationarity in large models such as Mixture-of-Experts (MoE) Transformer models, and the system-level benefits that such an architecture can achieve.