Corey Liam Lammie, Hadjer Benmeziane, et al.
Nat. Rev. Electr. Eng.
Analog Non-Volatile Memory-based accelerators offer high-throughput and energy-efficient Multiply-Accumulate operations for the large Fully-Connected layers that dominate Transformer-based Large Language Models. We describe architectural, wafer-scale testing, chip-demo, and hardware-aware training efforts towards such accelerators, and quantify the unique raw-throughput and latency benefits of Fully- (rather than Partially-) Weight-Stationary systems.
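The abstract's core idea can be illustrated with a minimal sketch (not from the paper): in a fully weight-stationary analog accelerator, a fully-connected layer's weight matrix is programmed once into a crossbar as conductances, and each inference performs the whole matrix-vector Multiply-Accumulate in a single analog step. The 8-bit level count and names below are illustrative assumptions, not details from the article.

```python
import numpy as np

rng = np.random.default_rng(0)

def program_weights(W, levels=256):
    # One-time "programming" step: quantize weights to a limited
    # number of conductance levels (hypothetical 8-bit resolution).
    w_max = np.abs(W).max()
    step = 2 * w_max / (levels - 1)
    return np.round(W / step) * step

def analog_mac(G, x):
    # Single-shot multiply-accumulate: all output rows are computed
    # in parallel, so per-token latency does not grow with the number
    # of rows, and no weight movement occurs at inference time.
    return G @ x

W = rng.standard_normal((4, 8))  # fully-connected layer weights
G = program_weights(W)           # stationary: written to devices once
x = rng.standard_normal(8)       # activations arrive per inference
y = analog_mac(G, x)
```

A partially weight-stationary system would instead reload tiles of `W` between MAC steps, which is the raw-throughput and latency gap the abstract quantifies.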
Olivier Maher, N. Harnack, et al.
DRC 2023
Sufi Zafar, Thomas Picunko, et al.
IEDM 2023
Eunjung Cha, Alberto Ferraris, et al.
IEDM 2023