Efficient AI System Design with Cross-Layer Approximate ComputingSwagath VenkataramaniXiao Sunet al.2020Proceedings of the IEEE
A 3.0 TFLOPS 0.62V Scalable Processor Core for High Compute Utilization AI Training and InferenceJinwook OhSae Kyu Leeet al.2020VLSI Circuits 2020
DyVEDeep: Dynamic Variable Effort Deep Neural NetworksSanjay GanapathySwagath Venkataramaniet al.2020ACM TECS
Hybrid 8-bit floating point (HFP8) training and inference for deep neural networksXiao SunJungwook Choiet al.2019NeurIPS 2019
Memory and Interconnect Optimizations for Peta-Scale Deep Learning SystemsSwagath VenkataramaniVijayalakshmi Srinivasanet al.2019HiPC 2019
Performance-driven Programming of Multi-TFLOP Deep Learning Accelerators∗Swagath VenkataramaniJungwook Choiet al.2019IISWC 2019
DeepTools: Compiler and Execution Runtime Extensions for RaPiD AI AcceleratorSwagath VenkataramaniJungwook Choiet al.2019IEEE Micro
Dynamic Spike Bundling for Energy-Efficient Spiking Neural NetworksSarada KrithivasanSanchari Senet al.2019ISLPED 2019
BiScaled-DNN: Quantizing long-tailed datastructures with two scale factors for deep neural networksShubham JainSwagath Venkataramaniet al.2019DAC 2019
09 Jan 2023US11551054System-aware Selective Quantization For Performance Optimized Distributed Deep Learning
29 Aug 2022US11429524Optimized Hierarchical Scratchpads For Enhanced Artificial Intelligence Accelerator Core Utilization
KEKaoutar El MaghraouiPrincipal Research Scientist and Manager, AIU Spyre Model Enablement, AI Hardware Center