Eliminating Redundancy: Ultra-compact Code Generation for Programmable Dataflow Accelerators. Prasanth Chatarasi, Alex Gatea, et al. 2026. CGO 2026.
Enabling Spill-Free Compilation via Affine-Based Live Range Reduction Optimization. Prasanth Chatarasi, Alex Gatea, et al. 2026. CGO 2026.
Breaking the HBM Bit Cost Barrier: Domain-Specific ECC for AI Inference Infrastructure. Rui Xie, Asad Ul Haq, et al. 2025. IEEE Computer Architecture Letters.
MixTrain: accelerating DNN training via input mixing. Sarada Krithivasan, Sanchari Sen, et al. 2024. Frontiers in Artificial Intelligence.
A Software-Assisted Peak Current Regulation Scheme to Improve Power-Limited Inference Performance in a 5nm AI SoC. Monodeep Kar, Joel Silberman, et al. 2024. ISSCC 2024.
Power-Limited Inference Performance Optimization Using a Software-Assisted Peak Current Regulation Scheme in a 5-nm AI SoC. Monodeep Kar, Joel Silberman, et al. 2024. IEEE Journal of Solid-State Circuits.
DNNDaSher: A Compiler Framework for Dataflow Compatible End-to-End Acceleration on IBM AIU. Sanchari Sen, Shubham Jain, et al. 2024. IEEE Micro.
Approximate computing and the efficient machine learning expedition. Jörg Henkel, Hai Li, et al. 2022. ICCAD 2022.
OnSRAM: Efficient Inter-Node On-Chip Scratchpad Management in Deep Learning Accelerators. Subhankar Pal, Swagath Venkataramani, et al. 2022. Transactions on Embedded Computing Systems.
09 Jan 2023. US11551054. System-aware Selective Quantization For Performance Optimized Distributed Deep Learning.
29 Aug 2022. US11429524. Optimized Hierarchical Scratchpads For Enhanced Artificial Intelligence Accelerator Core Utilization.
Kaoutar El Maghraoui. Principal Research Scientist and Manager, AIU Spyre Model Enablement, AI Hardware Center.