A 7-nm Four-Core Mixed-Precision AI Chip with 26.2-TFLOPS Hybrid-FP8 Training, 104.9-TOPS INT4 Inference, and Workload-Aware Throttling. Sae Kyu Lee, Ankur Agrawal, et al. 2021. IEEE JSSC.
4-bit quantization of LSTM-based speech recognition models. Andrea Fasoli, Chia-Yu Chen, et al. 2021. INTERSPEECH 2021.
Hardware-Aware Neural Architecture Search: Survey and Taxonomy. Hadjer Benmeziane, Kaoutar El Maghraoui, et al. 2021. IJCAI 2021.
RaPiD: AI Accelerator for Ultra-Low Precision Training and Inference. Swagath Venkataramani, Vijayalakshmi Srinivasan, et al. 2021. ISCA 2021.
A 7nm 4-Core AI Chip with 25.6TFLOPS Hybrid FP8 Training, 102.4TOPS INT4 Inference and Workload-Aware Throttling. Ankur Agrawal, Saekyu Lee, et al. 2021. ISSCC 2021.
ScaleCom: Scalable Sparsified Gradient Compression for Communication-Efficient Distributed Training. Chia-Yu Chen, Jiamin Ni, et al. 2020. NeurIPS 2020.
Efficient AI System Design with Cross-Layer Approximate Computing. Swagath Venkataramani, Xiao Sun, et al. 2020. Proceedings of the IEEE.
A 3.0 TFLOPS 0.62V Scalable Processor Core for High Compute Utilization AI Training and Inference. Jinwook Oh, Sae Kyu Lee, et al. 2020. VLSI Circuits 2020.
Hybrid 8-bit floating point (HFP8) training and inference for deep neural networks. Xiao Sun, Jungwook Choi, et al. 2019. NeurIPS 2019.
03 Mar 2025. US12240753. Micro-electromechanical Device Having A Soft Magnetic Material Electrolessly Deposited On A Palladium Layer Coated Metal Beam
23 Dec 2024. US12175359. Machine Learning Hardware Having Reduced Precision Parameter Components For Efficient Parameter Update
21 Jul 2024. JP7525237. Machine Learning Hardware Having Reduced Precision Parameter Components For Efficient Parameter Update
Kaoutar El Maghraoui. Principal Research Scientist and Manager, AIU Spyre Model Enablement, AI Hardware Center
Pin-Yu Chen. Principal Research Scientist and Manager; Chief Scientist, RPI-IBM AI Research Collaboration