Naigang Wang

Title

RSM, Manager, AI acceleration algorithm and framework

Publications

Compressing Recurrent Neural Networks for FPGA-accelerated Implementation in Fluorescence Lifetime Imaging
- Ismail Erbas, Vikas Pandey, et al.
- 2024
- NeurIPS 2024
- Workshop paper

A Provably Effective Method for Pruning Experts in Fine-tuned Sparse Mixture-of-Experts
- Mohammed Nowaz Rabbani Chowdhury, Meng Wang, et al.
- 2024
- ICML 2024
- Conference paper

Improved Techniques for Quantizing Deep Networks with Adaptive Bit-Widths
- Ximeng Sun, Rameswar Panda, et al.
- 2024
- WACV 2024
- Conference paper

Deep Compression of Pre-trained Transformer Models
- Naigang Wang, Chi-Chun Liu, et al.
- 2022
- NeurIPS 2022
- Conference paper

A 7-nm Four-Core Mixed-Precision AI Chip with 26.2-TFLOPS Hybrid-FP8 Training, 104.9-TOPS INT4 Inference, and Workload-Aware Throttling
- Sae Kyu Lee, Ankur Agrawal, et al.
- 2021
- IEEE JSSC
- Paper

4-bit quantization of LSTM-based speech recognition models
- Andrea Fasoli, Chia-Yu Chen, et al.
- 2021
- INTERSPEECH 2021
- Conference paper

Hardware-Aware Neural Architecture Search: Survey and Taxonomy
- Hadjer Benmeziane, Kaoutar El Maghraoui, et al.
- 2021
- IJCAI 2021
- Survey paper

RaPiD: AI Accelerator for Ultra-Low Precision Training and Inference
- Swagath Venkataramani, Vijayalakshmi Srinivasan, et al.
- 2021
- ISCA 2021
- Conference paper

A 7nm 4-Core AI Chip with 25.6TFLOPS Hybrid FP8 Training, 102.4TOPS INT4 Inference and Workload-Aware Throttling
- Ankur Agrawal, Saekyu Lee, et al.
- 2021
- ISSCC 2021
- Conference paper

Ultra-Low Precision 4-bit Training of Deep Neural Networks
- Xiao Sun, Naigang Wang, et al.
- 2020
- NeurIPS 2020
- Conference paper

Blog posts

Ultra-low-precision training of deep neural networks
Technical note
Naigang Wang
09 May 2019
- AI
8-bit precision for training deep learning systems
Research
Naigang Wang
03 Dec 2018
- AI
- AI Hardware

Top collaborators

Derrick Liu

Software Developer

Kaoutar El Maghraoui

Principal Research Scientist, AIU Spyre Software Ecosystem, AI Hardware Center

Raghu Kiran Ganti

Distinguished Engineer

Mudhakar Srivatsa

Distinguished Engineer, AI Platform

Patents

Slab Inductor Device Providing Efficient On-chip Supply Voltage Conversion And Regulation

Forming Magnetic Microelectromechanical Inductive Components

Silicon Process Compatible Trench Magnetic Device

Slab Inductor Device Providing Efficient On-chip Supply Voltage Conversion And Regulation

Slab Inductor Device Providing Efficient On-chip Supply Voltage Conversion And Regulation

Forming Magnetic Microelectromechanical Inductive Components

Inductor With Stacked Conductors

Electroless Plated Material Formed Directly On Metal

Electroless Plating Of Cobalt Alloys For On Chip Inductors

Inductor With Laminated Yoke