Advancing Fluorescence Light Detection and Ranging in Scattering Media with a Physics-Guided Mixture-of-Experts and Evidential Critics. Ismail Erbas, Ferhat Demikiran, et al. 2025. NeurIPS 2025.
Accelerating LLM Inference via Dynamic KV Cache Placement in Heterogeneous Memory System. Yunhua Fang, Rui Xie, et al. 2025. IEEE Computer Architecture Letters.
Generative AI Through CAS Lens: An Integrated Overview of Algorithmic Optimizations, Architectural Advances, and Automated Designs. Chuan Zhang, You You, et al. 2025. IEEE JESTCS.
Compressed Decentralized Momentum Stochastic Gradient Methods for Nonconvex Optimization. Wei Liu, Anweshit Panda, et al. 2025. TMLR.
CLoQ: Enhancing Fine-Tuning of Quantized LLMs via Calibrated LoRA Initialization. Yanxia Deng, Aozhong Zhang, et al. 2025. TMLR.
COMQ: A Backpropagation-Free Algorithm for Post-Training Quantization. Aozhong Zhang, Zi Yang, et al. 2025. IEEE Access.
Compressing Recurrent Neural Networks for FPGA-accelerated Implementation in Fluorescence Lifetime Imaging. Ismail Erbas, Vikas Pandey, et al. 2024. NeurIPS 2024.
A Provably Effective Method for Pruning Experts in Fine-tuned Sparse Mixture-of-Experts. Mohammed Nowaz Rabbani Chowdhury, Meng Wang, et al. 2024. ICML 2024.
Improved Techniques for Quantizing Deep Networks with Adaptive Bit-Widths. Ximeng Sun, Rameswar Panda, et al. 2024. WACV 2024.
Very Low Precision Floating Point Representation For Deep Learning Acceleration. Patent CN ZL201910352202.5. 13 Mar 2023.
Kaoutar El Maghraoui: Principal Research Scientist and Manager, AIU Spyre Model Enablement, AI Hardware Center
Pin-Yu Chen: Principal Research Scientist and Manager; Chief Scientist, RPI-IBM AI Research Collaboration