Short course

Hardware Accelerator Design for AI: Enabling Generative Models

Abstract

The advent of large language models and generative AI has ushered in enormous demand for hardware accelerators to perform AI training, fine-tuning, and inference. The design of such accelerators depends not only on holistic optimization of technology, circuits, and systems, but also fundamentally on the models and use cases that this hardware needs to serve. Achieving the proper balance of compute versus communication to optimize latency and throughput in AI workloads will require tradeoffs across the hardware/software stack, reconciling the long development cycles needed to build chips and systems with the torrid pace of innovation in AI models and algorithms. This talk will provide an overview of the landscape of AI hardware accelerators and discuss research roadmaps to improve both compute efficiency and communication bandwidth, particularly as generative AI evolves toward agentic AI and smaller, fit-for-purpose models.