Towards Confidential AI for the Masses!

Julian James Stephen; Michael Le

OSSNA 2025

Invited talk

23 Jun 2025

Towards Confidential AI for the Masses!

Abstract

Confidential AI leveraging GPUs can bring AI to the masses without sacrificing the privacy of end users. Individual open source technologies already exist to configure, deploy, and manage confidential TEEs. However, clobbering a multitude of components into a coherent, secure, and efficient solution is challenging with many pitfalls. For example, depending on use cases and involved parties (cloud/model/service owners), attestation and key management methodology can vary drastically. In addition, for TEEs with confidential GPUs, complexity extends to increased load times, affecting services that serve multiple models.

This talk will go through key components and design decisions needed to enable confidential AI. Specifically: i) implications of different trust models on the solution and (ii) performance tradeoff considerations. To concretize the discussion, we will present a detailed end-to-end 'how to', for deploying an inference service on Nvidia H100 GPUs and AMD-based TEE with a focus on protecting the model and the user input. The audience will be able to appreciate why there can be no one size fit all confidential AI solution and understand what design works for them.

Conference paper