Demo paper

VP Lab: A PEFT-Enabled Visual Prompting Laboratory for Semantic Segmentation

Abstract

In this demo, we present VP Lab (Visual Prompting Lab), a comprehensive interactive framework for developing visual prompting models that deliver robust domain-specific segmentation. At the core of VP Lab lies E-PEFT, our novel parameter-efficient fine-tuning technique designed to adapt visual prompting pipelines to specialized domains with minimal parameter updates and limited labeled data. Our approach surpasses state-of-the-art parameter-efficient fine-tuning methods for the Segment Anything Model (SAM), enabling an interactive, near-real-time development loop in which users progressively improve visual prompting results, even in challenging domains, as they interact with the framework. By integrating E-PEFT with visual prompting, we demonstrate a 50% improvement in semantic segmentation mIoU across various technical datasets using only five validated images.
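To make the parameter-efficiency idea concrete, the following is a minimal, generic LoRA-style sketch in PyTorch. It is not the E-PEFT method itself (whose design is not detailed in this abstract); the 768-dimensional projection layer is a hypothetical stand-in for one attention projection of a frozen SAM-style image encoder.

```python
# Generic LoRA-style adapter sketch (NOT the actual E-PEFT method): freeze a
# pretrained linear layer and train only a low-rank update, illustrating the
# "minimal parameter updates" idea behind parameter-efficient fine-tuning.
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, rank: int = 4, alpha: float = 8.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():      # pretrained weights stay frozen
            p.requires_grad = False
        self.lora_a = nn.Linear(base.in_features, rank, bias=False)   # trainable
        self.lora_b = nn.Linear(rank, base.out_features, bias=False)  # trainable
        nn.init.zeros_(self.lora_b.weight)    # adapter starts as a no-op
        self.scale = alpha / rank

    def forward(self, x):
        return self.base(x) + self.scale * self.lora_b(self.lora_a(x))

# Hypothetical example: wrap one projection of a frozen SAM-style encoder.
proj = nn.Linear(768, 768)                    # stand-in for a pretrained projection
adapted = LoRALinear(proj, rank=4)
trainable = sum(p.numel() for p in adapted.parameters() if p.requires_grad)
total = sum(p.numel() for p in adapted.parameters())
print(f"trainable params: {trainable}/{total}")  # only the low-rank factors train
```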

We showcase VP Lab's capabilities through an interactive workflow applied to a real-world medical use case. The demonstration follows a three-step process: (1) The user begins by prompting an initial image randomly sampled from the dataset; this prompted image serves as input to the visual prompting pipeline, which generates predictions across the entire dataset. (2) When these initial predictions prove insufficient, the user makes minimal refinements with VP Lab's annotation tool; the refined annotations are then used to fine-tune the model via E-PEFT, producing significantly improved results in under a minute. (3) A second brief round of label refinement and fine-tuning yields near-perfect segmentation, demonstrating the efficiency and effectiveness of our iterative approach.
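For concreteness, the sketch below outlines the shape of this prompt-predict-refine-tune loop. It is a structural outline rather than VP Lab's actual API: the `predict`, `refine`, and `finetune` callables are hypothetical stand-ins for the visual prompting pipeline, the annotation tool, and the E-PEFT update, respectively.

```python
# Structural outline of the demonstrated loop; all callables are hypothetical
# placeholders, not VP Lab's real interface.
from typing import Any, Callable, Dict, Iterable

def vp_lab_loop(
    model: Any,
    images: Iterable[Any],
    first_prompt: Dict[Any, Any],             # (1) user prompts on one sampled image
    predict: Callable[..., Dict[Any, Any]],   # visual prompting over the whole dataset
    refine: Callable[..., Dict[Any, Any]],    # minimal corrections in the annotation tool
    finetune: Callable[..., Any],             # fast E-PEFT update
    rounds: int = 2,                          # two brief rounds in the demo
) -> Any:
    labeled = dict(first_prompt)
    for _ in range(rounds):
        preds = predict(model, labeled, images)   # predictions across the entire dataset
        refined = refine(preds, max_images=5)     # (2) refine only a handful of images
        labeled.update(refined)
        model = finetune(model, labeled)          # (3) E-PEFT update in under a minute
    return model
```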