GeoServe: Leveraging Disaggregated Data Processing for Scalable Geospatial Model Serving

Gerard Finol; Christian Pinto

doi:10.1145/3805621.3807611

EuroMLSys 2026

Workshop paper

27 Apr 2026

Abstract

Geospatial foundation models (GFMs) operate on large, multi-band raster products (e.g., GeoTIFF) that require expensive data access and preprocessing (reprojection, decoding, normalization, and tiling) before GPU inference. In our measurements, reading and preprocessing geospatial inputs can be orders of magnitude slower than tokenization or standard image preprocessing, and account for 31–43% of end-to-end request time for a representative GFM. Existing inference frameworks such as vLLM execute this preprocessing inline with request handling, which under load serializes CPU and I/O work, increases queueing delay, and leaves GPUs underutilized. We present GeoServe, a serving system based on Ray that disaggregates the geospatial data pipeline from GPU inference by offloading I/O- and CPU-heavy preprocessing to a scalable pool of CPU workers while keeping GPU nodes dedicated to model forward passes. We show experimentally that, compared to vanilla vLLM, GeoServe reduces p90 request latency by up to 262.8× at high load and improves throughput by up to 4.89×, while increasing the achieved model forward-pass rate from ~16 to ~74 inferences per second through better batching opportunities.
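
The disaggregation idea described in the abstract can be illustrated with a minimal sketch. This is not GeoServe's implementation: it swaps Ray for Python's standard-library `ThreadPoolExecutor` so that it runs without extra dependencies, and the names `preprocess`, `forward_batch`, `serve`, `cpu_workers`, and `max_batch` are hypothetical. Preprocessing is reduced to min-max normalization and the "model" to a mean, purely as placeholders.

```python
from concurrent.futures import ThreadPoolExecutor


def preprocess(request_id, raster):
    """Hypothetical stand-in for decoding, reprojection, normalization, tiling."""
    lo, hi = min(raster), max(raster)
    scale = (hi - lo) or 1  # avoid division by zero on constant rasters
    return request_id, [(v - lo) / scale for v in raster]


def forward_batch(batch):
    """Hypothetical stand-in for one batched GPU forward pass."""
    return {rid: sum(tile) / len(tile) for rid, tile in batch}


def serve(requests, cpu_workers=4, max_batch=8):
    """Preprocess on a worker pool, then run inference on ready inputs in batches."""
    with ThreadPoolExecutor(max_workers=cpu_workers) as pool:
        # Preprocessing runs on the pool instead of inline with inference.
        ready = list(pool.map(lambda req: preprocess(*req), requests))

    results = {}
    # The inference stage only receives preprocessed inputs, so it can
    # group several of them into a single forward pass.
    for start in range(0, len(ready), max_batch):
        results.update(forward_batch(ready[start:start + max_batch]))
    return results


if __name__ == "__main__":
    print(serve([(0, [0, 5, 10]), (1, [2, 2, 2])]))
```

In this sketch the pool size and the batch size are independent knobs, which mirrors the abstract's point that CPU-side preprocessing capacity can be scaled separately from the GPU stage.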