Leading the way in open Earth observation AI
Understanding our planet has never been more critical. From insurance and finance to logistics and agriculture, countless industries need reliable insights about how the Earth behaves. Foundation AI models are opening unprecedented possibilities for how we observe and measure our world, and IBM Research, together with NASA, the European Space Agency, and other partners, is at the forefront of this transformation.
Since we first started working on geospatial AI with NASA in 2023, we've built a comprehensive ecosystem of industry-leading models, tools, and benchmarks that help us understand our land, our seas, and beyond. These AI models have been downloaded more than 600,000 times by the community, demonstrating both their quality and utility.
Building models that push the state of the art
Earth observation (EO) differs fundamentally from other computer vision (CV) problems. For tasks such as reading credit card characters or detecting people in images, RGB (red, green, blue) data is sufficient; it cannot, however, meet the complex needs of agriculture, environmental monitoring, or disaster response. That's why we've focused on innovations that address the unique demands of EO data, advancing the state of the art with large-scale vision transformers, generative multimodal learning, and efficient neural compression algorithms.
All the models we have released use images for a given location and time as an abstraction of the physical world: spectral images from an optical satellite, radar images from a synthetic aperture radar (SAR)¹ instrument, images from a reanalysis weather dataset, or high-resolution images of the Sun at a particular wavelength. Using embeddings as representations, we apply different pre-training algorithms with optimized pretext tasks (such as masking and reconstruction, time-advancement forecasting, and correlation learning) to create a new abstraction of the physical world, which we call the foundation model. As we move forward, we will discover new abstractions and new representations to drive algorithmic innovations for our Earth and our space, continuing the journey we started three years ago with pioneering work.
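To make the masking-and-reconstruction pretext task concrete, here is a minimal sketch in PyTorch. The patch size, embedding width, and tiny transformer are illustrative assumptions; they do not reproduce the actual Prithvi or TerraMind pre-training code.

```python
import torch
import torch.nn as nn

class MaskedReconstructionPretext(nn.Module):
    """Toy masking-and-reconstruction pretext task for multispectral chips."""
    def __init__(self, in_channels=6, patch=16, dim=256, mask_ratio=0.75):
        super().__init__()
        self.mask_ratio = mask_ratio
        # Embed each (in_channels, patch, patch) spatial patch into one token.
        self.patch_embed = nn.Conv2d(in_channels, dim, kernel_size=patch, stride=patch)
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model=dim, nhead=8, batch_first=True),
            num_layers=4,
        )
        # Map each encoded token back to the raw pixels of its patch.
        self.decoder = nn.Linear(dim, in_channels * patch * patch)

    def forward(self, x):
        # x: (batch, bands, height, width), e.g. one timestamp of a scene.
        tokens = self.patch_embed(x).flatten(2).transpose(1, 2)  # (B, N, dim)
        B, N, D = tokens.shape
        # Randomly keep a fraction of tokens; the rest are treated as masked.
        n_keep = int(N * (1 - self.mask_ratio))
        idx = torch.rand(B, N, device=x.device).argsort(dim=1)[:, :n_keep]
        visible = torch.gather(tokens, 1, idx.unsqueeze(-1).expand(-1, -1, D))
        encoded = self.encoder(visible)
        # Simplified: a full MAE-style setup would also predict the masked patches.
        return self.decoder(encoded)

model = MaskedReconstructionPretext()
scene = torch.randn(2, 6, 224, 224)  # two chips with six spectral bands
recon = model(scene)                 # (2, n_keep, 6 * 16 * 16) pixel predictions
```

Training minimizes the reconstruction error against the original patch pixels, forcing the encoder to learn representations that capture the scene's structure without any labels.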
Prithvi-EO, co-developed with NASA in 2023, pioneered the first large-scale application of vision transformers (ViTs) for multi-temporal EO data. We've since released Prithvi-EO-2.0, which features deeper metadata understanding and stronger temporal capabilities for even greater performance. This work was complemented by Prithvi-WxC, the first foundation model for weather and climate data, spanning global and regional contexts and supporting both forecasting and zero-lead-time inference for tasks such as downscaling.
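As a rough illustration of what "multi-temporal" means at the input level, the sketch below tokenizes a stack of timestamps with a 3D patch embedding so a transformer can attend across both space and time. The band count, patch size, and embedding width are assumptions for illustration, not the actual Prithvi configuration.

```python
import torch
import torch.nn as nn

# A 3D convolution turns a (bands, timestamps, H, W) cube into a token sequence.
patch_embed = nn.Conv3d(in_channels=6, out_channels=768,
                        kernel_size=(1, 16, 16), stride=(1, 16, 16))

stack = torch.randn(2, 6, 4, 224, 224)      # batch, bands, 4 timestamps, 224x224 pixels
tokens = patch_embed(stack)                 # (2, 768, 4, 14, 14)
tokens = tokens.flatten(2).transpose(1, 2)  # (2, 4*14*14, 768): one token per space-time patch
# A standard transformer encoder can now attend jointly across space and time.
```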
TerraMind was released in 2025 in collaboration with the European Space Agency (ESA) and Jülich Supercomputing Centre (JSC). It introduced several firsts to the field, including multi-modal correlation learning and multi-modal generative capabilities that can be used directly without tokenization, making it the first EO foundation model to feature "Thinking in Modalities," a novel self-improvement fine-tuning algorithm that leverages self-generated data to enhance performance. Presented at ICCV, TerraMind has over 10,000 weekly downloads on Hugging Face and is the only EO model family that offers flexible processing across data sources (including sparse data) and supports zero-shot, few-shot, fine-tuning, and generative approaches. Today, TerraMind leads the PANGAEA and GEO-Bench-2 benchmarks for EO-specific tasks.
As these models have matured, we also turned our focus toward accessibility and efficiency, ensuring their capabilities can extend beyond large-scale infrastructure to the edge. We recently released “small” and “tiny” versions of TerraMind and Prithvi-EO-2.0 that maintain performance comparable to their larger predecessors while remaining lightweight enough to run on satellites, smartphones, and other edge devices — an essential step for space-based and resource-constrained usage.
Building on our work to make models smaller and more efficient, we’ve also turned to compressing the data itself. TerraCodec represents the first large-scale, end-to-end effort in neural compression for EO data, achieving up to 10 times more efficient compression than popular codecs like JPEG 2000 or HEVC at equal image quality. Trained on Sentinel-2 data, this family of learned codecs includes lightweight multispectral models ranging from 1M to 10M parameters and a temporal transformer that models seasonal dependencies. All models and code are released under a permissive open-source license.
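For readers unfamiliar with learned compression, the sketch below shows the general idea behind such codecs: an analysis transform maps pixels to a compact latent grid, the latents are quantized, and a synthesis transform reconstructs the image, with training balancing rate against distortion. The layer sizes, band count, and loss are illustrative assumptions, not TerraCodec's actual design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyNeuralCodec(nn.Module):
    """Toy learned codec: analysis transform, quantization, synthesis transform."""
    def __init__(self, bands=13, latent=64):
        super().__init__()
        self.encode = nn.Sequential(                      # pixels -> compact latents
            nn.Conv2d(bands, 128, 5, stride=2, padding=2), nn.GELU(),
            nn.Conv2d(128, latent, 5, stride=2, padding=2),
        )
        self.decode = nn.Sequential(                      # quantized latents -> pixels
            nn.ConvTranspose2d(latent, 128, 5, stride=2, padding=2, output_padding=1), nn.GELU(),
            nn.ConvTranspose2d(128, bands, 5, stride=2, padding=2, output_padding=1),
        )

    def forward(self, x):
        y = self.encode(x)
        if self.training:
            # Additive uniform noise approximates rounding so gradients can flow.
            y_hat = y + torch.empty_like(y).uniform_(-0.5, 0.5)
        else:
            y_hat = torch.round(y)  # hard quantization at inference time
        return self.decode(y_hat), y_hat

codec = TinyNeuralCodec()
tile = torch.randn(1, 13, 256, 256)      # a Sentinel-2-like multispectral tile
recon, latents = codec(tile)
distortion = F.mse_loss(recon, tile)     # a real codec adds a rate term from an entropy model
```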
Extending beyond Earth, we collaborated with NASA and eight other research centers to develop Surya, the first foundation model for solar physics. We created Surya to understand solar activity and its cascading impacts on life and infrastructure on Earth: solar outbursts can disrupt satellites, power grids, and communications systems. Alongside the model, we released SuryaBench, the leading benchmark dataset for machine learning in heliophysics and space weather prediction.
A commitment to openness and scientific rigor — beyond computer vision
Our commitment to openness goes further than that of many others in the field, fully aligning with the open science philosophies of NASA and ESA. While leading computer vision models like Meta's DINOv3 are trained on proprietary datasets, we train our models on fully open-access sources, including HLS and Sentinel data. To further reduce the burden of data preparation on the community, we released TerraMesh, the largest AI-ready, multi-modal open EO dataset, used to develop TerraMind and featured at CVPR.
Our code, models, and datasets are released under permissive licenses, such as Apache 2.0. By contrast, foundation models such as AlphaEarth Foundations in Google Earth AI remain bound to a proprietary environment: efficient within Google's ecosystem, but closed source, with limited flexibility for customization, real-time use (in some cases, only yearly embeddings are available), or in-depth validation on public benchmarks.
This commitment extends across everything we build. Our models are co-developed and validated alongside domain scientists from NASA, ESA, and the broader community. This science-led approach embeds deep domain expertise within our models, ensures rigorous evaluation, and creates feedback loops that address scientific requirements. A purely computer-vision driven approach often falls short in applications that demand a nuanced understanding of EO-specific data.
We saw this clearly with the release of GEO-Bench-2. General-purpose CV models such as Meta's DINOv3 perform strongly in categories that are closer to the vision domain (RGB/NIR and high-resolution imagery tasks) because of the nature of the data they've been trained on. In contrast, TerraMind and Prithvi dominate the EO-specific categories, particularly those requiring multispectral and multi-temporal data for domains like agriculture, environmental monitoring, and disaster response.
To illustrate this difference, we ran an ablation study fine-tuning Prithvi and TerraMind with only RGB data versus all available multispectral bands on two representative tasks: crop segmentation and burn scar detection. The RGB-only models showed performance drops of up to 25%, highlighting the importance of multispectral understanding in EO models.
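A sketch of how such an ablation can be set up is shown below: the same fine-tuning pipeline runs twice, once with all spectral bands and once with the inputs restricted to RGB. The band names and ordering assume a Sentinel-2-like layout and are purely illustrative.

```python
import torch

ALL_BANDS = ["B02", "B03", "B04", "B05", "B06", "B07", "B08", "B8A", "B11", "B12"]
RGB_BANDS = ["B04", "B03", "B02"]  # red, green, blue

def select_bands(batch: torch.Tensor, keep: list) -> torch.Tensor:
    """Subset the channel dimension of a (batch, bands, H, W) tensor."""
    idx = [ALL_BANDS.index(b) for b in keep]
    return batch[:, idx]

chips = torch.randn(4, len(ALL_BANDS), 224, 224)  # one multispectral batch
rgb_only = select_bands(chips, RGB_BANDS)         # (4, 3, 224, 224)
full_spec = chips                                 # (4, 10, 224, 224)
# Fine-tune one model per variant and compare downstream metrics,
# e.g. IoU on crop segmentation or burn scar detection.
```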
Building an ecosystem for impact
Models alone don't create impact — they need robust supporting infrastructure. We've built a comprehensive toolkit that makes geospatial AI accessible, reproducible, and production-ready, all released openly under the Apache 2.0 license.
TerraTorch is the first dedicated fine-tuning and deployment library for EO foundation models. It has become the most widely used tool in the community for customizing and experimenting with open-source geospatial foundation models — not just our own, but those developed across the field.
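To give a sense of the pattern such a library automates, here is a generic sketch of fine-tuning: freeze a pre-trained encoder, attach a lightweight task head, and train on labeled chips. The stand-in backbone, head, and shapes below are placeholders for illustration, not TerraTorch's actual classes or configuration; the TerraTorch documentation is the reference for the real API.

```python
import torch
import torch.nn as nn

# Stand-in "backbone": in practice this would be a pre-trained Prithvi or
# TerraMind encoder loaded through the fine-tuning library.
backbone = nn.Sequential(
    nn.Conv2d(6, 256, kernel_size=16, stride=16),  # patchify six spectral bands
    nn.GELU(),
    nn.Conv2d(256, 256, kernel_size=1),
)
for p in backbone.parameters():
    p.requires_grad = False  # keep the pre-trained weights frozen

# Lightweight task head: per-pixel two-class segmentation (e.g. crop / no crop).
head = nn.Sequential(
    nn.Conv2d(256, 2 * 16 * 16, kernel_size=1),
    nn.PixelShuffle(16),  # back to full resolution with two class channels
)

optimizer = torch.optim.AdamW(head.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

chips = torch.randn(4, 6, 224, 224)          # multispectral training chips
labels = torch.randint(0, 2, (4, 224, 224))  # per-pixel class labels
logits = head(backbone(chips))               # (4, 2, 224, 224)
loss = criterion(logits, labels)
loss.backward()
optimizer.step()
```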
We’ve also extended vLLM to support non-text inputs and outputs, enabling high-performance inference for geospatial AI models serving many concurrent users. Prithvi-EO-2.0 was the first model with non-text inputs and outputs ever onboarded to vLLM, and we're now expanding vLLM to support fully multi-modal workflows across all post-processing stages.
Community benchmarks continue to drive progress forward. We co-developed GEO-Bench-2, an AI Alliance initiative that goes beyond performance metrics to evaluate model capabilities across 19 different geospatial datasets. As noted earlier, TerraMind currently leads the GEO-Bench-2 leaderboard in three key capability areas. Together with partners, we also introduced NeuCo-bench, which we presented at CVPR; it's the first benchmark focusing on compressed and small embeddings for EO tasks.
Finally, beyond tooling, we have released all models on Hugging Face alongside high-quality, open datasets that allow developers to drive the field forward through benchmarking, reproducibility, and effective collaboration. We've also maintained strong community engagement through tutorials, workshops, and seminars at major events over the past few years.
Real-world applications across the globe
Our models and tools are already being applied to real-world challenges. Prithvi-EO was the first foundation model to be adapted to a specific geography, through continual pre-training focused on the UK and Ireland. Featured at ICLR, the model improved local use cases such as flood detection. Prithvi-EO was also used during Kenya's 2024 flooding disaster.
Prithvi also supports Kenya's country-wide reforestation effort. As President William Ruto's spokesperson, Hussein Mohamed, noted: "Through our partnership with IBM, we have the capability of harnessing the power of artificial intelligence and geospatial data to advance our climate ambitions."
The work that IBM carried out with the Heat and Health African Transdisciplinary Center (HE2AT Center) is another example: Prithvi was adapted to understand how heat islands form in South Africa, using land-surface temperature variations derived from satellite data.
This is just the beginning. Our teams are exploring how these foundation models can be applied to other domains that help us better understand the ecosystem in which we live. For example, we recently worked with Plymouth Marine Laboratory, the UK's STFC Hartree Centre, and the University of Exeter to take Prithvi's architecture and apply it to a new realm: our oceans. This led to the creation of the Granite-Geospatial-Ocean model, which can be used to monitor the health of marine ecosystems and the oceans' uptake of carbon.
We're also working to spur further discoveries based on our models. Working with ESA's Φ-Lab, we launched the TerraMind Blue-Sky Challenge, an ongoing competition to find novel geospatial applications of the model. So far, proposals have successfully adapted the model for flood prediction, ship detection, and monitoring ecosystem degradation.
Setting the standard for Earth observation AI
We're combining advanced capabilities with practical usability, scientific rigor with openness, and cutting-edge research with real-world validation. Through our collaboration with NASA, ESA, supercomputing centers like JSC, and the broader scientific community, we’ve developed models that deliver leading performance on community benchmarks and remain operational on edge devices.
By innovating across the stack — from multimodal and multi-temporal architectures to compression techniques, fine-tuning tools, and scalable serving frameworks — we're enabling new ways to monitor, measure, and care for our planet. These tools provide insights that support critical decision-making – from disaster response to wildlife protection, urban heat detection, crop monitoring, and countless other applications.
Everything we build is open because we believe real progress happens through collaboration, transparency, and shared knowledge. This approach is helping set a new standard for Earth observation AI, lowering barriers to entry, and empowering industries, scientists, and communities at a moment when understanding our planet is becoming more critical every day.
Notes
- Note 1: Synthetic Aperture Radar is a type of active remote sensing that uses radar to create high-resolution images of the Earth's surface.
 