SynRS3D: A Synthetic Dataset for Global 3D Semantic Understanding from Monocular Remote Sensing Imagery

1 The University of Tokyo, 2 RIKEN AIP

NeurIPS (Spotlight), 2024

Abstract

Global semantic 3D understanding from single-view high-resolution remote sensing (RS) imagery is crucial for Earth observation (EO). However, this task faces significant challenges due to the high costs of annotations and data collection, as well as geographically restricted data availability. To address these challenges, synthetic data offer a promising solution by being unrestricted and automatically annotatable, thus enabling the provision of large and diverse datasets. We develop a specialized synthetic data generation pipeline for EO and introduce SynRS3D, the largest synthetic RS dataset. SynRS3D comprises 69,667 high-resolution optical images that cover six different city styles worldwide and feature eight land cover types, precise height information, and building change masks. To further enhance its utility, we develop a novel multi-task unsupervised domain adaptation (UDA) method, RS3DAda, coupled with our synthetic dataset, which facilitates the RS-specific transition from synthetic to real scenarios for land cover mapping and height estimation tasks, ultimately enabling global monocular 3D semantic understanding based on synthetic data. Extensive experiments on various real-world datasets demonstrate the adaptability and effectiveness of our synthetic dataset and the proposed RS3DAda method.

Paper Overview

SynRS3D, the largest synthetic dataset for remote sensing, is paired with RS3DAda, a multi-task UDA training method. Models trained with this combination can be applied to a range of real-world remote sensing applications.

Global 3D Semantic Reconstruction

Using RS3DAda, we trained models on purely synthetic SynRS3D data to achieve global 3D semantic reconstruction based on real-world monocular remote sensing images.

SynRS3D Construction Process

We built a procedural modeling system with Blender and Python that simulates six urban styles from around the world and programmatically generates RGB images, land cover annotations, and normalized digital surface model (nDSM) data.
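For readers unfamiliar with nDSM, the sketch below shows the standard definition: per-pixel height above ground, obtained by subtracting a terrain model (DTM) from a surface model (DSM). This page does not describe how the SynRS3D pipeline derives its height maps, so the function name `compute_ndsm`, the nested-list raster layout, and the clipping of negative values to zero are all assumptions made for illustration.

```python
# Hypothetical helper, not part of the SynRS3D code base.
def compute_ndsm(dsm, dtm):
    """Return per-pixel height above ground as max(DSM - DTM, 0).

    dsm and dtm are 2D rasters stored as lists of rows (metres).
    Clipping negatives to zero is an assumption of this sketch.
    """
    if len(dsm) != len(dtm) or any(len(a) != len(b) for a, b in zip(dsm, dtm)):
        raise ValueError("DSM and DTM must have the same shape")
    return [
        [max(surface - terrain, 0.0) for surface, terrain in zip(dsm_row, dtm_row)]
        for dsm_row, dtm_row in zip(dsm, dtm)
    ]


# Toy 2x3 scene: a 7 m building, a 25 m building, and bare ground.
dsm = [[12.0, 12.0, 5.0], [30.5, 5.5, 5.0]]
dtm = [[5.0, 5.0, 5.0], [5.5, 5.5, 5.5]]
ndsm = compute_ndsm(dsm, dtm)
print(ndsm)
```

In a fully synthetic scene the ground level is known exactly, so a generator could also write object heights directly instead of differencing two rasters.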

Comparison, Examples, and Statistics

Comparison between SynRS3D and existing synthetic remote sensing datasets across various aspects, including diversity, image capture, assets, supported tasks, scale, and size.

Examples of SynRS3D data samples, including six urban styles (RGB, land cover, height map), along with statistical comparisons to other synthetic and real datasets.

RS3DAda Framework

RS3DAda is a multi-task UDA framework built on teacher-student training. It is the first UDA method tailored to synthetic-to-real adaptation in remote sensing.
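The page says only that RS3DAda uses teacher-student training. In many teacher-student UDA methods the teacher's weights are an exponential moving average (EMA) of the student's weights, and the teacher then produces pseudo-labels on unlabeled real images. Whether RS3DAda updates its teacher this way is an assumption here; the function `ema_update`, the `momentum` value, and the dictionary-of-floats weight format are hypothetical and chosen only to keep the sketch self-contained.

```python
# Minimal sketch of an EMA teacher update (assumed, not confirmed for RS3DAda).
def ema_update(teacher, student, momentum=0.99):
    """Return new teacher weights moved toward the student weights.

    new_teacher = momentum * teacher + (1 - momentum) * student
    Weights are plain floats keyed by parameter name in this toy example.
    """
    if teacher.keys() != student.keys():
        raise ValueError("teacher and student must share parameter names")
    return {
        name: momentum * teacher[name] + (1.0 - momentum) * student[name]
        for name in teacher
    }


teacher = {"w": 1.0, "b": 0.0}
student = {"w": 0.0, "b": 1.0}
teacher = ema_update(teacher, student, momentum=0.9)
print(teacher)
```

A high momentum keeps the teacher stable across training steps, which is the usual motivation for this kind of update when the teacher's predictions serve as training targets.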

Experimental Settings

We evaluate three experimental settings: (1) source-only training, (2) combining SynRS3D with real data, and (3) transferring SynRS3D to real-world scenarios, which verifies RS3DAda's effectiveness.

Benchmark Results for Different Scenarios

Scenario 1: Source-only Setting

Results for the source-only setting, showing the baseline performance obtained with the SynRS3D dataset.

Scenario 2: Combining SynRS3D with Real Data

Results for combining SynRS3D with real data, showing the improvements from training on mixed datasets.

Scenario 3: Transfer to Real-World Scenarios

Results for transferring SynRS3D to real-world scenarios, validating the effectiveness of RS3DAda for real-world applications.
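This page does not state which metrics the benchmark tables report. As a rough illustration of how the two tasks are commonly scored, the sketch below computes mean absolute error (MAE) for height estimation and mean intersection over union (mIoU) for land cover mapping. The choice of these two metrics, the function names, and the flat-list inputs are assumptions for this example, not a description of the paper's evaluation code.

```python
# Hypothetical evaluation helpers; metric choice is an assumption.
def mean_absolute_error(pred, target):
    """Average absolute height error over all pixels (same unit as input)."""
    if len(pred) != len(target) or not pred:
        raise ValueError("inputs must be non-empty and equally long")
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def mean_iou(pred, target, num_classes):
    """Mean IoU over classes that appear in the prediction or the target."""
    if len(pred) != len(target):
        raise ValueError("inputs must be equally long")
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes absent from both maps
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Flattened toy predictions for four pixels.
mae = mean_absolute_error([10.0, 0.0, 4.0, 6.0], [12.0, 0.0, 5.0, 5.0])
miou = mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
print(mae, miou)
```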

Qualitative Results

Qualitative results for height estimation (left) and land cover mapping (right).

Extended Earth Observation Applications of SynRS3D

Application 1: Building Change Detection

SynRS3D supports building change detection over time, which helps monitor urban development and environmental change.
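To make the idea of a building change mask concrete, the sketch below marks every pixel whose building label differs between two co-registered binary masks. How SynRS3D actually generates its change masks is not described on this page; the function `building_change_mask` and the XOR rule are assumptions for illustration only.

```python
# Hypothetical helper: change = building present in exactly one of the two dates.
def building_change_mask(before, after):
    """Return a binary mask that is 1 where the building label changed."""
    if len(before) != len(after) or any(
        len(a) != len(b) for a, b in zip(before, after)
    ):
        raise ValueError("masks must have the same shape")
    return [
        [int(bool(b) != bool(a)) for b, a in zip(row_before, row_after)]
        for row_before, row_after in zip(before, after)
    ]


before = [[1, 1, 0], [0, 0, 0]]  # buildings at time 1
after = [[1, 0, 0], [0, 1, 1]]   # buildings at time 2
change = building_change_mask(before, after)
print(change)
```

The result flags both demolished and newly constructed buildings; separating the two cases would need a signed or multi-class mask instead of a binary one.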

Application 2: Disaster Mapping

SynRS3D can support disaster mapping by providing timely and accurate data for assessing damage after natural disasters such as earthquakes and cyclones.

BibTeX

@article{song2024synrs3d,
  title={SynRS3D: A Synthetic Dataset for Global 3D Semantic Understanding from Monocular Remote Sensing Imagery},
  author={Song, Jian and Chen, Hongruixuan and Xuan, Weihao and Xia, Junshi and Yokoya, Naoto},
  journal={arXiv preprint arXiv:2406.18151},
  year={2024}
}

@inproceedings{song2024syntheworld,
  title={SyntheWorld: A Large-Scale Synthetic Dataset for Land Cover Mapping and Building Change Detection},
  author={Song, Jian and Chen, Hongruixuan and Yokoya, Naoto},
  booktitle={Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision},
  pages={8287--8296},
  year={2024}
}


© Jian Song | Last updated: Dec 8, 2024