darts_acquisition.s2.gee_scene
¶
Sentinel-2 related data loading. Intended for temporary use; may be moved to the acquisition package.
GEEStoreManager
¶
Bases: darts_acquisition.s2.raw_data_store.StoreManager[ee.Image]
Raw Data Store manager for GEE.
Initialize the store manager.
Parameters:
-
store(str | pathlib.Path | None) –Directory path for storing raw sentinel 2 data
-
bands_mapping(dict[str, str]) –A mapping from bands to obtain.
Source code in darts-acquisition/src/darts_acquisition/s2/gee_scene.py
store
instance-attribute
¶
store = (
    pathlib.Path(store)
    if isinstance(store, str)
    else store
)
complete
¶
download_and_store
¶
download_and_store(
item: str
| darts_acquisition.s2.raw_data_store.SceneItem,
)
Download a scene from the source and store it in the local store.
Store must be provided! Will do nothing if all required bands are already present.
Parameters:
-
item(str | darts_acquisition.s2.raw_data_store.SceneItem) –Item or scene-id to open.
Source code in darts-acquisition/src/darts_acquisition/s2/raw_data_store.py
download_scene_from_source
¶
Download a Sentinel-2 scene from GEE.
Parameters:
-
s2item(str | ee.Image) –The Sentinel-2 image ID or the corresponding ee.Image.
-
bands(list[str]) –List of bands to download.
Returns:
Source code in darts-acquisition/src/darts_acquisition/s2/gee_scene.py
encodings
¶
Source code in darts-acquisition/src/darts_acquisition/s2/gee_scene.py
exists
¶
Check if a scene already exists in the local raw data store.
Parameters:
-
identifier(str) –Unique identifier for the scene
Returns:
-
bool(bool) –True if the scene exists in the store, False otherwise
Source code in darts-acquisition/src/darts_acquisition/s2/raw_data_store.py
identifier
¶
load
¶
load(
item: str
| darts_acquisition.s2.raw_data_store.SceneItem,
force: bool = False,
) -> xarray.Dataset
Load a scene.
If force is True, the scene is downloaded from source even if it is present in the store.
Otherwise, the scene is opened from the store first and only missing bands are downloaded.
If a store is provided, the downloaded scene is always saved to it, potentially overwriting existing data.
Parameters:
-
item(str | darts_acquisition.s2.raw_data_store.SceneItem) –Item or scene-id to open.
-
force(bool, default:False) –If True, will download the scene even if present. Defaults to False.
Returns:
Source code in darts-acquisition/src/darts_acquisition/s2/raw_data_store.py
missing_bands
¶
Get the list of missing bands for a scene in the store.
Parameters:
-
identifier(str) –Unique identifier for the scene
Returns:
Source code in darts-acquisition/src/darts_acquisition/s2/raw_data_store.py
open
¶
open(
item: str
| darts_acquisition.s2.raw_data_store.SceneItem,
) -> xarray.Dataset
Open a scene from local store.
Store must be provided and the scene must be present in store!
Parameters:
-
item(str | darts_acquisition.s2.raw_data_store.SceneItem) –Item or scene-id to open
Returns:
Source code in darts-acquisition/src/darts_acquisition/s2/raw_data_store.py
save_to_store
¶
Save a scene dataset to the local raw data store.
Will append new bands to existing store if scene already exists. Will overwrite existing bands in an existing store if scene already exists.
Parameters:
Source code in darts-acquisition/src/darts_acquisition/s2/raw_data_store.py
StoreManager
¶
Bases: abc.ABC, typing.Generic[darts_acquisition.s2.raw_data_store.SceneItem]
Manager for storing raw sentinel 2 data.
This class is an abstract base class and should be extended to implement the respective downloading methods.
Usage:
1. "Normal" usage:
```python
# StoreManager is abstract; in practice use a concrete subclass
store_manager = StoreManager(bands, store_path)
ds_s2 = store_manager.load(identifier)
```
2. Force download:
```python
store_manager = StoreManager(bands, store_path)
ds_s2 = store_manager.load(identifier, force=True)
```
3. Download only (and only if missing) and store the scene:
```python
store_manager = StoreManager(bands, store_path)
store_manager.download_and_store(identifier)  # store_path must not be None
```
4. Offline mode:
```python
store_manager = StoreManager(bands, store_path)
store_manager.open(identifier)  # store_path must not be None, bands must be complete
```
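The caching behaviour behind these usage patterns can be illustrated with a minimal, self-contained sketch. It is not the library's implementation: `ToyStoreManager` and `FakeSource` are hypothetical names, a plain dict stands in for the on-disk store, and scenes are dicts of band name to value rather than `xarray.Dataset` objects.

```python
from abc import ABC, abstractmethod
from typing import Optional


class ToyStoreManager(ABC):
    """Toy stand-in for the store-manager pattern (assumption, not the real class)."""

    def __init__(self, bands: list, store: Optional[dict] = None):
        self.bands = bands
        self.store = store  # a dict plays the role of the local raw data store

    @abstractmethod
    def download_scene_from_source(self, item: str, bands: list) -> dict:
        """Fetch the requested bands from the remote source."""

    def missing_bands(self, identifier: str) -> list:
        # Without a store, or without the scene, every managed band is missing
        if self.store is None or identifier not in self.store:
            return list(self.bands)
        return [b for b in self.bands if b not in self.store[identifier]]

    def load(self, item: str, force: bool = False) -> dict:
        to_fetch = list(self.bands) if force else self.missing_bands(item)
        # Start from the cached bands unless a full re-download is forced
        scene = {} if force or self.store is None else dict(self.store.get(item, {}))
        if to_fetch:
            scene.update(self.download_scene_from_source(item, to_fetch))
            if self.store is not None:
                # Append new bands and overwrite bands that already exist
                self.store.setdefault(item, {}).update(scene)
        return {b: scene[b] for b in self.bands}


class FakeSource(ToyStoreManager):
    """Hypothetical subclass that records which bands it was asked to download."""

    def __init__(self, bands, store=None):
        super().__init__(bands, store)
        self.requests = []

    def download_scene_from_source(self, item, bands):
        self.requests.append(list(bands))
        return {band: f"{item}:{band}" for band in bands}
```

With a store that already holds some bands of a scene, `load` only requests the missing ones from the source; `force=True` requests all managed bands again.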
Initialize the store manager.
Parameters:
-
bands(list[str]) –List of bands to manage
-
store(str | pathlib.Path | None, default:None) –Directory path for storing raw sentinel 2 data
Source code in darts-acquisition/src/darts_acquisition/s2/raw_data_store.py
store
instance-attribute
¶
store = (
    pathlib.Path(store)
    if isinstance(store, str)
    else store
)
complete
¶
download_and_store
¶
download_and_store(
item: str
| darts_acquisition.s2.raw_data_store.SceneItem,
)
Download a scene from the source and store it in the local store.
Store must be provided! Will do nothing if all required bands are already present.
Parameters:
-
item(str | darts_acquisition.s2.raw_data_store.SceneItem) –Item or scene-id to open.
Source code in darts-acquisition/src/darts_acquisition/s2/raw_data_store.py
download_scene_from_source
abstractmethod
¶
download_scene_from_source(
item: str
| darts_acquisition.s2.raw_data_store.SceneItem,
bands: list[str],
) -> xarray.Dataset
encodings
abstractmethod
¶
exists
¶
Check if a scene already exists in the local raw data store.
Parameters:
-
identifier(str) –Unique identifier for the scene
Returns:
-
bool(bool) –True if the scene exists in the store, False otherwise
Source code in darts-acquisition/src/darts_acquisition/s2/raw_data_store.py
identifier
abstractmethod
¶
identifier(
item: str
| darts_acquisition.s2.raw_data_store.SceneItem,
) -> str
load
¶
load(
item: str
| darts_acquisition.s2.raw_data_store.SceneItem,
force: bool = False,
) -> xarray.Dataset
Load a scene.
If force is True, the scene is downloaded from source even if it is present in the store.
Otherwise, the scene is opened from the store first and only missing bands are downloaded.
If a store is provided, the downloaded scene is always saved to it, potentially overwriting existing data.
Parameters:
-
item(str | darts_acquisition.s2.raw_data_store.SceneItem) –Item or scene-id to open.
-
force(bool, default:False) –If True, will download the scene even if present. Defaults to False.
Returns:
Source code in darts-acquisition/src/darts_acquisition/s2/raw_data_store.py
missing_bands
¶
Get the list of missing bands for a scene in the store.
Parameters:
-
identifier(str) –Unique identifier for the scene
Returns:
Source code in darts-acquisition/src/darts_acquisition/s2/raw_data_store.py
open
¶
open(
item: str
| darts_acquisition.s2.raw_data_store.SceneItem,
) -> xarray.Dataset
Open a scene from local store.
Store must be provided and the scene must be present in store!
Parameters:
-
item(str | darts_acquisition.s2.raw_data_store.SceneItem) –Item or scene-id to open
Returns:
Source code in darts-acquisition/src/darts_acquisition/s2/raw_data_store.py
save_to_store
¶
Save a scene dataset to the local raw data store.
Will append new bands to existing store if scene already exists. Will overwrite existing bands in an existing store if scene already exists.
Parameters:
Source code in darts-acquisition/src/darts_acquisition/s2/raw_data_store.py
_get_band_mapping
¶
Source code in darts-acquisition/src/darts_acquisition/s2/gee_scene.py
convert_masks
¶
Convert the Sentinel-2 SCL mask into our own mask format in place.
https://sentiwiki.copernicus.eu/web/s2-processing#S2Processing-ClassificationMaskGeneration
- Invalid: S2 SCL in 0, 1
- Low quality: S2 SCL in 3, 8, 9, 11
- High quality: everything else (S2 SCL in 2, 4, 5, 6, 7, 10)
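The class grouping above can be sketched in plain Python. This is an illustration only: the function names are hypothetical, nested lists stand in for the array data, and the output codes (0 = invalid, 1 = low quality, 2 = high quality) are assumed to match the quality_data_mask codes documented for load_gee_s2_sr_scene.

```python
# SCL class groups as listed in the description above
INVALID_SCL = {0, 1}
LOW_QUALITY_SCL = {3, 8, 9, 11}


def scl_to_quality(scl_value: int) -> int:
    """Map one SCL class value to an assumed quality code (0, 1 or 2)."""
    if scl_value in INVALID_SCL:
        return 0  # invalid
    if scl_value in LOW_QUALITY_SCL:
        return 1  # low quality
    return 2  # high quality: everything else


def convert_scl_grid(scl: list) -> list:
    """Apply the mapping to a 2D grid given as a list of rows."""
    return [[scl_to_quality(value) for value in row] for row in scl]


quality = convert_scl_grid([[0, 1, 2], [3, 4, 11]])
```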
Parameters:
Returns:
Source code in darts-acquisition/src/darts_acquisition/s2/quality_mask.py
download_gee_s2_sr_scene
¶
download_gee_s2_sr_scene(
s2item: str | ee.Image,
store: pathlib.Path,
bands_mapping: dict | typing.Literal["all"] = {
"B2": "blue",
"B3": "green",
"B4": "red",
"B8": "nir",
},
)
Download a Sentinel-2 scene from Google Earth Engine and store it in the local data store.
This function downloads Sentinel-2 Level-2A surface reflectance data from Google Earth Engine (GEE) and stores it locally in a compressed zarr store for efficient repeated access.
Parameters:
-
s2item(str | ee.Image) –Sentinel-2 scene identifier (e.g., "20230615T123456_20230615T123659_T33UUP") or an ee.Image object from the COPERNICUS/S2_SR collection.
-
store(pathlib.Path) –Path to the local zarr store directory where the scene will be saved.
-
bands_mapping(dict | typing.Literal['all'], default:{'B2': 'blue', 'B3': 'green', 'B4': 'red', 'B8': 'nir'}) –Mapping of Sentinel-2 band names to custom band names. Keys should be GEE band names (e.g., "B2", "B3"), values are the desired output names. Use "all" to load all optical bands and SCL. Defaults to {"B2": "blue", "B3": "green", "B4": "red", "B8": "nir"}.
Note
- Requires Google Earth Engine authentication. Use ee.Initialize() before calling.
- All bands are downloaded at 10m resolution.
- Data is stored with zstd compression for efficient storage.
- The SCL (Scene Classification Layer) band is automatically included if not specified.
Example
Download Sentinel-2 scenes from GEE:
Source code in darts-acquisition/src/darts_acquisition/s2/gee_scene.py
get_aoi_from_gee_scene_ids
¶
Get the area of interest (AOI) as a GeoDataFrame from a list of Sentinel-2 scene IDs.
Parameters:
Returns:
-
geopandas.GeoDataFrame–gpd.GeoDataFrame: The AOI as a GeoDataFrame.
Raises:
-
ValueError–If no Sentinel-2 items are found for the given scene IDs.
Source code in darts-acquisition/src/darts_acquisition/s2/gee_scene.py
get_gee_s2_sr_scene_ids_from_geodataframe
¶
get_gee_s2_sr_scene_ids_from_geodataframe(
aoi: geopandas.GeoDataFrame | pathlib.Path | str,
start_date: str | None = None,
end_date: str | None = None,
max_cloud_cover: int | None = 10,
max_snow_cover: int | None = 10,
) -> set[str]
Search for Sentinel-2 scenes via Earth Engine based on an AOI given as a GeoDataFrame or a shapefile.
Parameters:
-
aoi(geopandas.GeoDataFrame | pathlib.Path | str) –AOI as a GeoDataFrame or path to a shapefile. If a path is provided, it will be read using geopandas.
-
start_date(str, default:None) –Starting date in a format readable by ee. If None, months and years parameters will be used for filtering if set. Defaults to None.
-
end_date(str, default:None) –Ending date in a format readable by ee. If None, months and years parameters will be used for filtering if set. Defaults to None.
-
max_cloud_cover(int, default:10) –Maximum percentage of cloud cover. Defaults to 10.
-
max_snow_cover(int, default:10) –Maximum percentage of snow cover. Defaults to 10.
Returns:
Source code in darts-acquisition/src/darts_acquisition/s2/gee_scene.py
get_gee_s2_sr_scene_ids_from_tile_ids
¶
get_gee_s2_sr_scene_ids_from_tile_ids(
tiles: list[str],
start_date: str | None = None,
end_date: str | None = None,
max_cloud_cover: int | None = 10,
max_snow_cover: int | None = 10,
) -> set[str]
Search for Sentinel-2 scenes via Earth Engine based on a list of tile IDs.
Parameters:
-
tiles(list[str]) –List of Sentinel-2 tile IDs.
-
start_date(str, default:None) –Starting date in a format readable by ee. If None, months and years parameters will be used for filtering if set. Defaults to None.
-
end_date(str, default:None) –Ending date in a format readable by ee. If None, months and years parameters will be used for filtering if set. Defaults to None.
-
max_cloud_cover(int, default:10) –Maximum percentage of cloud cover. Defaults to 10.
-
max_snow_cover(int, default:10) –Maximum percentage of snow cover. Defaults to 10.
Returns:
Source code in darts-acquisition/src/darts_acquisition/s2/gee_scene.py
load_gee_s2_sr_scene
¶
load_gee_s2_sr_scene(
s2item: str | ee.Image,
bands_mapping: dict | typing.Literal["all"] = {
"B2": "blue",
"B3": "green",
"B4": "red",
"B8": "nir",
},
store: pathlib.Path | None = None,
offline: bool = False,
output_dir_for_debug_geotiff: pathlib.Path
| None = None,
device: typing.Literal["cuda", "cpu"]
| int = darts_utils.cuda.DEFAULT_DEVICE,
) -> xarray.Dataset
Load a Sentinel-2 scene from Google Earth Engine, downloading if necessary.
This function loads Sentinel-2 Level-2A surface reflectance data from Google Earth Engine. If a local store is provided, the data is cached for efficient repeated access. The function handles quality masking, reflectance scaling with time-dependent offsets, and optional GPU acceleration. It also handles NaN values in the data by masking them as invalid.
The download logic is basically as follows:
IF flag:raw-data-store THEN
    IF exist_local THEN
        open -> memory
    ELIF online THEN
        download -> memory
        save
    ELIF offline THEN
        RAISE ERROR
    ENDIF
ELIF online THEN
    download -> memory
ELIF offline THEN
    RAISE ERROR
ENDIF
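The same decision tree can be written as a small Python function. This is a sketch for illustration: `plan_scene_access` is a hypothetical helper that only returns the names of the steps, and the use of `RuntimeError` is an assumption, since the concrete exception type is not stated here.

```python
def plan_scene_access(use_store: bool, exists_local: bool, offline: bool) -> list:
    """Return the ordered steps the pseudocode above would take."""
    if use_store and exists_local:
        # Scene is cached: open it from the local store
        return ["open"]
    if offline:
        # Nothing cached (or no store) and downloading is not allowed
        raise RuntimeError("Scene is not available locally and offline mode is set")
    steps = ["download"]
    if use_store:
        # Persist the freshly downloaded scene for later runs
        steps.append("save")
    return steps
```

For example, `plan_scene_access(True, False, False)` yields the download-then-save branch, while `plan_scene_access(False, False, True)` raises.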
Parameters:
-
s2item(str | ee.Image) –Sentinel-2 scene identifier or ee.Image object from COPERNICUS/S2_SR.
-
bands_mapping(dict | typing.Literal['all'], default:{'B2': 'blue', 'B3': 'green', 'B4': 'red', 'B8': 'nir'}) –Mapping of Sentinel-2 band names to custom band names. Keys should be GEE band names (e.g., "B2", "B3"), values are output names. Use "all" to load all optical bands and SCL. Defaults to {"B2": "blue", "B3": "green", "B4": "red", "B8": "nir"}.
-
store(pathlib.Path | None, default:None) –Path to local zarr store for caching. If None, data is loaded directly without caching. Defaults to None.
-
offline(bool, default:False) –If True, only loads from local store without downloading. Requires store to be provided. If False, missing data is downloaded. Defaults to False.
-
output_dir_for_debug_geotiff(pathlib.Path | None, default:None) –If provided, writes raw data as GeoTIFF files for debugging. Defaults to None.
-
device(typing.Literal['cuda', 'cpu'] | int, default:darts_utils.cuda.DEFAULT_DEVICE) –Device for processing (GPU or CPU). Defaults to DEFAULT_DEVICE.
Returns:
-
xarray.Dataset–xr.Dataset: Sentinel-2 dataset with the following data variables based on bands_mapping:
- Optical bands (float32): Surface reflectance values [~-0.1 to ~1.0 for newer scenes, ~0.0 to ~1.0 for scenes before 2022-01-25]
  - Default bands: blue, green, red, nir
  - Additional bands available: coastal, rededge071, rededge075, rededge078, nir08, nir09, swir16, swir22
  - Each has attributes:
    - long_name: "Sentinel 2 {Band}"
    - units: "Reflectance"
    - data_source: "Sentinel-2 L2A via Google Earth Engine (COPERNICUS/S2_SR)"
- s2_scl (uint8): Scene Classification Layer
  - Attributes: long_name, description of class values (0=NO_DATA, 1=SATURATED, etc.)
- quality_data_mask (uint8): Derived quality mask
  - 0 = Invalid (no data, saturated, defective, or NaN values)
  - 1 = Low quality (shadows, clouds, cirrus, snow/ice, water)
  - 2 = High quality (clear vegetation or non-vegetated land)
- valid_data_mask (uint8): Binary validity mask (1=valid, 0=invalid)
Dataset attributes:
- azimuth (float): Solar azimuth angle from MEAN_SOLAR_AZIMUTH_ANGLE
- elevation (float): Solar elevation angle from MEAN_SOLAR_ZENITH_ANGLE
- s2_tile_id (str): Full PRODUCT_ID from GEE
- tile_id (str): Scene identifier
- time (str): Acquisition timestamp
Note
The offline parameter controls data fetching:
- When offline=False: Automatically downloads missing data from GEE and stores it in the local zarr store (if store is provided).
- When offline=True: Only reads from the local store. Raises an error if data is missing or if store is None.
Reflectance processing:
- For scenes >= 2022-01-25: (DN / 10000.0) - 0.1 (processing baseline 04.00+)
- For scenes < 2022-01-25: DN / 10000.0 (older processing baseline)
- NaN values are filled with 0 and marked as invalid in quality_data_mask
- Pixels where SCL is NaN are also masked as invalid
This function handles spatially random NaN values that can occur in GEE data by marking them as invalid and filling with 0 to prevent propagation in calculations.
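The reflectance scaling and NaN handling described in this note can be sketched for a single pixel value. This is a hedged illustration, not the function's actual code: `dn_to_reflectance` and `OFFSET_START` are hypothetical names, and returning 0.0 directly for NaN input (rather than filling before the offset is applied) is an assumption.

```python
import math
from datetime import date

# Scenes acquired on or after this date get the -0.1 offset described above
OFFSET_START = date(2022, 1, 25)


def dn_to_reflectance(dn: float, acquired: date) -> tuple:
    """Return (reflectance, is_valid) for one digital number (DN)."""
    if math.isnan(dn):
        # NaN pixels are filled with 0 and flagged as invalid
        return 0.0, False
    reflectance = dn / 10000.0
    if acquired >= OFFSET_START:
        reflectance -= 0.1
    return reflectance, True


new_value, new_valid = dn_to_reflectance(2000.0, date(2023, 6, 15))
old_value, old_valid = dn_to_reflectance(2000.0, date(2021, 6, 15))
nan_value, nan_valid = dn_to_reflectance(float("nan"), date(2023, 6, 15))
```

The offset explains why newer scenes can contain slightly negative reflectance values, down to about -0.1.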
Quality mask derivation from SCL:
- Invalid (0): NO_DATA, SATURATED_OR_DEFECTIVE, or NaN values
- Low quality (1): CAST_SHADOWS, CLOUD_SHADOWS, CLOUD_*, THIN_CIRRUS, SNOW/ICE, WATER
- High quality (2): VEGETATION, NOT_VEGETATED
Example
Load scene with local caching:
import ee
from pathlib import Path
from darts_acquisition import load_gee_s2_sr_scene

# Initialize Earth Engine
ee.Initialize()

# Load with caching
s2_ds = load_gee_s2_sr_scene(
    s2item="20230615T123456_20230615T123659_T33UUP",
    bands_mapping="all",
    store=Path("/data/s2_store"),
    offline=False,  # Download if not cached
)

# Compute NDVI
ndvi = (s2_ds.nir - s2_ds.red) / (s2_ds.nir + s2_ds.red)

# Filter to high quality pixels
s2_filtered = s2_ds.where(s2_ds.quality_data_mask == 2)
Source code in darts-acquisition/src/darts_acquisition/s2/gee_scene.py
save_debug_geotiff
¶
save_debug_geotiff(
dataset: xarray.Dataset,
output_path: pathlib.Path,
optical_bands: list[str],
mask_bands: list[str] | None = None,
) -> None
Save the raw dataset as a GeoTIFF file for debugging purposes.
Parameters:
-
dataset(xarray.Dataset) –Dataset to save
-
output_path(pathlib.Path) –Path to the output GeoTIFF file
-
optical_bands(list[str]) –List of optical band names
-
mask_bands(list[str], default:None) –List of mask band names