darts_acquisition.landsat
¶
Landsat-related data loading. This module is meant for temporary use and may be moved to the acquisition package.
Period
module-attribute
¶
LandsatStoreManager
¶
LandsatStoreManager(
store: pathlib.Path | str | None,
bands_mapping: dict[str, str],
aws_profile_name: str,
)
Bases: darts_acquisition.s2.raw_data_store.StoreManager[pystac.Item]
Raw Data Store manager for Landsat.
Initialize the store manager.
Parameters:
-
store(str | pathlib.Path | None) –Directory path for storing raw Landsat data
-
bands_mapping(dict[str, str]) –Mapping of the bands to obtain, from source band names to output names.
-
aws_profile_name(str) –AWS profile name for authentication
Source code in darts-acquisition/src/darts_acquisition/landsat.py
aws_profile_name
instance-attribute
¶
aws_profile_name = aws_profile_name
store
instance-attribute
¶
store = pathlib.Path(store) if isinstance(store, str) else store
complete
¶
download_and_store
¶
download_and_store(
item: str
| darts_acquisition.s2.raw_data_store.SceneItem,
)
Download a scene from the source and store it in the local store.
Store must be provided! Will do nothing if all required bands are already present.
Parameters:
-
item(str | darts_acquisition.s2.raw_data_store.SceneItem) –Item or scene-id to open.
Source code in darts-acquisition/src/darts_acquisition/s2/raw_data_store.py
download_scene_from_source
¶
Download a Landsat mosaic tile (the "scene") from CDSE via STAC API.
Parameters:
-
landsat_item(str | pystac.Item) –The Landsat image ID or the corresponding STAC Item.
-
bands(list[str]) –List of bands to download.
Returns:
Source code in darts-acquisition/src/darts_acquisition/landsat.py
encodings
¶
Source code in darts-acquisition/src/darts_acquisition/landsat.py
exists
¶
Check if a scene already exists in the local raw data store.
Parameters:
-
identifier(str) –Unique identifier for the scene
Returns:
-
bool(bool) –True if the scene exists in the store, False otherwise
Source code in darts-acquisition/src/darts_acquisition/s2/raw_data_store.py
identifier
¶
load
¶
load(
item: str
| darts_acquisition.s2.raw_data_store.SceneItem,
force: bool = False,
) -> xarray.Dataset
Load a scene.
If force is True, the scene is downloaded from the source even if it is present in the store.
Otherwise, the scene is opened from the store first and only missing bands are downloaded.
If a store is provided, the downloaded scene is always saved to it, potentially overwriting existing data.
Parameters:
-
item(str | darts_acquisition.s2.raw_data_store.SceneItem) –Item or scene-id to open.
-
force(bool, default:False) –If True, will download the scene even if present. Defaults to False.
Returns:
Source code in darts-acquisition/src/darts_acquisition/s2/raw_data_store.py
missing_bands
¶
Get the list of missing bands for a scene in the store.
Parameters:
-
identifier(str) –Unique identifier for the scene
Returns:
Source code in darts-acquisition/src/darts_acquisition/s2/raw_data_store.py
open
¶
open(
item: str
| darts_acquisition.s2.raw_data_store.SceneItem,
) -> xarray.Dataset
Open a scene from local store.
Store must be provided and the scene must be present in store!
Parameters:
-
item(str | darts_acquisition.s2.raw_data_store.SceneItem) –Item or scene-id to open
Returns:
Source code in darts-acquisition/src/darts_acquisition/s2/raw_data_store.py
save_to_store
¶
Save a scene dataset to the local raw data store.
If the scene already exists in the store, new bands are appended and existing bands are overwritten.
Parameters:
Source code in darts-acquisition/src/darts_acquisition/s2/raw_data_store.py
StoreManager
¶
Bases: abc.ABC, typing.Generic[darts_acquisition.s2.raw_data_store.SceneItem]
Manager for storing raw Sentinel-2 data.
This class is an abstract base class and should be extended to implement the respective downloading methods.
Usage:
1. "Normal" usage:
```python
# StoreManager is abstract; instantiate a concrete subclass in practice
store_manager = StoreManager(bands, store=store_path)
ds_s2 = store_manager.load(identifier)
```
2. Force download:
```python
store_manager = StoreManager(bands, store=store_path)
ds_s2 = store_manager.load(identifier, force=True)
```
3. Download only (and only if missing) and store the scene:
```python
store_manager = StoreManager(bands, store=store_path)
store_manager.download_and_store(identifier)  # store_path must not be None
```
4. Offline mode:
```python
store_manager = StoreManager(bands, store=store_path)
store_manager.open(identifier)  # store_path must not be None, bands must be complete
```
Initialize the store manager.
Parameters:
-
bands(list[str]) –List of bands to manage
-
store(str | pathlib.Path | None, default:None) –Directory path for storing raw Sentinel-2 data
Source code in darts-acquisition/src/darts_acquisition/s2/raw_data_store.py
store
instance-attribute
¶
store = pathlib.Path(store) if isinstance(store, str) else store
complete
¶
download_and_store
¶
download_and_store(
item: str
| darts_acquisition.s2.raw_data_store.SceneItem,
)
Download a scene from the source and store it in the local store.
Store must be provided! Will do nothing if all required bands are already present.
Parameters:
-
item(str | darts_acquisition.s2.raw_data_store.SceneItem) –Item or scene-id to open.
Source code in darts-acquisition/src/darts_acquisition/s2/raw_data_store.py
download_scene_from_source
abstractmethod
¶
download_scene_from_source(
item: str
| darts_acquisition.s2.raw_data_store.SceneItem,
bands: list[str],
) -> xarray.Dataset
encodings
abstractmethod
¶
exists
¶
Check if a scene already exists in the local raw data store.
Parameters:
-
identifier(str) –Unique identifier for the scene
Returns:
-
bool(bool) –True if the scene exists in the store, False otherwise
Source code in darts-acquisition/src/darts_acquisition/s2/raw_data_store.py
identifier
abstractmethod
¶
identifier(
item: str
| darts_acquisition.s2.raw_data_store.SceneItem,
) -> str
load
¶
load(
item: str
| darts_acquisition.s2.raw_data_store.SceneItem,
force: bool = False,
) -> xarray.Dataset
Load a scene.
If force is True, the scene is downloaded from the source even if it is present in the store.
Otherwise, the scene is opened from the store first and only missing bands are downloaded.
If a store is provided, the downloaded scene is always saved to it, potentially overwriting existing data.
Parameters:
-
item(str | darts_acquisition.s2.raw_data_store.SceneItem) –Item or scene-id to open.
-
force(bool, default:False) –If True, will download the scene even if present. Defaults to False.
Returns:
Source code in darts-acquisition/src/darts_acquisition/s2/raw_data_store.py
missing_bands
¶
Get the list of missing bands for a scene in the store.
Parameters:
-
identifier(str) –Unique identifier for the scene
Returns:
Source code in darts-acquisition/src/darts_acquisition/s2/raw_data_store.py
open
¶
open(
item: str
| darts_acquisition.s2.raw_data_store.SceneItem,
) -> xarray.Dataset
Open a scene from local store.
Store must be provided and the scene must be present in store!
Parameters:
-
item(str | darts_acquisition.s2.raw_data_store.SceneItem) –Item or scene-id to open
Returns:
Source code in darts-acquisition/src/darts_acquisition/s2/raw_data_store.py
save_to_store
¶
Save a scene dataset to the local raw data store.
If the scene already exists in the store, new bands are appended and existing bands are overwritten.
Parameters:
Source code in darts-acquisition/src/darts_acquisition/s2/raw_data_store.py
_build_cql2_filter
¶
Source code in darts-acquisition/src/darts_acquisition/landsat.py
_flatten_dict
¶
_flatten_dict(
d: collections.abc.MutableMapping,
parent_key: str = "",
sep: str = ".",
) -> collections.abc.MutableMapping
Source code in darts-acquisition/src/darts_acquisition/landsat.py
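The source does not document what _flatten_dict does, but the signature (a mapping, a parent_key, and a separator) suggests a conventional nested-mapping flattener. The following is a sketch under that assumption, not the module's actual implementation; the sample dictionary is made up for illustration.

```python
from collections.abc import MutableMapping


def flatten_dict(d: MutableMapping, parent_key: str = "", sep: str = ".") -> dict:
    """Flatten nested mappings into one level, joining keys with `sep`."""
    items = {}
    for key, value in d.items():
        new_key = f"{parent_key}{sep}{key}" if parent_key else str(key)
        if isinstance(value, MutableMapping):
            # Recurse into nested mappings, carrying the joined key along
            items.update(flatten_dict(value, new_key, sep=sep))
        else:
            items[new_key] = value
    return items


# Hypothetical nested metadata, loosely shaped like STAC item properties
nested = {"properties": {"eo": {"cloud_cover": 3}, "platform": "landsat"}, "id": "x"}
flat = flatten_dict(nested)
```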
_get_band_mapping
¶
Source code in darts-acquisition/src/darts_acquisition/landsat.py
create_quality_mask_from_clear_sky_mask
¶
Create a quality mask from the clear_sky_mask band.
Quality mask derivation from clear_sky_mask:
- 0 = Invalid (clear_sky_mask == 0)
- 1 = Low quality (0 < clear_sky_mask < 1)
- 2 = High quality (clear_sky_mask == 1)
Parameters:
Returns:
Source code in darts-acquisition/src/darts_acquisition/landsat.py
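The three-class rule above can be written as a per-pixel function. This is a minimal sketch that assumes clear_sky_mask values already lie in the range 0 to 1; the helper name quality_from_clear_sky is hypothetical, and the real function presumably operates on whole arrays instead of single values.

```python
def quality_from_clear_sky(value: float) -> int:
    """Map one clear_sky_mask value (assumed within 0..1) to a quality class."""
    if value == 0:
        return 0  # invalid
    if value < 1:
        return 1  # low quality
    return 2  # high quality


clear_sky_row = [0.0, 0.25, 1.0, 0.9]
quality_row = [quality_from_clear_sky(v) for v in clear_sky_row]
```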
download_cdse_landsat_mosaic
¶
download_cdse_landsat_mosaic(
landsat_item: str | pystac.Item,
store: pathlib.Path,
bands_mapping: dict | typing.Literal["all"] = "all",
aws_profile_name: str = "default",
)
Download a Landsat mosaic from CDSE via STAC API and store it in the local data store.
This function downloads a Landsat mosaic from the Copernicus Data Space Ecosystem (CDSE) and stores it locally in a compressed zarr store for efficient repeated access.
Parameters:
-
landsat_item(str | pystac.Item) –Landsat mosaic identifier (e.g., "Landsat_mosaic_2024_11-12_83N068W_V1.0") or a PySTAC Item object from a STAC search.
-
store(pathlib.Path) –Path to the local zarr store directory where the mosaic will be saved.
-
bands_mapping(dict | typing.Literal['all'], default:'all') –Mapping of Landsat band names to custom band names. Keys should be CDSE band names (e.g., "B02", "B03", "B04", "B08"), values are the desired output names. Use "all" to load all optical bands and the clear_sky_mask band. Defaults to "all".
-
aws_profile_name(str, default:'default') –AWS profile name for authentication with the Copernicus S3 bucket. Defaults to "default".
Note
- Requires Copernicus Data Space authentication. Use darts_utils.copernicus.init_copernicus() to set up credentials before calling this function.
- All bands are resampled to 10m resolution during download.
- Data is stored with zstd compression for efficient storage.
- The "clear_sky_mask" band is automatically included if not specified.
Example
Download Landsat mosaic for a project:
```python
from pathlib import Path

from darts_acquisition import download_cdse_landsat_mosaic
from darts_utils.copernicus import init_copernicus

# Set up authentication
init_copernicus(profile_name="default")

# Download the mosaic with all bands
download_cdse_landsat_mosaic(
    landsat_item="Landsat_mosaic_2024_11-12_83N068W_V1.0",
    store=Path("/data/landsat_store"),
    bands_mapping="all",
    aws_profile_name="default",
)
```
Source code in darts-acquisition/src/darts_acquisition/landsat.py
get_aoi_from_cdse_landsat_mosaic_ids
¶
Get the area of interest (AOI) as a GeoDataFrame from a list of Landsat mosaic IDs.
Parameters:
Returns:
-
geopandas.GeoDataFrame–gpd.GeoDataFrame: The AOI as a GeoDataFrame.
Raises:
-
ValueError–If no Landsat items are found for the given scene IDs.
Source code in darts-acquisition/src/darts_acquisition/landsat.py
get_cdse_landsat_mosaic_ids_from_geodataframe
¶
get_cdse_landsat_mosaic_ids_from_geodataframe(
aoi: geopandas.GeoDataFrame | pathlib.Path | str,
periods: list[darts_acquisition.landsat.Period]
| None = None,
years: list[int] | None = None,
simplify_geometry: float
| typing.Literal[False] = False,
) -> dict[str, pystac.Item]
Search for Landsat mosaics via STAC based on an area of interest (aoi).
Parameters:
-
aoi(geopandas.GeoDataFrame | pathlib.Path | str) –AOI as a GeoDataFrame or path to a shapefile. If a path is provided, it will be read using geopandas.
-
periods(list[typing.Literal['01-02', '03-04', '05-06', '07-08', '09-10', '11-12']] | None, default:None) –List of bimonthly periods to filter the search. Defaults to None.
-
years(list[int] | None, default:None) –List of years to filter the search. Defaults to None.
-
simplify_geometry(float | typing.Literal[False], default:False) –If a float is provided, the geometry will be simplified using the simplify method of geopandas. If False, no simplification will be done. This can help with large or complex AOIs whose geometries are too large for the STAC API. Defaults to False.
Returns:
Source code in darts-acquisition/src/darts_acquisition/landsat.py
get_cdse_landsat_mosaic_ids_from_tile_ids
¶
get_cdse_landsat_mosaic_ids_from_tile_ids(
tile_ids: list[str],
periods: list[darts_acquisition.landsat.Period]
| None = None,
years: list[int] | None = None,
) -> dict[str, pystac.Item]
Search for Landsat scenes via STAC based on a list of tile IDs.
Parameters:
-
tile_ids(list[str]) –List of MGRS tile IDs to search for.
-
periods(list[typing.Literal['01-02', '03-04', '05-06', '07-08', '09-10', '11-12']] | None, default:None) –List of bimonthly periods to filter the search. Defaults to None.
-
years(list[int] | None, default:None) –List of years to filter the search. Defaults to None.
Returns:
Source code in darts-acquisition/src/darts_acquisition/landsat.py
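To make the periods and years filters concrete, the sketch below applies them to mosaic IDs locally. It rests on several assumptions: that IDs follow the layout of the documented example "Landsat_mosaic_2024_11-12_83N068W_V1.0", that None means "no filtering", and that both filters must match. The helper matches_filters and the second and third IDs are hypothetical; the real functions query the STAC API instead of filtering strings.

```python
def matches_filters(mosaic_id: str, periods=None, years=None) -> bool:
    """Check a mosaic ID against optional period and year filters."""
    # Assumed layout: Landsat_mosaic_<year>_<period>_<tile>_<version>
    _prefix, _kind, year, period, _tile, _version = mosaic_id.split("_")
    if years is not None and int(year) not in years:
        return False
    if periods is not None and period not in periods:
        return False
    return True


ids = [
    "Landsat_mosaic_2024_11-12_83N068W_V1.0",
    "Landsat_mosaic_2024_07-08_83N068W_V1.0",  # hypothetical ID
    "Landsat_mosaic_2023_07-08_83N068W_V1.0",  # hypothetical ID
]
summer_2024 = [i for i in ids if matches_filters(i, periods=["07-08"], years=[2024])]
```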
init_copernicus
¶
init_copernicus(profile_name: str = 'default')
Configure odc.stac and rio to authenticate with Copernicus cloud.
This function expects that credentials are present in the .aws/credentials file. Credentials can be obtained from https://eodata-s3keysmanager.dataspace.copernicus.eu/
Example credentials file:
Parameters:
-
profile_name(str, default:'default') –The boto3 profile name. This must match the name in the credentials file. Defaults to "default".
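Because the profile name has to match a section in the credentials file, a quick check before calling init_copernicus can save a confusing authentication error. The sketch below assumes the standard AWS shared-credentials layout (an INI file with one section per profile); the helper has_profile and the placeholder key values are hypothetical and not part of this package.

```python
import configparser

# Placeholder content in the standard AWS shared-credentials layout
EXAMPLE_CREDENTIALS = """
[default]
aws_access_key_id = <YOUR_ACCESS_KEY>
aws_secret_access_key = <YOUR_SECRET_KEY>
"""


def has_profile(credentials_text: str, profile_name: str = "default") -> bool:
    """Return True if the profile exists and defines both key fields."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(credentials_text)
    required = {"aws_access_key_id", "aws_secret_access_key"}
    return parser.has_section(profile_name) and required <= set(parser[profile_name])


profile_ok = has_profile(EXAMPLE_CREDENTIALS, "default")
```

For a real file, the text could be read with `pathlib.Path.home() / ".aws" / "credentials"` before passing it to the helper.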
References
- S3 access: https://documentation.dataspace.copernicus.eu/APIs/S3.html
Source code in darts-acquisition/src/darts_acquisition/utils/copernicus.py
load_cdse_landsat_mosaic
¶
load_cdse_landsat_mosaic(
landsat_item: str | pystac.Item,
bands_mapping: dict | typing.Literal["all"] = "all",
store: pathlib.Path | None = None,
aws_profile_name: str = "default",
offline: bool = False,
output_dir_for_debug_geotiff: pathlib.Path
| None = None,
device: typing.Literal["cuda", "cpu"]
| int = darts_utils.cuda.DEFAULT_DEVICE,
) -> xarray.Dataset
Load a Landsat mosaic from CDSE, downloading from STAC API if necessary.
This function loads Landsat mosaic data from the Copernicus Data Space Ecosystem (CDSE). If a local store is provided, the data is cached for efficient repeated access. The function handles quality masking, reflectance scaling, and optional GPU acceleration.
The download logic is basically as follows:
IF flag:raw-data-store THEN
    IF exist_local THEN
        open -> memory
    ELIF online THEN
        download -> memory
        save
    ELIF offline THEN
        RAISE ERROR
    ENDIF
ELIF online THEN
    download -> memory
ELIF offline THEN
    RAISE ERROR
ENDIF
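The same decision tree can be expressed as a small Python function that returns the steps to take. This is a sketch of the control flow only: the function name plan_load, the step labels, and the choice of RuntimeError are assumptions, and no data is actually opened or downloaded.

```python
def plan_load(store_given: bool, exists_locally: bool, offline: bool) -> list[str]:
    """Return the steps the pseudocode above would take for one scene."""
    if store_given and exists_locally:
        return ["open"]
    if offline:
        # Nothing usable locally and downloading is not allowed
        raise RuntimeError("Scene is not available locally and offline=True")
    # Online: always download, and save only when a store is configured
    return ["download", "save"] if store_given else ["download"]


cached = plan_load(store_given=True, exists_locally=True, offline=True)
fresh = plan_load(store_given=True, exists_locally=False, offline=False)
uncached = plan_load(store_given=False, exists_locally=False, offline=False)
```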
Parameters:
-
landsat_item(str | pystac.Item) –Landsat mosaic identifier or PySTAC Item object.
-
bands_mapping(dict | typing.Literal['all'], default:'all') –Mapping of Landsat band names to custom band names. Keys should be CDSE band names (e.g., "B02"), values are output names. Use "all" to load all optical bands and the clear_sky_mask band. Defaults to "all".
-
store(pathlib.Path | None, default:None) –Path to local zarr store for caching. If None, data is loaded directly without caching. Defaults to None.
-
aws_profile_name(str, default:'default') –AWS profile name for Copernicus S3 authentication. Defaults to "default".
-
offline(bool, default:False) –If True, only loads from local store without downloading. Requires store to be provided. If False, missing data is downloaded. Defaults to False.
-
output_dir_for_debug_geotiff(pathlib.Path | None, default:None) –If provided, writes raw data as GeoTIFF files for debugging. Defaults to None.
-
device(typing.Literal['cuda', 'cpu'] | int, default:darts_utils.cuda.DEFAULT_DEVICE) –Device for processing (GPU or CPU). Defaults to DEFAULT_DEVICE.
Returns:
-
xarray.Dataset–xr.Dataset: Landsat dataset with the following data variables based on bands_mapping:
- Optical bands (uint8): Surface reflectance values [0 to 1 after scaling]. Default bands: blue, green, red, nir. Each has attributes:
  - long_name: "Landsat {Band}"
  - units: "Reflectance"
  - data_source: "Landsat Global Mosaics via Copernicus STAC API (opengeohub-landsat-bimonthly-mosaic-v1.0.1)"
- clear_sky_mask (uint8): Layer with the relative number of clear sky observations per pixel (0-1). Attributes: long_name
- quality_data_mask (uint8): Derived quality mask
  - 0 = Invalid (clear_sky_mask == 0)
  - 1 = Low quality (0 < clear_sky_mask < 1)
  - 2 = High quality (clear_sky_mask == 1)
- valid_data_mask (uint8): Binary validity mask (1=valid, 0=invalid)
Dataset attributes:
- landsat_tile_id (str): Mosaic identifier
- tile_id (str): Mosaic identifier (same as landsat_tile_id)
- Plus additional STAC metadata fields
Note
The offline parameter controls data fetching:
- When offline=False: Automatically downloads missing data from CDSE and stores it
in the local zarr store (if store is provided).
- When offline=True: Only reads from the local store. Raises an error if data is
missing or if store is None.
Reflectance processing:
- Raw DN values are scaled: (DN / 250) (see https://peerj.com/articles/18585/)
- Pixels where clear_sky_mask == 0 are masked as NaN
- This matches the data format from GEE and Planet loaders
Quality mask derivation from clear_sky_mask:
- Invalid (0): clear_sky_mask == 0
- Low quality (1): 0 < clear_sky_mask < 1
- High quality (2): clear_sky_mask == 1
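The reflectance rule can be illustrated per pixel. This is a minimal sketch assuming scalar inputs; the helper name to_reflectance is hypothetical, and the loader presumably applies the same rule to whole arrays.

```python
import math


def to_reflectance(dn: int, clear_sky: float) -> float:
    """Scale a raw DN value to reflectance, masking pixels without clear sky."""
    if clear_sky == 0:
        return math.nan  # no clear-sky observation: masked
    return dn / 250


half = to_reflectance(125, clear_sky=1.0)
masked = to_reflectance(125, clear_sky=0.0)
```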
Example
Load mosaic with local caching:
```python
from pathlib import Path

from darts_acquisition import load_cdse_landsat_mosaic
from darts_utils.copernicus import init_copernicus

# Set up authentication
init_copernicus(profile_name="default")

# Load with caching
landsat_ds = load_cdse_landsat_mosaic(
    landsat_item="Landsat_mosaic_2024_11-12_83N068W_V1.0",
    bands_mapping="all",
    store=Path("/data/landsat_store"),
    offline=False,  # Download if not cached
)

# Compute NDVI
ndvi = (landsat_ds.nir - landsat_ds.red) / (landsat_ds.nir + landsat_ds.red)

# Filter to high quality pixels
landsat_filtered = landsat_ds.where(landsat_ds.quality_data_mask == 2)
```
Source code in darts-acquisition/src/darts_acquisition/landsat.py
match_cdse_landsat_mosaic_ids_from_geodataframe
¶
match_cdse_landsat_mosaic_ids_from_geodataframe(
aoi: geopandas.GeoDataFrame,
min_intersects: float = 0.7,
simplify_geometry: float
| typing.Literal[False] = False,
save_scores: pathlib.Path | None = None,
) -> dict[int, pystac.Item | None]
Match items from a GeoDataFrame with Landsat items from the STAC API based on a date range.
Parameters:
-
aoi(geopandas.GeoDataFrame) –The area of interest as a GeoDataFrame.
-
min_intersects(float, default:0.7) –The minimum intersection area ratio to consider a match. Defaults to 0.7.
-
simplify_geometry(float | typing.Literal[False], default:False) –If a float is provided, the geometry will be simplified using the simplify method of geopandas. If False, no simplification will be done. This can help with large or complex AOIs whose geometries are too large for the STAC API. Defaults to False.
-
save_scores(pathlib.Path | None, default:None) –If provided, the scores will be saved to this path as a Parquet file.
Returns:
-
dict[int, pystac.Item | None]–dict[int, Item | None]: A dictionary mapping each row to its best matching Landsat item. The keys are the indices of the rows in the GeoDataFrame, and the values are the matching Landsat items. If no matching item is found, the value will be None.
Raises:
-
ValueError–If the 'date' column is not present or not of type datetime.
Source code in darts-acquisition/src/darts_acquisition/landsat.py
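The role of min_intersects can be illustrated with axis-aligned bounding boxes. This sketch makes several assumptions: the ratio is taken as intersection area divided by AOI area, the date matching that the real function also performs is ignored, and boxes replace real geometries. The names intersection_ratio and best_match are hypothetical.

```python
Box = tuple[float, float, float, float]  # (minx, miny, maxx, maxy)


def intersection_ratio(aoi: Box, footprint: Box) -> float:
    """Share of the AOI box that is covered by the footprint box."""
    width = min(aoi[2], footprint[2]) - max(aoi[0], footprint[0])
    height = min(aoi[3], footprint[3]) - max(aoi[1], footprint[1])
    if width <= 0 or height <= 0:
        return 0.0  # boxes do not overlap
    aoi_area = (aoi[2] - aoi[0]) * (aoi[3] - aoi[1])
    return (width * height) / aoi_area


def best_match(aoi: Box, footprints: dict[str, Box], min_intersects: float = 0.7):
    """Return the ID with the highest ratio, or None if it is below the threshold."""
    scores = {key: intersection_ratio(aoi, fp) for key, fp in footprints.items()}
    best = max(scores, key=scores.get, default=None)
    if best is None or scores[best] < min_intersects:
        return None
    return best


aoi_box = (0.0, 0.0, 10.0, 10.0)
candidates = {"a": (0.0, 0.0, 10.0, 8.0), "b": (5.0, 5.0, 20.0, 20.0)}
match = best_match(aoi_box, candidates)
```

Returning None below the threshold mirrors the documented return type, where rows without a sufficiently overlapping item map to None.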
save_debug_geotiff
¶
save_debug_geotiff(
dataset: xarray.Dataset,
output_path: pathlib.Path,
optical_bands: list[str],
mask_bands: list[str] | None = None,
) -> None
Save the raw dataset as a GeoTIFF file for debugging purposes.
Parameters:
-
dataset(xarray.Dataset) –Dataset to save
-
output_path(pathlib.Path) –Path to the output GeoTIFF file
-
optical_bands(list[str]) –List of optical band names
-
mask_bands(list[str], default:None) –List of mask band names
Source code in darts-acquisition/src/darts_acquisition/s2/debug_export.py
search_cdse_landsat_mosaic
¶
search_cdse_landsat_mosaic(
intersects=None,
tiles: list[str] | None = None,
periods: list[darts_acquisition.landsat.Period]
| None = None,
years: list[int] | None = None,
) -> dict[str, pystac.Item]
Search for Landsat mosaics via STAC based on an area of interest (intersects) and date range.
Note
start_date and end_date will be concatenated with a / to form a date range.
Read more about the date format here: https://pystac-client.readthedocs.io/en/stable/api.html#pystac_client.Client.search
Parameters:
-
intersects(any, default:None) –The geometry object to search for Landsat tiles. Can be anything implementing the __geo_interface__ protocol, such as a GeoDataFrame or a shapely geometry. If None and tiles is also None, the search is performed globally. Ignored if tiles is set.
-
tiles(list[str] | None, default:None) –List of CDSE tile IDs to filter the search. If set, the intersects parameter is ignored. Defaults to None.
-
periods(list[typing.Literal['01-02', '03-04', '05-06', '07-08', '09-10', '11-12']] | None, default:None) –List of periods to filter the search. Defaults to None.
-
years(list[int] | None, default:None) –List of years to filter the search. Defaults to None.
Returns:
-
dict[str, pystac.Item]–dict[str, Item]: A dictionary with the found Landsat items as values and their IDs as keys.