darts.pipelines
Predefined pipelines for DARTS.
PlanetPipeline
dataclass
PlanetPipeline(
model_files: list[pathlib.Path] = None,
default_dirs: darts_utils.paths.DefaultPaths = (
lambda: darts_utils.paths.DefaultPaths()
)(),
output_data_dir: pathlib.Path | None = None,
arcticdem_dir: pathlib.Path | None = None,
tcvis_dir: pathlib.Path | None = None,
device: typing.Literal["cuda", "cpu", "auto"]
| int
| None = None,
ee_project: str | None = None,
ee_use_highvolume: bool = True,
tpi_outer_radius: int = 100,
tpi_inner_radius: int = 0,
patch_size: int = 1024,
overlap: int = 256,
batch_size: int = 8,
reflection: int = 0,
binarization_threshold: float = 0.5,
mask_erosion_size: int = 10,
edge_erosion_size: int | None = None,
min_object_size: int = 32,
quality_level: int
| typing.Literal[
"high_quality", "low_quality", "none"
] = 1,
export_bands: list[str] = (
lambda: [
"probabilities",
"binarized",
"polygonized",
"extent",
"thumbnail",
]
)(),
write_model_outputs: bool = False,
overwrite: bool = False,
offline: bool = False,
debug_data: bool = False,
orthotiles_dir: pathlib.Path | None = None,
scenes_dir: pathlib.Path | None = None,
image_ids: list = None,
)
Bases: darts.pipelines.sequential_v2._BasePipeline
Pipeline for processing PlanetScope data.
Processes PlanetScope imagery (both orthotiles and scenes) for RTS segmentation. Supports both offline and online processing modes.
Data Structure
Expects PlanetScope data organized as:
- Orthotiles: orthotiles_dir/tile_id/scene_id/
- Scenes: scenes_dir/scene_id/
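The two layouts above can be walked with plain pathlib. The following is only a sketch: the helper name find_planet_images is hypothetical, and using the innermost directory name as the image ID is an assumption, not confirmed DARTS behavior.

```python
from __future__ import annotations

from pathlib import Path


def find_planet_images(orthotiles_dir: Path, scenes_dir: Path) -> dict[str, Path]:
    """Map image IDs to their directories for both documented layouts.

    Hypothetical helper: the ID is assumed to be the innermost directory name.
    """
    found: dict[str, Path] = {}
    # Orthotiles are nested two levels deep: <tile_id>/<scene_id>/
    if orthotiles_dir.is_dir():
        for scene in sorted(orthotiles_dir.glob("*/*")):
            if scene.is_dir():
                found[scene.name] = scene
    # Scenes sit directly below the scenes directory: <scene_id>/
    if scenes_dir.is_dir():
        for scene in sorted(scenes_dir.glob("*")):
            if scene.is_dir():
                found[scene.name] = scene
    return found
```

Missing directories are skipped silently here, which may or may not match how the pipeline treats them.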
Parameters:
-
orthotiles_dir(pathlib.Path | None, default:None) –Directory containing PlanetScope orthotiles. If None, uses default path from DARTS paths. Defaults to None.
-
scenes_dir(pathlib.Path | None, default:None) –Directory containing PlanetScope scenes. If None, uses default path from DARTS paths. Defaults to None.
-
image_ids(list | None, default:None) –List of image/scene IDs to process. If None, processes all images found in orthotiles_dir and scenes_dir. Defaults to None.
-
model_files(pathlib.Path | list[pathlib.Path] | None, default:None) –Path(s) to model file(s) for segmentation. Single Path implies write_model_outputs=False. If None, searches default model directory for all .pt files. Defaults to None.
-
output_data_dir(pathlib.Path | None, default:None) –Output directory for results. If None, uses {default_out}/planet. Defaults to None.
-
arcticdem_dir(pathlib.Path | None, default:None) –Directory for ArcticDEM datacube. Will be created/downloaded if needed. If None, uses default path. Defaults to None.
-
tcvis_dir(pathlib.Path | None, default:None) –Directory for TCVis data. If None, uses default path. Defaults to None.
-
device(typing.Literal['cuda', 'cpu', 'auto'] | int | None, default:None) –Computation device. "cuda" uses GPU 0, int specifies GPU index, "auto" selects free GPU. Defaults to None.
-
ee_project(str | None, default:None) –Earth Engine project ID. May be omitted if defined in persistent credentials. Defaults to None.
-
ee_use_highvolume(bool, default:True) –Whether to use EE high-volume server. Defaults to True.
-
tpi_outer_radius(int, default:100) –Outer radius (m) for TPI calculation. Defaults to 100.
-
tpi_inner_radius(int, default:0) –Inner radius (m) for TPI calculation. Defaults to 0.
-
patch_size(int, default:1024) –Patch size for inference. Defaults to 1024.
-
overlap(int, default:256) –Overlap between patches. Defaults to 256.
-
batch_size(int, default:8) –Batch size for inference. Defaults to 8.
-
reflection(int, default:0) –Reflection padding for inference. Defaults to 0.
-
binarization_threshold(float, default:0.5) –Threshold for binarizing probabilities. Defaults to 0.5.
-
mask_erosion_size(int, default:10) –Disk size for mask erosion and inner edge cropping. Defaults to 10.
-
edge_erosion_size(int | None, default:None) –Size for outer edge cropping. If None, uses mask_erosion_size. Defaults to None.
-
min_object_size(int, default:32) –Minimum object size (pixels) to keep. Defaults to 32.
-
quality_level(int | typing.Literal['high_quality', 'low_quality', 'none'], default:1) –Quality filtering level. 0="none", 1="low_quality", 2="high_quality". Defaults to 1.
-
export_bands(list[str], default:(lambda: ['probabilities', 'binarized', 'polygonized', 'extent', 'thumbnail'])()) –Bands to export. Can include "probabilities", "binarized", "polygonized", "extent", "thumbnail", "optical", "dem", "tcvis", "metadata", or specific band names. Defaults to ["probabilities", "binarized", "polygonized", "extent", "thumbnail"].
-
write_model_outputs(bool, default:False) –Save individual model outputs (not just ensemble). Defaults to False.
-
overwrite(bool, default:False) –Overwrite existing output files. Defaults to False.
-
offline(bool, default:False) –Skip downloading missing data. Defaults to False.
-
debug_data(bool, default:False) –Write intermediate debugging data. Defaults to False.
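The quality_level parameter accepts either an integer or a string. A minimal sketch of that mapping, using only the correspondence stated in the parameter description (0="none", 1="low_quality", 2="high_quality"); the function name normalize_quality_level is hypothetical and is not part of DARTS:

```python
from __future__ import annotations

from typing import Literal

QualityName = Literal["high_quality", "low_quality", "none"]

# Mapping taken from the parameter description above.
_QUALITY_BY_INT: dict[int, QualityName] = {
    0: "none",
    1: "low_quality",
    2: "high_quality",
}


def normalize_quality_level(level: int | str) -> QualityName:
    """Return the string form of a quality level given as int or string."""
    if isinstance(level, int) and not isinstance(level, bool):
        if level not in _QUALITY_BY_INT:
            raise ValueError(f"unknown quality level: {level!r}")
        return _QUALITY_BY_INT[level]
    if level in _QUALITY_BY_INT.values():
        return level  # already one of the documented string values
    raise ValueError(f"unknown quality level: {level!r}")
```

How the pipeline itself validates unknown values is not documented here; raising ValueError is an assumption of this sketch.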
default_dirs
class-attribute
instance-attribute
default_dirs: darts_utils.paths.DefaultPaths = dataclasses.field(
default_factory=lambda: darts_utils.paths.DefaultPaths()
)
device
class-attribute
instance-attribute
export_bands
class-attribute
instance-attribute
export_bands: list[str] = dataclasses.field(
default_factory=lambda: [
"probabilities",
"binarized",
"polygonized",
"extent",
"thumbnail",
]
)
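The default_factory form matters because a list default must not be shared between instances. A self-contained illustration with a hypothetical stand-in class (ExportConfig is not a DARTS class):

```python
from dataclasses import dataclass, field


@dataclass
class ExportConfig:  # hypothetical stand-in for a pipeline dataclass
    export_bands: list[str] = field(
        default_factory=lambda: [
            "probabilities",
            "binarized",
            "polygonized",
            "extent",
            "thumbnail",
        ]
    )


a = ExportConfig()
b = ExportConfig()
# Each instance gets its own list, so this only changes a.
a.export_bands.append("optical")
```

Writing the list directly as a default value would make dataclasses raise a ValueError at class definition time, which is why the factory is needed.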
quality_level
class-attribute
instance-attribute
__post_init__
Source code in darts/src/darts/pipelines/sequential_v2.py
cli
staticmethod
cli(
*,
pipeline: darts.pipelines.sequential_v2.PlanetPipeline,
)
Run the sequential pipeline for PlanetScope data.
Parameters:
-
pipeline(darts.pipelines.sequential_v2.PlanetPipeline) –Configured PlanetPipeline instance.
cli_prepare_data
staticmethod
cli_prepare_data(
*,
pipeline: darts.pipelines.sequential_v2.PlanetPipeline,
aux: bool = False,
)
Download all necessary data for offline processing.
Parameters:
-
pipeline(darts.pipelines.sequential_v2.PlanetPipeline) –Configured PlanetPipeline instance.
-
aux(bool, default:False) –If True, downloads auxiliary data (ArcticDEM, TCVis). Defaults to False.
Source code in darts/src/darts/pipelines/sequential_v2.py
prepare_data
Download and prepare data for offline processing.
Validates configuration, determines data requirements from models, and downloads requested data (optical imagery and/or auxiliary data).
Parameters:
-
optical(bool, default:False) –If True, downloads optical imagery. Defaults to False.
-
aux(bool, default:False) –If True, downloads auxiliary data (ArcticDEM, TCVis) as needed. Defaults to False.
Raises:
-
KeyboardInterrupt–If user interrupts execution.
-
SystemExit–If the process is terminated.
-
SystemError–If a system error occurs.
Source code in darts/src/darts/pipelines/sequential_v2.py
run
Run the complete segmentation pipeline.
Executes the full pipeline including:
1. Configuration validation and dumping
2. Loading ensemble models
3. Creating/loading auxiliary datacubes
4. Processing each tile:
   - Loading optical data
   - Loading auxiliary data (ArcticDEM, TCVis) as needed
   - Preprocessing
   - Segmentation
   - Postprocessing
   - Exporting results
5. Saving results and timing information
Results are saved to the output directory with timestamped configuration, results parquet file, and timing information.
Raises:
-
KeyboardInterrupt–If user interrupts execution.
Source code in darts/src/darts/pipelines/sequential_v2.py
PlanetRayPipeline
dataclass
PlanetRayPipeline(
model_files: list[pathlib.Path] = None,
output_data_dir: pathlib.Path = pathlib.Path(
"data/output"
),
arcticdem_dir: pathlib.Path = pathlib.Path(
"data/download/arcticdem"
),
tcvis_dir: pathlib.Path = pathlib.Path(
"data/download/tcvis"
),
num_cpus: int = 1,
devices: list[int] | None = None,
ee_project: str | None = None,
ee_use_highvolume: bool = True,
tpi_outer_radius: int = 100,
tpi_inner_radius: int = 0,
patch_size: int = 1024,
overlap: int = 256,
batch_size: int = 8,
reflection: int = 0,
binarization_threshold: float = 0.5,
mask_erosion_size: int = 10,
min_object_size: int = 32,
quality_level: int
| typing.Literal[
"high_quality", "low_quality", "none"
] = 1,
export_bands: list[str] = (
lambda: [
"probabilities",
"binarized",
"polygonized",
"extent",
"thumbnail",
]
)(),
write_model_outputs: bool = False,
overwrite: bool = False,
orthotiles_dir: pathlib.Path = pathlib.Path(
"data/input/planet/PSOrthoTile"
),
scenes_dir: pathlib.Path = pathlib.Path(
"data/input/planet/PSScene"
),
image_ids: list = None,
)
Bases: darts.pipelines.ray_v2._BaseRayPipeline
Pipeline for PlanetScope data.
Parameters:
-
orthotiles_dir(pathlib.Path, default:pathlib.Path('data/input/planet/PSOrthoTile')) –The directory containing the PlanetScope orthotiles.
-
scenes_dir(pathlib.Path, default:pathlib.Path('data/input/planet/PSScene')) –The directory containing the PlanetScope scenes.
-
image_ids(list, default:None) –The list of image ids to process. If None, all images in the directory will be processed.
-
model_files(pathlib.Path | list[pathlib.Path], default:None) –The path to the models to use for segmentation. Can also be a single Path to use only one model, which implies write_model_outputs=False. If a list is provided, an ensemble of the models is used.
-
output_data_dir(pathlib.Path, default:pathlib.Path('data/output')) –The "output" directory. Defaults to Path("data/output").
-
arcticdem_dir(pathlib.Path, default:pathlib.Path('data/download/arcticdem')) –The directory containing the ArcticDEM data (the datacube and the extent files). Will be created and downloaded if it does not exist. Defaults to Path("data/download/arcticdem").
-
tcvis_dir(pathlib.Path, default:pathlib.Path('data/download/tcvis')) –The directory containing the TCVis data. Defaults to Path("data/download/tcvis").
-
device(typing.Literal['cuda', 'cpu'] | int) –The device to run the model on. If "cuda" take the first device (0), if int take the specified device. If "auto" try to automatically select a free GPU (<50% memory usage). Defaults to "cuda" if available, else "cpu".
-
ee_project(str, default:None) –The Earth Engine project ID or number to use. May be omitted if the project is defined within persistent API credentials obtained via earthengine authenticate.
-
ee_use_highvolume(bool, default:True) –Whether to use the high volume server (https://earthengine-highvolume.googleapis.com).
-
tpi_outer_radius(int, default:100) –The outer radius of the annulus kernel for the TPI calculation, in meters. Defaults to 100.
-
tpi_inner_radius(int, default:0) –The inner radius of the annulus kernel for the TPI calculation, in meters. Defaults to 0.
-
patch_size(int, default:1024) –The patch size to use for inference. Defaults to 1024.
-
overlap(int, default:256) –The overlap to use for inference. Defaults to 256.
-
batch_size(int, default:8) –The batch size to use for inference. Defaults to 8.
-
reflection(int, default:0) –The reflection padding to use for inference. Defaults to 0.
-
binarization_threshold(float, default:0.5) –The threshold to binarize the probabilities. Defaults to 0.5.
-
mask_erosion_size(int, default:10) –The size of the disk to use for mask erosion and the edge-cropping. Defaults to 10.
-
min_object_size(int, default:32) –The minimum object size to keep, in pixels. Defaults to 32.
-
quality_level(int | typing.Literal['high_quality', 'low_quality', 'none'], default:1) –The quality level to use for the segmentation. Can also be an int. In this case 0="none" 1="low_quality" 2="high_quality". Defaults to 1.
-
export_bands(list[str], default:(lambda: ['probabilities', 'binarized', 'polygonized', 'extent', 'thumbnail'])()) –The bands to export. Can be a list of "probabilities", "binarized", "polygonized", "extent", "thumbnail", "optical", "dem", "tcvis" or concrete band-names. Defaults to ["probabilities", "binarized", "polygonized", "extent", "thumbnail"].
-
write_model_outputs(bool, default:False) –Also save the model outputs, not only the ensemble result. Defaults to False.
-
overwrite(bool, default:False) –Whether to overwrite existing files. Defaults to False.
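The patch_size and overlap parameters together determine how an image is cut into patches for inference. The sketch below shows one common sliding-window scheme under stated assumptions: the stride is patch_size minus overlap, and the last patch is shifted back to end at the image border. The function name patch_starts is hypothetical, and the tiling DARTS actually performs (including any reflection padding) may differ.

```python
def patch_starts(length: int, patch_size: int = 1024, overlap: int = 256) -> list[int]:
    """Start offsets of patches along one image axis (illustrative scheme)."""
    if not 0 <= overlap < patch_size:
        raise ValueError("overlap must be in [0, patch_size)")
    if length <= patch_size:
        # A single (possibly padded) patch covers the whole axis.
        return [0]
    stride = patch_size - overlap
    starts = list(range(0, length - patch_size, stride))
    # Shift the last patch back so it ends exactly at the image border.
    starts.append(length - patch_size)
    return starts
```

With the documented defaults the stride is 768 pixels, so neighbouring patches share a 256-pixel band in which predictions can be blended.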
arcticdem_dir
class-attribute
instance-attribute
export_bands
class-attribute
instance-attribute
export_bands: list[str] = dataclasses.field(
default_factory=lambda: [
"probabilities",
"binarized",
"polygonized",
"extent",
"thumbnail",
]
)
orthotiles_dir
class-attribute
instance-attribute
output_data_dir
class-attribute
instance-attribute
quality_level
class-attribute
instance-attribute
scenes_dir
class-attribute
instance-attribute
tcvis_dir
class-attribute
instance-attribute
cli
staticmethod
cli(*, pipeline: darts.pipelines.ray_v2.PlanetRayPipeline)
run
Source code in darts/src/darts/pipelines/ray_v2.py
Sentinel2Pipeline
dataclass
Sentinel2Pipeline(
model_files: list[pathlib.Path] = None,
default_dirs: darts_utils.paths.DefaultPaths = (
lambda: darts_utils.paths.DefaultPaths()
)(),
output_data_dir: pathlib.Path | None = None,
arcticdem_dir: pathlib.Path | None = None,
tcvis_dir: pathlib.Path | None = None,
device: typing.Literal["cuda", "cpu", "auto"]
| int
| None = None,
ee_project: str | None = None,
ee_use_highvolume: bool = True,
tpi_outer_radius: int = 100,
tpi_inner_radius: int = 0,
patch_size: int = 1024,
overlap: int = 256,
batch_size: int = 8,
reflection: int = 0,
binarization_threshold: float = 0.5,
mask_erosion_size: int = 10,
edge_erosion_size: int | None = None,
min_object_size: int = 32,
quality_level: int
| typing.Literal[
"high_quality", "low_quality", "none"
] = 1,
export_bands: list[str] = (
lambda: [
"probabilities",
"binarized",
"polygonized",
"extent",
"thumbnail",
]
)(),
write_model_outputs: bool = False,
overwrite: bool = False,
offline: bool = False,
debug_data: bool = False,
scene_ids: list[str] | None = None,
scene_id_file: pathlib.Path | None = None,
tile_ids: list[str] | None = None,
aoi_file: pathlib.Path | None = None,
start_date: str | None = None,
end_date: str | None = None,
max_cloud_cover: int | None = 10,
max_snow_cover: int | None = 10,
months: list[int] | None = None,
years: list[int] | None = None,
prep_data_scene_id_file: pathlib.Path | None = None,
sentinel2_grid_dir: pathlib.Path | None = None,
raw_data_store: pathlib.Path | None = None,
no_raw_data_store: bool = False,
raw_data_source: typing.Literal["gee", "cdse"] = "cdse",
)
Bases: darts.pipelines.sequential_v2._BasePipeline
Pipeline for processing Sentinel-2 data.
Processes Sentinel-2 Surface Reflectance (SR) imagery from either CDSE or Google Earth Engine. Supports multiple scene selection methods and flexible filtering options.
Source Selection
The data source is specified via the raw_data_source parameter:
- "cdse": Copernicus Data Space Ecosystem (CDSE)
- "gee": Google Earth Engine (GEE)
Both sources require accounts and proper credential setup on the system.
Scene Selection
Scenes can be selected using one of four mutually exclusive methods (priority order):
1. scene_ids: Direct list of Sentinel-2 scene IDs
2. scene_id_file: JSON file containing scene IDs
3. tile_ids: List of Sentinel-2 tile IDs (e.g., "33UVP") with optional filters
4. aoi_file: Shapefile defining area of interest with optional filters
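The priority order can be read as "use the first method that is set". A minimal sketch of that reading, with a hypothetical helper class (SceneSelection is not part of DARTS, and whether the real pipeline raises when several methods are set at once is not stated here):

```python
from __future__ import annotations

from dataclasses import dataclass
from pathlib import Path


@dataclass
class SceneSelection:  # hypothetical helper mirroring the four parameters
    scene_ids: list[str] | None = None
    scene_id_file: Path | None = None
    tile_ids: list[str] | None = None
    aoi_file: Path | None = None

    def method(self) -> str:
        """Name of the first selection method that is set, in priority order."""
        for name in ("scene_ids", "scene_id_file", "tile_ids", "aoi_file"):
            if getattr(self, name) is not None:
                return name
        raise ValueError("no scene selection method configured")
```

For example, SceneSelection(tile_ids=["33UVP"]).method() returns "tile_ids", while also passing scene_ids would make "scene_ids" win.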
Offline Processing
Use cli_prepare_data to download data for offline use.
The prep_data_scene_id_file stores scene IDs from queries for offline reuse.
Parameters:
-
scene_ids(list[str] | None, default:None) –Direct list of Sentinel-2 scene IDs to process. Defaults to None.
-
scene_id_file(pathlib.Path | None, default:None) –JSON file containing scene IDs to process. Defaults to None.
-
tile_ids(list[str] | None, default:None) –List of Sentinel-2 tile IDs (requires filtering params). Defaults to None.
-
aoi_file(pathlib.Path | None, default:None) –Shapefile with area of interest (requires filtering params). Defaults to None.
-
start_date(str | None, default:None) –Start date for filtering (YYYY-MM-DD format). Defaults to None.
-
end_date(str | None, default:None) –End date for filtering (YYYY-MM-DD format). Defaults to None.
-
max_cloud_cover(int | None, default:10) –Maximum cloud cover percentage (0-100). Defaults to 10.
-
max_snow_cover(int | None, default:10) –Maximum snow cover percentage (0-100). Defaults to 10.
-
months(list[int] | None, default:None) –Filter by months (1-12). Defaults to None.
-
years(list[int] | None, default:None) –Filter by years. Defaults to None.
-
prep_data_scene_id_file(pathlib.Path | None, default:None) –File to store/load scene IDs for offline processing. Written during prepare_data, read during offline run. Defaults to None.
-
sentinel2_grid_dir(pathlib.Path | None, default:None) –Directory for Sentinel-2 grid shapefiles. Used only in prepare_data with tile_ids. If None, uses default path. Defaults to None.
-
raw_data_store(pathlib.Path | None, default:None) –Directory for storing raw Sentinel-2 data locally. If None, uses default path based on raw_data_source. Defaults to None.
-
no_raw_data_store(bool, default:False) –If True, processes data in-memory without local storage. Overrides raw_data_store. Defaults to False.
-
raw_data_source(typing.Literal['gee', 'cdse'], default:'cdse') –Data source to use. Defaults to "cdse".
-
model_files(pathlib.Path | list[pathlib.Path] | None, default:None) –Path(s) to model file(s) for segmentation. Single Path implies write_model_outputs=False. If None, searches default model directory for all .pt files. Defaults to None.
-
output_data_dir(pathlib.Path | None, default:None) –Output directory for results. If None, uses {default_out}/sentinel2-{raw_data_source}. Defaults to None.
-
arcticdem_dir(pathlib.Path | None, default:None) –Directory for ArcticDEM datacube. Will be created/downloaded if needed. If None, uses default path. Defaults to None.
-
tcvis_dir(pathlib.Path | None, default:None) –Directory for TCVis data. If None, uses default path. Defaults to None.
-
device(typing.Literal['cuda', 'cpu', 'auto'] | int | None, default:None) –Computation device. "cuda" uses GPU 0, int specifies GPU index, "auto" selects free GPU. Defaults to None.
-
ee_project(str | None, default:None) –Earth Engine project ID. May be omitted if defined in persistent credentials. Defaults to None.
-
ee_use_highvolume(bool, default:True) –Whether to use EE high-volume server. Defaults to True.
-
tpi_outer_radius(int, default:100) –Outer radius (m) for TPI calculation. Defaults to 100.
-
tpi_inner_radius(int, default:0) –Inner radius (m) for TPI calculation. Defaults to 0.
-
patch_size(int, default:1024) –Patch size for inference. Defaults to 1024.
-
overlap(int, default:256) –Overlap between patches. Defaults to 256.
-
batch_size(int, default:8) –Batch size for inference. Defaults to 8.
-
reflection(int, default:0) –Reflection padding for inference. Defaults to 0.
-
binarization_threshold(float, default:0.5) –Threshold for binarizing probabilities. Defaults to 0.5.
-
mask_erosion_size(int, default:10) –Disk size for mask erosion and inner edge cropping. Defaults to 10.
-
edge_erosion_size(int | None, default:None) –Size for outer edge cropping. If None, uses mask_erosion_size. Defaults to None.
-
min_object_size(int, default:32) –Minimum object size (pixels) to keep. Defaults to 32.
-
quality_level(int | typing.Literal['high_quality', 'low_quality', 'none'], default:1) –Quality filtering level. 0="none", 1="low_quality", 2="high_quality". Defaults to 1.
-
export_bands(list[str], default:(lambda: ['probabilities', 'binarized', 'polygonized', 'extent', 'thumbnail'])()) –Bands to export. Can include "probabilities", "binarized", "polygonized", "extent", "thumbnail", "optical", "dem", "tcvis", "metadata", or specific band names. Defaults to ["probabilities", "binarized", "polygonized", "extent", "thumbnail"].
-
write_model_outputs(bool, default:False) –Save individual model outputs (not just ensemble). Defaults to False.
-
overwrite(bool, default:False) –Overwrite existing output files. Defaults to False.
-
offline(bool, default:False) –Skip downloading missing data. Requires pre-downloaded data. Defaults to False.
-
debug_data(bool, default:False) –Write intermediate debugging data to output directory. Defaults to False.
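The filtering parameters (start_date, end_date, max_cloud_cover, max_snow_cover, months, years) can be pictured as a single predicate applied to each candidate scene. The sketch below is illustrative only: keep_scene and its scene arguments are hypothetical, and it assumes that date bounds are inclusive and that None disables a filter. The actual filtering is performed by the CDSE or GEE query and may behave differently.

```python
from __future__ import annotations

from datetime import date


def keep_scene(
    acquired: date,
    cloud_cover: float,
    snow_cover: float,
    *,
    start_date: str | None = None,
    end_date: str | None = None,
    max_cloud_cover: int | None = 10,
    max_snow_cover: int | None = 10,
    months: list[int] | None = None,
    years: list[int] | None = None,
) -> bool:
    """Return True if a scene passes every filter that is not None."""
    # Dates use the documented YYYY-MM-DD format; bounds are assumed inclusive.
    if start_date is not None and acquired < date.fromisoformat(start_date):
        return False
    if end_date is not None and acquired > date.fromisoformat(end_date):
        return False
    # Cover values are percentages in the range 0-100.
    if max_cloud_cover is not None and cloud_cover > max_cloud_cover:
        return False
    if max_snow_cover is not None and snow_cover > max_snow_cover:
        return False
    if months is not None and acquired.month not in months:
        return False
    if years is not None and acquired.year not in years:
        return False
    return True
```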
default_dirs
class-attribute
instance-attribute
default_dirs: darts_utils.paths.DefaultPaths = dataclasses.field(
default_factory=lambda: darts_utils.paths.DefaultPaths()
)
device
class-attribute
instance-attribute
export_bands
class-attribute
instance-attribute
export_bands: list[str] = dataclasses.field(
default_factory=lambda: [
"probabilities",
"binarized",
"polygonized",
"extent",
"thumbnail",
]
)
prep_data_scene_id_file
class-attribute
instance-attribute
quality_level
class-attribute
instance-attribute
raw_data_source
class-attribute
instance-attribute
sentinel2_grid_dir
class-attribute
instance-attribute
__post_init__
Source code in darts/src/darts/pipelines/sequential_v2.py
cli
staticmethod
cli(
*,
pipeline: darts.pipelines.sequential_v2.Sentinel2Pipeline,
)
Run the sequential pipeline for Sentinel-2 data.
Parameters:
-
pipeline(darts.pipelines.sequential_v2.Sentinel2Pipeline) –Configured Sentinel2Pipeline instance.
Source code in darts/src/darts/pipelines/sequential_v2.py
cli_prepare_data
staticmethod
cli_prepare_data(
*,
pipeline: darts.pipelines.sequential_v2.Sentinel2Pipeline,
optical: bool = False,
aux: bool = False,
)
Download all necessary data for offline processing.
Queries the data source (CDSE or GEE) for scene IDs and downloads optical and/or auxiliary data.
Stores scene IDs in prep_data_scene_id_file if specified for later offline use.
Parameters:
-
pipeline(darts.pipelines.sequential_v2.Sentinel2Pipeline) –Configured Sentinel2Pipeline instance.
-
optical(bool, default:False) –If True, downloads optical (Sentinel-2) imagery. Defaults to False.
-
aux(bool, default:False) –If True, downloads auxiliary data (ArcticDEM, TCVis). Defaults to False.
Source code in darts/src/darts/pipelines/sequential_v2.py
prepare_data
Download and prepare data for offline processing.
Validates configuration, determines data requirements from models, and downloads requested data (optical imagery and/or auxiliary data).
Parameters:
-
optical(bool, default:False) –If True, downloads optical imagery. Defaults to False.
-
aux(bool, default:False) –If True, downloads auxiliary data (ArcticDEM, TCVis) as needed. Defaults to False.
Raises:
-
KeyboardInterrupt–If user interrupts execution.
-
SystemExit–If the process is terminated.
-
SystemError–If a system error occurs.
Source code in darts/src/darts/pipelines/sequential_v2.py
run
Run the complete segmentation pipeline.
Executes the full pipeline including:
1. Configuration validation and dumping
2. Loading ensemble models
3. Creating/loading auxiliary datacubes
4. Processing each tile:
   - Loading optical data
   - Loading auxiliary data (ArcticDEM, TCVis) as needed
   - Preprocessing
   - Segmentation
   - Postprocessing
   - Exporting results
5. Saving results and timing information
Results are saved to the output directory with timestamped configuration, results parquet file, and timing information.
Raises:
-
KeyboardInterrupt–If user interrupts execution.
Source code in darts/src/darts/pipelines/sequential_v2.py
Sentinel2RayPipeline
dataclass
Sentinel2RayPipeline(
model_files: list[pathlib.Path] = None,
output_data_dir: pathlib.Path = pathlib.Path(
"data/output"
),
arcticdem_dir: pathlib.Path = pathlib.Path(
"data/download/arcticdem"
),
tcvis_dir: pathlib.Path = pathlib.Path(
"data/download/tcvis"
),
num_cpus: int = 1,
devices: list[int] | None = None,
ee_project: str | None = None,
ee_use_highvolume: bool = True,
tpi_outer_radius: int = 100,
tpi_inner_radius: int = 0,
patch_size: int = 1024,
overlap: int = 256,
batch_size: int = 8,
reflection: int = 0,
binarization_threshold: float = 0.5,
mask_erosion_size: int = 10,
min_object_size: int = 32,
quality_level: int
| typing.Literal[
"high_quality", "low_quality", "none"
] = 1,
export_bands: list[str] = (
lambda: [
"probabilities",
"binarized",
"polygonized",
"extent",
"thumbnail",
]
)(),
write_model_outputs: bool = False,
overwrite: bool = False,
aoi_shapefile: pathlib.Path = None,
start_date: str = None,
end_date: str = None,
max_cloud_cover: int = 10,
input_cache: pathlib.Path = pathlib.Path(
"data/cache/input"
),
)
Bases: darts.pipelines.ray_v2._BaseRayPipeline
Pipeline for Sentinel 2 data based on an area of interest.
Parameters:
-
aoi_shapefile(pathlib.Path, default:None) –The shapefile containing the area of interest.
-
start_date(str, default:None) –The start date of the time series in YYYY-MM-DD format.
-
end_date(str, default:None) –The end date of the time series in YYYY-MM-DD format.
-
max_cloud_cover(int, default:10) –The maximum cloud cover percentage to use for filtering the Sentinel 2 scenes. Defaults to 10.
-
input_cache(pathlib.Path, default:pathlib.Path('data/cache/input')) –The directory to use for caching the input data. Defaults to Path("data/cache/input").
-
model_files(pathlib.Path | list[pathlib.Path], default:None) –The path to the models to use for segmentation. Can also be a single Path to use only one model, which implies write_model_outputs=False. If a list is provided, an ensemble of the models is used.
-
output_data_dir(pathlib.Path, default:pathlib.Path('data/output')) –The "output" directory. Defaults to Path("data/output").
-
arcticdem_dir(pathlib.Path, default:pathlib.Path('data/download/arcticdem')) –The directory containing the ArcticDEM data (the datacube and the extent files). Will be created and downloaded if it does not exist. Defaults to Path("data/download/arcticdem").
-
tcvis_dir(pathlib.Path, default:pathlib.Path('data/download/tcvis')) –The directory containing the TCVis data. Defaults to Path("data/download/tcvis").
-
device(typing.Literal['cuda', 'cpu'] | int) –The device to run the model on. If "cuda" take the first device (0), if int take the specified device. If "auto" try to automatically select a free GPU (<50% memory usage). Defaults to "cuda" if available, else "cpu".
-
ee_project(str, default:None) –The Earth Engine project ID or number to use. May be omitted if the project is defined within persistent API credentials obtained via earthengine authenticate.
-
ee_use_highvolume(bool, default:True) –Whether to use the high volume server (https://earthengine-highvolume.googleapis.com).
-
tpi_outer_radius(int, default:100) –The outer radius of the annulus kernel for the TPI calculation, in meters. Defaults to 100.
-
tpi_inner_radius(int, default:0) –The inner radius of the annulus kernel for the TPI calculation, in meters. Defaults to 0.
-
patch_size(int, default:1024) –The patch size to use for inference. Defaults to 1024.
-
overlap(int, default:256) –The overlap to use for inference. Defaults to 256.
-
batch_size(int, default:8) –The batch size to use for inference. Defaults to 8.
-
reflection(int, default:0) –The reflection padding to use for inference. Defaults to 0.
-
binarization_threshold(float, default:0.5) –The threshold to binarize the probabilities. Defaults to 0.5.
-
mask_erosion_size(int, default:10) –The size of the disk to use for mask erosion and the edge-cropping. Defaults to 10.
-
min_object_size(int, default:32) –The minimum object size to keep, in pixels. Defaults to 32.
-
quality_level(int | typing.Literal['high_quality', 'low_quality', 'none'], default:1) –The quality level to use for the segmentation. Can also be an int. In this case 0="none" 1="low_quality" 2="high_quality". Defaults to 1.
-
export_bands(list[str], default:(lambda: ['probabilities', 'binarized', 'polygonized', 'extent', 'thumbnail'])()) –The bands to export. Can be a list of "probabilities", "binarized", "polygonized", "extent", "thumbnail", "optical", "dem", "tcvis" or concrete band-names. Defaults to ["probabilities", "binarized", "polygonized", "extent", "thumbnail"].
-
write_model_outputs(bool, default:False) –Also save the model outputs, not only the ensemble result. Defaults to False.
-
overwrite(bool, default:False) –Whether to overwrite existing files. Defaults to False.
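The tpi_outer_radius and tpi_inner_radius parameters describe an annulus kernel in meters. To apply such a kernel to a raster, the radii have to be converted to pixels. The sketch below builds a binary annulus mask under stated assumptions: the function name annulus_kernel and the resolution_m argument are hypothetical, both ring boundaries are treated as inclusive, and DARTS may construct its kernel differently.

```python
def annulus_kernel(
    outer_radius_m: float, inner_radius_m: float, resolution_m: float
) -> list[list[int]]:
    """Binary annulus mask with radii given in meters (illustrative sketch)."""
    # Convert metric radii to whole pixels for the given raster resolution.
    outer_px = round(outer_radius_m / resolution_m)
    inner_px = round(inner_radius_m / resolution_m)
    if outer_px <= inner_px:
        raise ValueError("outer radius must exceed inner radius in pixels")
    size = 2 * outer_px + 1  # odd size so the kernel has a center pixel
    kernel: list[list[int]] = []
    for y in range(size):
        row = []
        for x in range(size):
            dist_sq = (x - outer_px) ** 2 + (y - outer_px) ** 2
            # Keep pixels between the inner and outer circle (both inclusive).
            inside = inner_px**2 <= dist_sq <= outer_px**2
            row.append(1 if inside else 0)
        kernel.append(row)
    return kernel
```

With the documented defaults (outer 100, inner 0) and an assumed 10 m resolution, this gives a 21 by 21 filled disk; a non-zero inner radius cuts out the center.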
arcticdem_dir
class-attribute
instance-attribute
export_bands
class-attribute
instance-attribute
export_bands: list[str] = dataclasses.field(
default_factory=lambda: [
"probabilities",
"binarized",
"polygonized",
"extent",
"thumbnail",
]
)
input_cache
class-attribute
instance-attribute
output_data_dir
class-attribute
instance-attribute
quality_level
class-attribute
instance-attribute
tcvis_dir
class-attribute
instance-attribute
cli
staticmethod
cli(
*, pipeline: darts.pipelines.ray_v2.Sentinel2RayPipeline
)
run
Source code in darts/src/darts/pipelines/ray_v2.py