Enhance CSV metadata with error tracking for mosaics and tiles
@@ -153,13 +153,17 @@ Tile filenames encode position: `tile_r{row}_c{col}.jpg` where row increases wit
### Metadata files
**`scans.csv`** columns: `machine`, `machine_id`, `scan_id`, `name`, `scan_time`, `start_x`, `start_y`, `end_x`, `end_y`, `dx`, `dy`, `nx`, `ny`, `total_tiles`, `scan_lines`, `scan_mode`, `start_datetime`, `end_datetime`, `status`, `user`, `disk_space_mb`, `mosaic_url`, `mosaic_local_path`, `mosaic_on_disk`
**`scans.csv`** columns: `machine`, `machine_id`, `scan_id`, `name`, `scan_time`, `start_x`, `start_y`, `end_x`, `end_y`, `dx`, `dy`, `nx`, `ny`, `total_tiles`, `scan_lines`, `scan_mode`, `start_datetime`, `end_datetime`, `status`, `user`, `disk_space_mb`, `mosaic_url`, `mosaic_local_path`, `mosaic_on_disk`, `mosaic_download_status`, `mosaic_error`, `mosaic_error_code`, `mosaic_error_class`
- `mosaic_on_disk`: `True` if `mosaic.jpg` exists on disk at row-write time, regardless of which run downloaded it. Useful for inventory — reflects actual archive state rather than what happened in the current run.
- `mosaic_download_status`: one of `downloaded`, `failed`, `already_done`, `dry_run`, `skipped_metadata_only` (in `--metadata-only` mode). Failed attempts are still written so you can see missing server-side images in the same CSV.
- `mosaic_error` / `mosaic_error_code` / `mosaic_error_class`: set when the URL was tried and the file was not stored successfully. **`mosaic_error_class`** is a coarse hint: `permanent_missing` for HTTP 404/410, `transient` for 5xx or common network/timeout-style failures, and `unknown` for everything else (including a 200 response with an empty body). **Rows are append-only:** a failed download adds an audit record without overwriting the history of prior runs. Delete or rotate the CSVs if you need a new header (see `spruce.settings.SCANS_CSV_FIELDS` / `TILES_CSV_FIELDS`).
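As a rough illustration of the classification described above, the mapping could be sketched like this. The function name `classify_error` and its parameters are hypothetical and are not taken from the `spruce` code; which exceptions count as "network/timeout-style" is also an assumption.

```python
from typing import Optional


def classify_error(http_status: Optional[int],
                   exc: Optional[BaseException] = None) -> str:
    """Map a failed download to a coarse error class (hypothetical helper)."""
    # 404 Not Found and 410 Gone: the server says the image does not exist.
    if http_status in (404, 410):
        return "permanent_missing"
    # Any 5xx status is treated as a server-side problem worth retrying.
    if http_status is not None and 500 <= http_status <= 599:
        return "transient"
    # Assumption: timeouts and connection errors are the "common
    # network/timeout-style failures" mentioned in the text.
    if isinstance(exc, (TimeoutError, ConnectionError)):
        return "transient"
    # Everything else, including a 200 response with an empty body.
    return "unknown"
```

Under these assumptions, a 200 response with an empty body falls through to `unknown` because no status or exception rule matches it.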
**`tiles.csv`** columns: `machine`, `machine_id`, `scan_id`, `scan_time`, `row_index`, `col_index`, `x_mm`, `y_mm`, `url`, `local_path`, `downloaded_at`, `file_size_bytes`
**`tiles.csv`** columns: `machine`, `machine_id`, `scan_id`, `scan_time`, `row_index`, `col_index`, `x_mm`, `y_mm`, `url`, `local_path`, `status`, `error`, `error_code`, `error_class`, `downloaded_at`, `file_size_bytes`
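Because the CSVs are append-only and the header is fixed by the column list, a writer for `tiles.csv` might look roughly like the sketch below. The helper `append_tile_row` and the local `TILES_FIELDS` list are illustrative assumptions; the real column order lives in `spruce.settings.TILES_CSV_FIELDS`, which this sketch does not import.

```python
import csv
from pathlib import Path

# Column order copied from the tiles.csv list above (assumed to match
# spruce.settings.TILES_CSV_FIELDS).
TILES_FIELDS = [
    "machine", "machine_id", "scan_id", "scan_time", "row_index", "col_index",
    "x_mm", "y_mm", "url", "local_path", "status", "error", "error_code",
    "error_class", "downloaded_at", "file_size_bytes",
]


def append_tile_row(csv_path: Path, row: dict) -> None:
    """Append one row; write the header only when the file is new or empty."""
    is_new = not csv_path.exists() or csv_path.stat().st_size == 0
    with csv_path.open("a", newline="", encoding="utf-8") as fh:
        # restval="" leaves missing columns (e.g. downloaded_at on a
        # failed row) empty instead of raising.
        writer = csv.DictWriter(fh, fieldnames=TILES_FIELDS, restval="")
        if is_new:
            writer.writeheader()
        writer.writerow(row)
```

A writer of this shape never rewrites an existing header, which is one reason an older CSV has to be deleted or rotated before new columns can appear in it.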
- `downloaded_at`: ISO 8601 UTC timestamp of when the tile was fetched. Empty if the download failed.
- `status`: `downloaded`, `failed`, or `dry_run` (with `--dry-run`). Failed rows are kept for the same reason as for mosaics: missing server-side images stay visible in the same CSV.
- `error` / `error_code` / `error_class`: same rough semantics as the mosaic fields (`permanent_missing` / `transient` / `unknown`). `error_code` is the HTTP status when available.
- `downloaded_at`: ISO 8601 UTC timestamp when the tile was fetched. Empty on failure.
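One way to use these fields is to pick out tiles worth retrying. The sketch below is an assumption-laden example, not part of the tool: `retry_candidates` is a hypothetical helper, and it assumes that a tile is identified by `machine_id`, `scan_id`, `row_index`, and `col_index`, and that later rows in the append-only file are newer.

```python
from typing import Dict, Iterable, List, Tuple


def retry_candidates(rows: Iterable[Dict[str, str]]) -> List[Dict[str, str]]:
    """Return tiles whose most recent row is a transient failure."""
    latest: Dict[Tuple[str, str, str, str], Dict[str, str]] = {}
    for row in rows:
        key = (row["machine_id"], row["scan_id"],
               row["row_index"], row["col_index"])
        # The file is append-only, so a later row supersedes an earlier one.
        latest[key] = row
    return [
        r for r in latest.values()
        if r["status"] == "failed" and r["error_class"] == "transient"
    ]
```

The rows can come straight from `csv.DictReader` over `tiles.csv`. Tiles that failed once but were downloaded in a later run drop out, and `permanent_missing` failures are skipped because retrying a 404/410 is unlikely to help.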
---