Scraping resilience, metadata tooling, and repository hygiene

Consolidates mosaic and session hardening (login retry, skip processed scans, no retry on 404, started_at), progress reporting (Markdown tables, by-year rollup, rolling-window rate/ETA), and metadata workflow scripts (run_metadata_scan.sh, scan_progress_report.py, export_machine_metadata.py). Adds mosaic reconstruction sample JPEGs referenced by the report. Updates .gitignore for backup/ and .claude/; sample_random_scans helper is documented for branch testing/sample-runs only (see README).
This commit is contained in:
2026-05-14 19:52:53 -04:00
parent 752c278dff
commit 6390f5d529
23 changed files with 788 additions and 188 deletions
+1 -1
View File
@@ -1,6 +1,6 @@
# All RootView minirhizotron machine labels (same set as `machine_metadata` in config.example.yaml).
# Copy to the repo root as machines.txt, or: cp scripts/machines.example.txt machines.txt
# sample_random_scans.sh: by default one random scan per line = mosaic + tiles; use MOSAIC_ONLY=1 for mosaics only
# Random-sample helper `scripts/sample_random_scans.sh` lives on branch `testing/sample-runs` only.
BW1-4 [AMR-15]
BW1-6 [AMR-19]
BW1-7 [AMR-18]