sample_random_scans.sh run progress (checkpoint)
Snapshot from terminal session 9 (repo: /Users/igt/Documents/spruce_scraper), taken just before the machine was restarted. Date: 2026-04-26.
Active run (incomplete)
A full scan was in progress: mosaic plus all tiles, with the worker count taken from config.yaml. Scans were listed with --list-scans-first-page-only, which fetches a single page of up to 320 scan IDs; one ID is then chosen uniformly at random from that page.
| Item | Value |
|---|---|
| Script | ./scripts/sample_random_scans.sh |
| Machines file | machines.txt (12 machines) |
| Config | config.yaml |
| State files | archives/scans.csv, archives/tiles.csv, archives/.progress.json |
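
The first-page sampling described above can be sketched as follows. This is only an illustration of a uniform pick from one page of IDs, not the script's actual code: the function name `pick_scan_id` and the sample IDs are assumptions, and only the 320-ID page size comes from this note.

```python
import random
from typing import Optional, Sequence

FIRST_PAGE_LIMIT = 320  # page size reported in this note


def pick_scan_id(first_page_ids: Sequence[int],
                 rng: Optional[random.Random] = None) -> Optional[int]:
    """Return one scan ID chosen uniformly from the first page, or None if empty."""
    candidates = list(first_page_ids)[:FIRST_PAGE_LIMIT]
    if not candidates:
        return None  # nothing to pick; the caller can skip this machine
    rng = rng or random.Random()
    return rng.choice(candidates)


if __name__ == "__main__":
    # Hypothetical IDs, for illustration only.
    print(pick_scan_id([153772, 153771, 153770], random.Random(0)))
```

Passing a seeded `random.Random` makes a pick reproducible, which may help when comparing two runs.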
Where it stopped
The run was on step [9/12], machine BW3-17 [AMR-20], scan ID 153772.
- Mosaic: HTTP 404 for …/RootView_Database/153772/mosaic.jpg (same pattern as other scans: tiles still available).
- Tiles: 33784 total; the progress bar showed roughly 5% completed. The last log line observed was about 1736 / 33784 tiles (the exact count advances continuously; re-check archives/.progress.json or resume to see the current value).
Not yet started in this full-scan pass: steps [10/12]–[12/12]: BW3-19 [AMR-21], BW3-20 [AMR-26], BW3-21 [AMR-17] (lines 12–14 of machines.txt).
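
To re-check the tile figures after a restart, a small helper along these lines may be enough. It is a sketch that makes no assumption about the schema of archives/.progress.json and simply pretty-prints whatever is there; the function names `percent_done` and `dump_progress` are hypothetical.

```python
import json
from pathlib import Path


def percent_done(done: int, total: int) -> float:
    """Percentage of tiles completed, rounded to one decimal place."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100.0 * done / total, 1)


def dump_progress(path: str = "archives/.progress.json") -> str:
    """Return the progress file pretty-printed, without assuming its schema."""
    p = Path(path)
    if not p.exists():
        return f"{path} not found"
    return json.dumps(json.loads(p.read_text()), indent=2, sort_keys=True)


if __name__ == "__main__":
    # Figures from the last observed log line for scan 153772.
    print(percent_done(1736, 33784))
    print(dump_progress())
```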
Skipped machine in this pass
- [4/12] BW2-8 [AMR-25]: SKIPPED. scraper.py --list-scans --list-scans-first-page-only exited with code 1 (could not get scan list or pick an ID). The script continued with the next machine.
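
The skip-and-continue behaviour noted above follows a common pattern: run one command per machine, record a non-zero exit code as a skip, and move on. The sketch below illustrates that pattern only. `run_per_machine` and `build_cmd` are hypothetical names, and the stand-in command is not the real scraper invocation, since how the script passes the machine to scraper.py is not recorded in this note.

```python
import subprocess
import sys
from typing import Callable, List, Sequence, Tuple


def run_per_machine(machines: Sequence[str],
                    build_cmd: Callable[[str], List[str]]) -> Tuple[List[str], List[str]]:
    """Run one command per machine; record failures as skipped and keep going."""
    done, skipped = [], []
    for step, machine in enumerate(machines, start=1):
        result = subprocess.run(build_cmd(machine), capture_output=True, text=True)
        if result.returncode != 0:
            print(f"[{step}/{len(machines)}] {machine}: SKIPPED (exit {result.returncode})")
            skipped.append(machine)
            continue
        done.append(machine)
    return done, skipped


if __name__ == "__main__":
    # Stand-in command: exits 1 for BW2-8 only, to mimic the skip seen in this pass.
    def fake_cmd(machine: str) -> List[str]:
        code = 1 if machine == "BW2-8" else 0
        return [sys.executable, "-c", f"import sys; sys.exit({code})"]

    print(run_per_machine(["BW1-7", "BW2-8", "BW2-10"], fake_cmd))
```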
Completed machines in this full-scan pass (steps 1–3, 5–8)
| Step | Machine | Scan ID | Mosaic | Tiles downloaded |
|---|---|---|---|---|
| 1 | BW1-4 [AMR-15] | 71478 | 404 | 56 |
| 2 | BW1-6 [AMR-19] | 156875 | saved | 72 |
| 3 | BW1-7 [AMR-18] | 10837 | 404 | 1170 |
| 4 | BW2-8 [AMR-25] | — | — | skipped |
| 5 | BW2-10 [AMR-22] | 146368 | saved | 156 |
| 6 | BW2-11 [AMR-23] | 160022 | saved | 529 |
| 7 | BW2-13 [AMR-24] | 156957 | saved | 143 |
| 8 | BW3-16 [AMR-16] | 77300 | 404 | 400 |
After restart
- cd to the repo and activate the same venv as before.
- Re-run ./scripts/sample_random_scans.sh with the same mode (full scan, i.e. the default if that is what you used). The scraper resumes from archives/.progress.json and will continue BW3-17 scan 153772 (remaining tiles) before moving to later machines, unless you change options or data manually.
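
For readers unfamiliar with checkpoint-based resume, the idea can be sketched as below. This is a generic illustration, not the scraper's implementation: the `done_tiles` key, the tile names, and all three function names are made up, and the real layout of archives/.progress.json may differ.

```python
import json
import os
import tempfile
from typing import Iterable, List, Set


def load_done(path: str) -> Set[str]:
    """Load the set of finished tile keys; an absent file means a fresh start."""
    if not os.path.exists(path):
        return set()
    with open(path) as fh:
        return set(json.load(fh).get("done_tiles", []))  # "done_tiles" is a made-up key


def save_done(path: str, done: Iterable[str]) -> None:
    """Write the checkpoint atomically so a restart never sees a half-written file."""
    directory = os.path.dirname(path) or "."
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as fh:
        json.dump({"done_tiles": sorted(done)}, fh)
    os.replace(tmp, path)  # atomic rename on the same filesystem


def remaining(all_tiles: Iterable[str], done: Set[str]) -> List[str]:
    """Tiles still to fetch, in their original order."""
    return [t for t in all_tiles if t not in done]
```

Writing to a temporary file and renaming it is one way to keep the checkpoint readable even if the machine restarts mid-write.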
Other runs in the same log (for context)
- Earlier DRY_FLAG[@]: unbound variable errors from the script were fixed in later invocations.
- A mosaic-only pass over all 12 machines completed with banner: 12 machine(s) with mosaic step completed, 0 skipped (random scan per machine from the first page of IDs). That is a separate completed run from the in-progress full scan above.