diff --git a/docs/sample_random_scans_run_progress.md b/docs/sample_random_scans_run_progress.md
new file mode 100644
index 0000000..78bd1f4
--- /dev/null
+++ b/docs/sample_random_scans_run_progress.md
@@ -0,0 +1,50 @@
+# `sample_random_scans.sh` run progress (checkpoint)
+
+Snapshot from terminal session **9** (repo: `/Users/igt/Documents/spruce_scraper`), as of when the machine was about to be restarted. **Date:** 2026-04-26.
+
+## Active run (incomplete)
+
+A **full scan** was in progress: **mosaic + all tiles** (worker count from `config.yaml`), with scan listing using **`--list-scans-first-page-only`** (one page, up to 320 scan IDs, uniform random choice among that page).
+
+| Item | Value |
+|------|--------|
+| Script | `./scripts/sample_random_scans.sh` |
+| Machines file | `machines.txt` (12 machines) |
+| Config | `config.yaml` |
+| State files | `archives/scans.csv`, `archives/tiles.csv`, `archives/.progress.json` |
+
+### Where it stopped
+
+The run was on **step [9/12]**, machine **BW3-17 [AMR-20]**, **scan ID 153772**.
+
+- **Mosaic:** HTTP **404** for `…/RootView_Database/153772/mosaic.jpg` (same pattern as other scans: tiles still available).
+- **Tiles:** **33784** total; progress bar showed roughly **5%** completed — last log line observed was on the order of **~1736 / 33784** tiles (exact count advances continuously; re-check `archives/.progress.json` or resume to see current).
+
+**Not yet started** in this full-scan pass: steps **[10/12]–[12/12]**: **BW3-19 [AMR-21]**, **BW3-20 [AMR-26]**, **BW3-21 [AMR-17]** (lines 12–14 of `machines.txt`).
+
+### Skipped machine in this pass
+
+- **[4/12] BW2-8 [AMR-25]:** `SKIPPED` — `scraper.py --list-scans --list-scans-first-page-only` exited with **code 1** (could not get scan list or pick an ID). The script continued with the next machine.
+
+### Completed machines in this full-scan pass (steps 1–3, 5–8)
+
+| Step | Machine | Scan ID | Mosaic | Tiles downloaded |
+|------|---------|---------|--------|------------------|
+| 1 | BW1-4 [AMR-15] | 71478 | 404 | 56 |
+| 2 | BW1-6 [AMR-19] | 156875 | saved | 72 |
+| 3 | BW1-7 [AMR-18] | 10837 | 404 | 1170 |
+| 4 | BW2-8 [AMR-25] | — | — | skipped |
+| 5 | BW2-10 [AMR-22] | 146368 | saved | 156 |
+| 6 | BW2-11 [AMR-23] | 160022 | saved | 529 |
+| 7 | BW2-13 [AMR-24] | 156957 | saved | 143 |
+| 8 | BW3-16 [AMR-16] | 77300 | 404 | 400 |
+
+## After restart
+
+1. `cd` to the repo and activate the same venv as before.
+2. Re-run **`./scripts/sample_random_scans.sh`** with the **same mode** (full scan — default if that is what you used). The scraper **resumes** from `archives/.progress.json` and will continue **BW3-17** scan **153772** (remaining tiles) before moving to later machines, unless you change options or data manually.
+
+## Other runs in the same log (for context)
+
+- Earlier **`DRY_FLAG[@]: unbound variable`** errors from the script were fixed in later invocations.
+- A **mosaic-only** pass over all 12 machines completed with banner: *12 machine(s) with mosaic step completed, 0 skipped* (random scan per machine from the first page of IDs). That is a **separate** completed run from the **in-progress full scan** above.
diff --git a/investigate_manifest.csv b/investigate_manifest.csv new file mode 100644 index 0000000..3b4a7b5 --- /dev/null +++ b/investigate_manifest.csv @@ -0,0 +1,301 @@ +bucket,machine,scan_id,scan_dir +zero,BW3-19 [AMR-21],141127,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2023-10-12\141127 +zero,BW2-8 [AMR-25],22778,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-8__AMR-25\2019-11-10\22778 +zero,BW1-6 [AMR-19],93870,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-6__AMR-19\2021-11-23\93870 +zero,BW3-19 [AMR-21],140121,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2023-09-27\140121 +zero,BW3-19 [AMR-21],144191,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2023-11-21\144191 +zero,BW3-19 [AMR-21],144426,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2023-11-23\144426 +zero,BW3-19 [AMR-21],144659,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2023-11-26\144659 +zero,BW2-13 [AMR-24],120923,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-13__AMR-24\2022-12-10\120923 +zero,BW3-19 [AMR-21],140154,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2023-09-27\140154 +zero,BW2-8 [AMR-25],23645,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-8__AMR-25\2019-11-17\23645 +zero,BW3-19 [AMR-21],140792,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2023-10-07\140792 +zero,BW3-19 [AMR-21],140125,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2023-09-27\140125 +zero,BW3-19 [AMR-21],141927,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2023-10-22\141927 +zero,BW2-8 [AMR-25],118438,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-8__AMR-25\2022-10-30\118438 +zero,BW3-19 [AMR-21],141575,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2023-10-18\141575 +zero,BW3-19 
[AMR-21],142951,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2023-11-04\142951 +zero,BW1-6 [AMR-19],90874,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-6__AMR-19\2021-10-21\90874 +zero,BW1-6 [AMR-19],91489,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-6__AMR-19\2021-10-27\91489 +zero,BW2-8 [AMR-25],44836,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-8__AMR-25\unknown\44836 +zero,BW3-19 [AMR-21],144692,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2023-11-26\144692 +zero,BW3-19 [AMR-21],144584,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2023-11-25\144584 +zero,BW3-19 [AMR-21],142238,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2023-10-26\142238 +zero,BW3-19 [AMR-21],141485,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2023-10-17\141485 +zero,BW1-6 [AMR-19],92123,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-6__AMR-19\2021-11-02\92123 +zero,BW3-19 [AMR-21],141805,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2023-10-20\141805 +zero,BW3-19 [AMR-21],144856,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2023-11-29\144856 +zero,BW3-19 [AMR-21],140325,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2023-09-29\140325 +zero,BW3-19 [AMR-21],141026,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2023-10-11\141026 +zero,BW3-19 [AMR-21],140419,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2023-09-30\140419 +zero,BW3-19 [AMR-21],142969,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2023-11-04\142969 +zero,BW3-19 [AMR-21],144681,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2023-11-26\144681 +zero,BW3-19 [AMR-21],142677,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2023-11-01\142677 +zero,BW3-19 
[AMR-21],141584,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2023-10-18\141584 +zero,BW3-19 [AMR-21],144159,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2023-11-19\144159 +zero,BW3-19 [AMR-21],139494,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2023-09-19\139494 +zero,BW1-6 [AMR-19],99248,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-6__AMR-19\2022-02-03\99248 +zero,BW3-19 [AMR-21],139969,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2023-09-24\139969 +zero,BW3-19 [AMR-21],139511,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2023-09-19\139511 +zero,BW3-17 [AMR-20],153019,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-17__AMR-20\2024-03-25\153019 +zero,BW3-19 [AMR-21],140463,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2023-10-01\140463 +zero,BW3-19 [AMR-21],143587,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2023-11-12\143587 +zero,BW3-17 [AMR-20],153493,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-17__AMR-20\2024-04-01\153493 +zero,BW3-19 [AMR-21],144727,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2023-11-28\144727 +zero,BW3-19 [AMR-21],139946,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2023-09-24\139946 +zero,BW3-19 [AMR-21],143612,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2023-11-12\143612 +zero,BW2-8 [AMR-25],83393,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-8__AMR-25\2021-07-18\83393 +zero,BW3-19 [AMR-21],143288,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2023-11-09\143288 +zero,BW2-8 [AMR-25],23902,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-8__AMR-25\2019-11-19\23902 +zero,BW3-19 [AMR-21],143445,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2023-11-10\143445 +zero,BW3-19 
[AMR-21],140154,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2023-09-27\140154 +tiny,BW2-13 [AMR-24],26852,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-13__AMR-24\2019-12-15\26852 +tiny,BW2-13 [AMR-24],140181,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-13__AMR-24\2023-09-28\140181 +tiny,BW1-6 [AMR-19],114819,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-6__AMR-19\2022-09-16\114819 +tiny,BW3-21 [AMR-17],97824,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-21__AMR-17\2022-01-15\97824 +tiny,BW3-21 [AMR-17],52014,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-21__AMR-17\2020-08-27\52014 +tiny,BW2-8 [AMR-25],127445,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-8__AMR-25\2023-03-30\127445 +tiny,BW3-19 [AMR-21],48940,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2020-07-24\48940 +tiny,BW1-6 [AMR-19],87810,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-6__AMR-19\2021-09-19\87810 +tiny,BW3-21 [AMR-17],43092,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-21__AMR-17\2020-05-14\43092 +tiny,BW2-13 [AMR-24],113334,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-13__AMR-24\2022-08-18\113334 +tiny,BW3-19 [AMR-21],59127,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2020-11-12\59127 +tiny,BW3-21 [AMR-17],25737,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-21__AMR-17\2019-12-05\25737 +tiny,BW2-10 [AMR-22],61950,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-10__AMR-22\2020-12-10\61950 +tiny,BW1-6 [AMR-19],93265,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-6__AMR-19\2021-11-13\93265 +tiny,BW1-6 [AMR-19],113849,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-6__AMR-19\2022-09-02\113849 +tiny,BW2-11 [AMR-23],124373,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-11__AMR-23\2023-02-21\124373 +tiny,BW2-13 
[AMR-24],120371,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-13__AMR-24\2022-11-29\120371 +tiny,BW1-6 [AMR-19],87277,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-6__AMR-19\2021-09-14\87277 +tiny,BW2-11 [AMR-23],122855,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-11__AMR-23\2023-02-03\122855 +tiny,BW1-6 [AMR-19],69086,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-6__AMR-19\2021-02-21\69086 +tiny,BW3-19 [AMR-21],47993,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2020-07-15\47993 +tiny,BW2-13 [AMR-24],125103,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-13__AMR-24\2023-03-02\125103 +tiny,BW3-21 [AMR-17],103344,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-21__AMR-17\2022-03-25\103344 +tiny,BW3-19 [AMR-21],57723,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2020-10-23\57723 +tiny,BW2-8 [AMR-25],79195,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-8__AMR-25\2021-06-06\79195 +tiny,BW3-19 [AMR-21],54692,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2020-09-19\54692 +tiny,BW3-16 [AMR-16],30599,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-16__AMR-16\2020-01-19\30599 +tiny,BW2-11 [AMR-23],130942,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-11__AMR-23\2023-05-19\130942 +tiny,BW2-13 [AMR-24],138601,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-13__AMR-24\2023-09-07\138601 +tiny,BW1-6 [AMR-19],92258,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-6__AMR-19\2021-11-03\92258 +tiny,BW2-8 [AMR-25],23181,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-8__AMR-25\2019-11-13\23181 +tiny,BW3-21 [AMR-17],53547,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-21__AMR-17\2020-09-09\53547 +tiny,BW2-13 [AMR-24],155307,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-13__AMR-24\2024-04-28\155307 +tiny,BW2-8 
[AMR-25],72356,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-8__AMR-25\2021-03-27\72356 +tiny,BW3-21 [AMR-17],95618,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-21__AMR-17\2021-12-16\95618 +tiny,BW3-19 [AMR-21],48393,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2020-07-18\48393 +tiny,BW2-13 [AMR-24],130075,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-13__AMR-24\2023-05-04\130075 +tiny,BW3-21 [AMR-17],39758,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-21__AMR-17\2020-04-14\39758 +tiny,BW2-11 [AMR-23],126894,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-11__AMR-23\2023-03-23\126894 +tiny,BW2-13 [AMR-24],82264,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-13__AMR-24\2021-07-07\82264 +tiny,BW1-6 [AMR-19],99228,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-6__AMR-19\2022-02-03\99228 +tiny,BW2-11 [AMR-23],124000,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-11__AMR-23\2023-02-17\124000 +tiny,BW1-4 [AMR-15],46063,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-4__AMR-15\2020-06-18\46063 +tiny,BW2-13 [AMR-24],93211,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-13__AMR-24\2021-11-13\93211 +tiny,BW3-20 [AMR-26],87312,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-20__AMR-26\2021-09-14\87312 +tiny,BW2-13 [AMR-24],131348,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-13__AMR-24\2023-05-25\131348 +tiny,BW1-6 [AMR-19],94711,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-6__AMR-19\2021-12-03\94711 +tiny,BW2-11 [AMR-23],129519,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-11__AMR-23\2023-04-23\129519 +tiny,BW3-21 [AMR-17],32767,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-21__AMR-17\2020-02-08\32767 +tiny,BW2-13 [AMR-24],93571,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-13__AMR-24\2021-11-19\93571 +small,BW2-11 
[AMR-23],158199,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-11__AMR-23\2024-07-21\158199 +small,BW3-19 [AMR-21],96770,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2021-12-31\96770 +small,BW2-13 [AMR-24],47488,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-13__AMR-24\2020-07-09\47488 +small,BW3-19 [AMR-21],152767,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2024-03-21\152767 +small,BW2-10 [AMR-22],129800,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-10__AMR-22\2023-04-27\129800 +small,BW2-11 [AMR-23],114702,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-11__AMR-23\2022-09-15\114702 +small,BW2-10 [AMR-22],135212,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-10__AMR-22\2023-07-21\135212 +small,BW2-11 [AMR-23],136572,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-11__AMR-23\2023-08-09\136572 +small,BW3-20 [AMR-26],145231,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-20__AMR-26\2023-12-03\145231 +small,BW3-19 [AMR-21],32001,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2020-02-01\32001 +small,BW2-11 [AMR-23],116124,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-11__AMR-23\2022-10-01\116124 +small,BW3-20 [AMR-26],120928,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-20__AMR-26\2022-12-10\120928 +small,BW3-16 [AMR-16],56581,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-16__AMR-16\2020-10-08\56581 +small,BW3-20 [AMR-26],123441,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-20__AMR-26\2023-02-10\123441 +small,BW2-13 [AMR-24],41468,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-13__AMR-24\2020-04-29\41468 +small,BW2-11 [AMR-23],19698,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-11__AMR-23\2019-10-13\19698 +small,BW2-11 [AMR-23],154592,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-11__AMR-23\2024-04-18\154592 +small,BW2-10 
[AMR-22],137156,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-10__AMR-22\2023-08-16\137156 +small,BW3-19 [AMR-21],85449,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2021-08-20\85449 +small,BW3-19 [AMR-21],102824,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2022-03-19\102824 +small,BW1-6 [AMR-19],54986,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-6__AMR-19\2020-09-22\54986 +small,BW1-6 [AMR-19],135364,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-6__AMR-19\2023-07-23\135364 +small,BW1-6 [AMR-19],28609,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-6__AMR-19\2020-01-01\28609 +small,BW2-10 [AMR-22],115991,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-10__AMR-22\2022-09-29\115991 +small,BW3-20 [AMR-26],28596,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-20__AMR-26\2020-01-01\28596 +small,BW2-10 [AMR-22],106310,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-10__AMR-22\2022-04-28\106310 +small,BW3-16 [AMR-16],65871,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-16__AMR-16\2021-01-19\65871 +small,BW3-20 [AMR-26],103751,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-20__AMR-26\2022-03-29\103751 +small,BW1-6 [AMR-19],118031,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-6__AMR-19\2022-10-26\118031 +small,BW2-13 [AMR-24],112247,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-13__AMR-24\2022-07-20\112247 +small,BW2-13 [AMR-24],118274,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-13__AMR-24\2022-10-28\118274 +small,BW3-20 [AMR-26],104298,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-20__AMR-26\2022-04-03\104298 +small,BW3-19 [AMR-21],130200,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2023-05-09\130200 +small,BW3-19 [AMR-21],59385,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2020-11-15\59385 +small,BW2-11 
[AMR-23],132767,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-11__AMR-23\2023-06-14\132767 +small,BW3-20 [AMR-26],152753,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-20__AMR-26\2024-03-21\152753 +small,BW1-4 [AMR-15],31573,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-4__AMR-15\2020-01-28\31573 +small,BW1-6 [AMR-19],21993,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-6__AMR-19\2019-11-03\21993 +small,BW3-19 [AMR-21],34801,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2020-02-27\34801 +small,BW2-11 [AMR-23],108563,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-11__AMR-23\2022-05-22\108563 +small,BW3-21 [AMR-17],15863,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-21__AMR-17\2019-09-08\15863 +small,BW2-11 [AMR-23],38719,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-11__AMR-23\2020-04-03\38719 +small,BW1-6 [AMR-19],26196,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-6__AMR-19\2019-12-10\26196 +small,BW2-11 [AMR-23],90722,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-11__AMR-23\2021-10-19\90722 +small,BW3-16 [AMR-16],47187,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-16__AMR-16\2020-07-04\47187 +small,BW2-10 [AMR-22],110531,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-10__AMR-22\2022-06-16\110531 +small,BW3-16 [AMR-16],11297,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-16__AMR-16\2015-11-01\11297 +small,BW1-4 [AMR-15],43503,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-4__AMR-15\2020-05-17\43503 +small,BW2-11 [AMR-23],115701,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-11__AMR-23\2022-09-25\115701 +small,BW3-19 [AMR-21],95504,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2021-12-14\95504 +medium,BW2-10 [AMR-22],86562,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-10__AMR-22\2021-09-04\86562 +medium,BW3-20 
[AMR-26],38929,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-20__AMR-26\2020-04-05\38929 +medium,BW2-10 [AMR-22],125087,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-10__AMR-22\2023-03-02\125087 +medium,BW2-13 [AMR-24],119980,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-13__AMR-24\2022-11-20\119980 +medium,BW2-10 [AMR-22],74116,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-10__AMR-22\2021-04-14\74116 +medium,BW2-10 [AMR-22],101557,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-10__AMR-22\2022-03-05\101557 +medium,BW3-20 [AMR-26],148093,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-20__AMR-26\2024-01-13\148093 +medium,BW2-10 [AMR-22],97238,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-10__AMR-22\2022-01-07\97238 +medium,BW3-20 [AMR-26],23171,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-20__AMR-26\2019-11-13\23171 +medium,BW3-20 [AMR-26],28007,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-20__AMR-26\2019-12-26\28007 +medium,BW1-4 [AMR-15],52288,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-4__AMR-15\2020-08-29\52288 +medium,BW3-16 [AMR-16],66638,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-16__AMR-16\2021-01-28\66638 +medium,BW3-20 [AMR-26],54374,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-20__AMR-26\2020-09-16\54374 +medium,BW2-8 [AMR-25],158079,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-8__AMR-25\2024-07-16\158079 +medium,BW2-10 [AMR-22],122216,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-10__AMR-22\2023-01-18\122216 +medium,BW3-20 [AMR-26],151922,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-20__AMR-26\2024-03-08\151922 +medium,BW2-13 [AMR-24],47678,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-13__AMR-24\2020-07-11\47678 +medium,BW2-10 [AMR-22],32062,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-10__AMR-22\2020-02-01\32062 +medium,BW2-8 
[AMR-25],60826,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-8__AMR-25\2020-11-29\60826 +medium,BW2-10 [AMR-22],31095,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-10__AMR-22\2020-01-24\31095 +medium,BW2-10 [AMR-22],144344,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-10__AMR-22\2023-11-22\144344 +medium,BW2-10 [AMR-22],140013,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-10__AMR-22\2023-09-26\140013 +medium,BW3-20 [AMR-26],55608,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-20__AMR-26\2020-09-27\55608 +medium,BW2-8 [AMR-25],17697,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-8__AMR-25\2019-09-25\17697 +medium,BW3-20 [AMR-26],26794,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-20__AMR-26\2019-12-14\26794 +medium,BW2-10 [AMR-22],114464,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-10__AMR-22\2022-09-11\114464 +medium,BW2-10 [AMR-22],113595,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-10__AMR-22\2022-08-26\113595 +medium,BW3-20 [AMR-26],59494,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-20__AMR-26\2020-11-17\59494 +medium,BW3-20 [AMR-26],17595,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-20__AMR-26\2019-09-24\17595 +medium,BW2-10 [AMR-22],95535,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-10__AMR-22\2021-12-15\95535 +medium,BW2-11 [AMR-23],159024,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-11__AMR-23\2024-11-14\159024 +medium,BW3-20 [AMR-26],29326,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-20__AMR-26\2020-01-08\29326 +medium,BW3-20 [AMR-26],129738,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-20__AMR-26\2023-04-27\129738 +medium,BW2-10 [AMR-22],49731,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-10__AMR-22\2020-08-01\49731 +medium,BW3-20 [AMR-26],23196,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-20__AMR-26\2019-11-14\23196 +medium,BW2-10 
[AMR-22],72647,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-10__AMR-22\2021-03-30\72647 +medium,BW2-13 [AMR-24],39157,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-13__AMR-24\2020-04-08\39157 +medium,BW3-20 [AMR-26],138785,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-20__AMR-26\2023-09-09\138785 +medium,BW3-20 [AMR-26],148250,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-20__AMR-26\2024-01-16\148250 +medium,BW3-20 [AMR-26],119471,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-20__AMR-26\2022-11-12\119471 +medium,BW3-20 [AMR-26],34470,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-20__AMR-26\2020-02-23\34470 +medium,BW3-20 [AMR-26],109734,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-20__AMR-26\2022-06-07\109734 +medium,BW2-10 [AMR-22],116997,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-10__AMR-22\2022-10-13\116997 +medium,BW2-10 [AMR-22],26076,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-10__AMR-22\2019-12-08\26076 +medium,BW2-10 [AMR-22],42501,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-10__AMR-22\2020-05-08\42501 +medium,BW2-8 [AMR-25],52036,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-8__AMR-25\2020-08-27\52036 +medium,BW3-16 [AMR-16],37365,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-16__AMR-16\2020-03-22\37365 +medium,BW2-8 [AMR-25],157670,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-8__AMR-25\2024-06-25\157670 +medium,BW3-20 [AMR-26],15419,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-20__AMR-26\2019-09-04\15419 +medium,BW2-10 [AMR-22],38651,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-10__AMR-22\2020-04-03\38651 +large,BW3-20 [AMR-26],63054,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-20__AMR-26\2020-12-21\63054 +large,BW2-10 [AMR-22],12990,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-10__AMR-22\2018-10-15\12990 +large,BW1-4 
[AMR-15],71109,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-4__AMR-15\2021-03-15\71109 +large,BW1-4 [AMR-15],10715,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-4__AMR-15\2015-05-01\10715 +large,BW3-21 [AMR-17],12185,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-21__AMR-17\2017-03-27\12185 +large,BW1-4 [AMR-15],10907,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-4__AMR-15\2015-06-07\10907 +large,BW2-10 [AMR-22],12693,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-10__AMR-22\2018-03-12\12693 +large,BW1-4 [AMR-15],10898,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-4__AMR-15\2015-06-05\10898 +large,BW1-4 [AMR-15],49214,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-4__AMR-15\2020-07-27\49214 +large,BW2-11 [AMR-23],12552,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-11__AMR-23\2017-12-04\12552 +large,BW3-17 [AMR-20],10937,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-17__AMR-20\2015-06-18\10937 +large,BW2-10 [AMR-22],12353,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-10__AMR-22\2017-08-11\12353 +large,BW2-11 [AMR-23],142004,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-11__AMR-23\2023-10-23\142004 +large,BW3-17 [AMR-20],10100,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-17__AMR-20\2014-06-16\10100 +large,BW3-17 [AMR-20],89168,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-17__AMR-20\2021-10-04\89168 +large,BW2-8 [AMR-25],10377,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-8__AMR-25\2014-11-24\10377 +large,BW3-19 [AMR-21],13055,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2018-11-19\13055 +large,BW1-6 [AMR-19],10620,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-6__AMR-19\2015-03-25\10620 +large,BW3-20 [AMR-26],75333,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-20__AMR-26\2021-04-26\75333 +large,BW3-20 
[AMR-26],71107,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-20__AMR-26\2021-03-15\71107
+large,BW3-17 [AMR-20],157907,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-17__AMR-20\2024-07-05\157907
+large,BW2-10 [AMR-22],10925,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-10__AMR-22\2015-06-15\10925
+large,BW2-13 [AMR-24],13017,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-13__AMR-24\2018-10-30\13017
+large,BW2-8 [AMR-25],152547,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-8__AMR-25\2024-03-18\152547
+large,BW1-6 [AMR-19],13004,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-6__AMR-19\2018-10-21\13004
+large,BW1-6 [AMR-19],12934,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-6__AMR-19\2018-08-20\12934
+large,BW2-13 [AMR-24],150086,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-13__AMR-24\2024-02-12\150086
+large,BW3-16 [AMR-16],29192,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-16__AMR-16\2020-01-06\29192
+large,BW2-13 [AMR-24],150620,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-13__AMR-24\2024-02-19\150620
+large,BW2-13 [AMR-24],10137,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-13__AMR-24\2014-07-07\10137
+large,BW2-13 [AMR-24],12969,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-13__AMR-24\2018-09-10\12969
+large,BW3-16 [AMR-16],10129,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-16__AMR-16\2014-06-30\10129
+large,BW1-4 [AMR-15],10930,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-4__AMR-15\2015-06-16\10930
+large,BW1-4 [AMR-15],60897,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-4__AMR-15\2020-11-30\60897
+large,BW3-16 [AMR-16],13042,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-16__AMR-16\2018-11-12\13042
+large,BW1-4 [AMR-15],54939,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-4__AMR-15\2020-09-21\54939
+large,BW1-6 [AMR-19],12922,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-6__AMR-19\2018-08-13\12922
+large,BW1-4 [AMR-15],10905,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-4__AMR-15\2015-06-06\10905
+large,BW2-13 [AMR-24],13104,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-13__AMR-24\2018-12-31\13104
+large,BW2-11 [AMR-23],10177,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-11__AMR-23\2014-07-24\10177
+large,BW1-6 [AMR-19],12492,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-6__AMR-19\2017-10-23\12492
+large,BW2-10 [AMR-22],10647,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-10__AMR-22\2015-03-30\10647
+large,BW2-8 [AMR-25],65492,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-8__AMR-25\2021-01-14\65492
+large,BW3-19 [AMR-21],13259,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2019-04-08\13259
+large,BW3-16 [AMR-16],13105,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-16__AMR-16\2018-12-31\13105
+large,BW1-6 [AMR-19],10002,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-6__AMR-19\2014-02-10\10002
+large,BW2-13 [AMR-24],10176,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-13__AMR-24\2014-07-24\10176
+large,BW1-7 [AMR-18],10312,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-7__AMR-18\2014-10-27\10312
+large,BW3-16 [AMR-16],11143,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-16__AMR-16\2015-08-04\11143
+large,BW2-10 [AMR-22],10302,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-10__AMR-22\2014-10-10\10302
+xlarge,BW1-6 [AMR-19],157995,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-6__AMR-19\2024-07-12\157995
+xlarge,BW2-13 [AMR-24],157337,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-13__AMR-24\2024-06-10\157337
+xlarge,BW3-21 [AMR-17],12676,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-21__AMR-17\2018-02-26\12676
+xlarge,BW3-16 [AMR-16],10666,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-16__AMR-16\2015-04-14\10666
+xlarge,BW1-6 [AMR-19],74657,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-6__AMR-19\2021-04-19\74657
+xlarge,BW1-4 [AMR-15],10921,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-4__AMR-15\2015-06-15\10921
+xlarge,BW3-19 [AMR-21],43555,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2020-05-18\43555
+xlarge,BW3-16 [AMR-16],11988,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-16__AMR-16\2016-12-07\11988
+xlarge,BW3-21 [AMR-17],12906,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-21__AMR-17\2018-07-17\12906
+xlarge,BW1-4 [AMR-15],13280,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-4__AMR-15\2019-04-22\13280
+xlarge,BW1-6 [AMR-19],111563,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-6__AMR-19\2022-07-04\111563
+xlarge,BW3-20 [AMR-26],12941,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-20__AMR-26\2018-08-20\12941
+xlarge,BW2-8 [AMR-25],13126,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-8__AMR-25\2019-01-14\13126
+xlarge,BW1-7 [AMR-18],112645,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-7__AMR-18\2022-07-29\112645
+xlarge,BW2-11 [AMR-23],12581,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-11__AMR-23\2017-12-27\12581
+xlarge,BW2-13 [AMR-24],12034,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-13__AMR-24\2017-01-03\12034
+xlarge,BW2-13 [AMR-24],12260,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-13__AMR-24\2017-06-05\12260
+xlarge,BW1-7 [AMR-18],10065,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-7__AMR-18\2014-05-05\10065
+xlarge,BW2-11 [AMR-23],13229,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-11__AMR-23\2019-03-25\13229
+xlarge,BW1-4 [AMR-15],10196,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-4__AMR-15\2014-08-04\10196
+xlarge,BW1-7 [AMR-18],122844,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-7__AMR-18\2023-02-03\122844
+xlarge,BW2-11 [AMR-23],83433,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-11__AMR-23\2021-07-19\83433
+xlarge,BW1-4 [AMR-15],43558,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-4__AMR-15\2020-05-18\43558
+xlarge,BW2-11 [AMR-23],38997,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-11__AMR-23\2020-04-06\38997
+xlarge,BW2-8 [AMR-25],10325,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-8__AMR-25\2014-11-03\10325
+xlarge,BW3-20 [AMR-26],10356,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-20__AMR-26\2014-11-17\10356
+xlarge,BW3-20 [AMR-26],10306,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-20__AMR-26\2014-10-10\10306
+xlarge,BW2-13 [AMR-24],47870,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-13__AMR-24\2020-07-13\47870
+xlarge,BW2-10 [AMR-22],113242,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-10__AMR-22\2022-08-15\113242
+xlarge,BW2-11 [AMR-23],11477,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-11__AMR-23\2016-02-15\11477
+xlarge,BW3-19 [AMR-21],11185,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2015-08-24\11185
+xlarge,BW3-20 [AMR-26],62336,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-20__AMR-26\2020-12-14\62336
+xlarge,BW3-20 [AMR-26],10454,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-20__AMR-26\2015-01-05\10454
+xlarge,BW3-16 [AMR-16],10329,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-16__AMR-16\2014-11-03\10329
+xlarge,BW3-19 [AMR-21],13342,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2019-05-28\13342
+xlarge,BW3-20 [AMR-26],148596,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-20__AMR-26\2024-01-22\148596
+xlarge,BW2-13 [AMR-24],11987,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-13__AMR-24\2016-12-07\11987
+xlarge,BW1-7 [AMR-18],157743,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-7__AMR-18\2024-06-28\157743
+xlarge,BW1-7 [AMR-18],11852,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-7__AMR-18\2016-08-30\11852
+xlarge,BW2-10 [AMR-22],85215,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-10__AMR-22\2021-08-16\85215
+xlarge,BW1-7 [AMR-18],8572,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-7__AMR-18\2014-01-06\8572
+xlarge,BW1-4 [AMR-15],10206,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-4__AMR-15\2014-08-11\10206
+xlarge,BW3-17 [AMR-20],13191,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-17__AMR-20\2019-02-25\13191
+xlarge,BW3-20 [AMR-26],42786,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-20__AMR-26\2020-05-11\42786
+xlarge,BW3-16 [AMR-16],11901,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-16__AMR-16\2016-10-03\11901
+xlarge,BW1-4 [AMR-15],10073,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-4__AMR-15\2014-05-19\10073
+xlarge,BW3-20 [AMR-26],13278,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-20__AMR-26\2019-04-16\13278
+xlarge,BW2-10 [AMR-22],19711,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-10__AMR-22\2019-10-14\19711
+xlarge,BW1-4 [AMR-15],10256,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-4__AMR-15\2014-09-15\10256
+xlarge,BW1-4 [AMR-15],11035,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-4__AMR-15\2015-06-29\11035
diff --git a/scripts/sample_random_scans.sh b/scripts/sample_random_scans.sh
new file mode 100755
index 0000000..dab2acd
--- /dev/null
+++ b/scripts/sample_random_scans.sh
@@ -0,0 +1,178 @@
+#!/usr/bin/env bash
+# For each machine label in a text file, pick one random completed scan and download
+# it: by default the mosaic and all tiles (same as: --machine "…" --scan-id N).
+# For mosaic only (faster, no tile downloads), set: MOSAIC_ONLY=1
+#
+# Usage:
+# ./scripts/sample_random_scans.sh [PATH_TO_machines.txt]
+# Config path defaults to config.yaml in the repo root. Override with:
+# CONFIG=/path/to/config.yaml ./scripts/sample_random_scans.sh machines.txt
+# Dry-run the download step (listing still does real HTTP to fetch scan list):
+# DRY_RUN=1 ./scripts/sample_random_scans.sh machines.txt
+# Verbose / debug (extra per-step lines, scan counts from the list step):
+# DEBUG=1 ./scripts/sample_random_scans.sh machines.txt
+# By default, --list-scans fetches only the first page (one HTTP request, up to
+# 320 scans). To paginate the full archive for the random pick (slower), set:
+# LIST_SCANS_ALL_PAGES=1 ./scripts/sample_random_scans.sh machines.txt
+#
+# machines.txt: one machine label per line (same as --machine and config machine names).
+# See scripts/machines.example.txt
+
+set -euo pipefail
+
+REPO_ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)"
+CONFIG="${CONFIG:-$REPO_ROOT/config.yaml}"
+MACHINES_FILE="${1:-$REPO_ROOT/machines.txt}"
+SCRAPER=(python3 "$REPO_ROOT/scraper.py" --config "$CONFIG")
+
+log() { echo "[sample_random_scans] $*" >&2; }
+log_debug() {
+  if [[ -n "${DEBUG:-}" ]]; then
+    echo "[sample_random_scans] debug: $*" >&2
+  fi
+}
+
+if [[ ! -f "$MACHINES_FILE" ]]; then
+  log "error: file not found: $MACHINES_FILE"
+  log "Create it with one machine label per line, or: cp scripts/machines.example.txt machines.txt"
+  exit 1
+fi
+
+if [[ ! -f "$CONFIG" ]]; then
+  log "error: config not found: $CONFIG"
+  exit 1
+fi
+
+# Non-empty, non-comment lines (same rules as the main loop)
+TOTAL_MACHINES="$(
+  grep -v '^[[:space:]]*#' "$MACHINES_FILE" | grep -c -v '^[[:space:]]*$' || true
+)"
+if [[ -z "$TOTAL_MACHINES" || "$TOTAL_MACHINES" -eq 0 ]]; then
+  log "error: no machine lines in: $MACHINES_FILE"
+  exit 1
+fi
+
+log "starting repo=$REPO_ROOT"
+log " config=$CONFIG"
+log " machines_file=$MACHINES_FILE (${TOTAL_MACHINES} machine(s) in file)"
+if [[ -n "${MOSAIC_ONLY:-}" ]]; then
+  if [[ -n "${DRY_RUN:-}" ]]; then
+    log " mode: MOSAIC_ONLY + DRY_RUN (mosaic only, --dry-run on download step)"
+  else
+    log " mode: MOSAIC_ONLY=1 (mosaics only, no tiles; use for a lighter sample)"
+  fi
+else
+  if [[ -n "${DRY_RUN:-}" ]]; then
+    log " mode: DRY_RUN (full scan download uses --dry-run; list step still does real HTTP; no files written)"
+  else
+    log " mode: full scan — mosaic + all tiles (workers from config)"
+  fi
+fi
+if [[ -n "${DEBUG:-}" ]]; then
+  log " DEBUG=1 (extra diagnostics enabled)"
+fi
+if [[ -n "${LIST_SCANS_ALL_PAGES:-}" ]]; then
+  log " list step: list-scans = full archive (all pages, slower)"
+else
+  log " list step: list-scans --list-scans-first-page-only (one page, up to 320 IDs)"
+fi
+log "────────────────────────────────────────"
+
+export REPO_ROOT CONFIG
+[[ -n "${DEBUG:-}" ]] && export DEBUG
+[[ -n "${LIST_SCANS_ALL_PAGES:-}" ]] && export LIST_SCANS_ALL_PAGES
+
+PROCESSED=0
+SKIPPED=0
+IDX=0
+
+while IFS= read -r line || [[ -n "${line-}" ]]; do
+  # trim, strip CR, skip blanks / comments
+  line="${line//$'\r'/}"
+  label="${line#"${line%%[![:space:]]*}"}"
+  label="${label%"${label##*[![:space:]]}"}"
+  [[ -z "$label" || "$label" == \#* ]] && continue
+
+  IDX=$((IDX + 1))
+  log "[$IDX/$TOTAL_MACHINES] machine: $label"
+  log " status: listing scans (--list-scans) …"
+
+  random_id="$(
+    REPO_ROOT="$REPO_ROOT" CONFIG="$CONFIG" LABEL="$label" python3 - <<'PY'
+import os, random, subprocess, sys
+
+label = os.environ["LABEL"]
+repo = os.environ["REPO_ROOT"]
+cfg = os.environ["CONFIG"]
+debug = bool(os.environ.get("DEBUG"))
+full = bool(os.environ.get("LIST_SCANS_ALL_PAGES"))
+scraper = os.path.join(repo, "scraper.py")
+if debug:
+    print(
+        f"[sample_random_scans] debug: running list-scans for {label!r} "
+        f"({'all pages' if full else 'first page only'})",
+        file=sys.stderr,
+    )
+cmd = [sys.executable, scraper, "--list-scans", "--machine", label, "--config", cfg]
+if not full:
+    cmd.insert(3, "--list-scans-first-page-only")
+out = subprocess.check_output(
+    cmd,
+    text=True,
+    stderr=subprocess.STDOUT,
+)
+ids = []
+for line in out.splitlines():
+    line = line.rstrip()
+    if not line or line.startswith("---") or "Total" in line:
+        continue
+    parts = line.split()
+    if parts and parts[0].isdigit():
+        ids.append(parts[0])
+if not ids:
+    print(f"no scans parsed for {label!r} — check login and output", file=sys.stderr)
+    sys.exit(1)
+if debug:
+    print(
+        f"[sample_random_scans] debug: parsed {len(ids)} scan id(s) for {label!r}",
+        file=sys.stderr,
+    )
+print(random.choice(ids), end="")
+PY
+  )" || {
+    log " status: SKIPPED (could not get scan list or pick id)"
+    SKIPPED=$((SKIPPED + 1))
+    continue
+  }
+
+  log " status: picked random scan_id=$random_id (uniform among IDs from this list step — first page by default, see start banner)"
+  if [[ -n "${MOSAIC_ONLY:-}" ]]; then
+    log " status: running scraper: --mosaic-only --scan-id (mosaic only) …"
+  else
+    log " status: running scraper: --scan-id (mosaic + tiles) …"
+  fi
+  if [[ -n "${DRY_RUN:-}" ]]; then
+    log " status: (dry-run — no files written for this scan)"
+  fi
+
+  if [[ -n "${MOSAIC_ONLY:-}" ]]; then
+    run_cmd=("${SCRAPER[@]}" --mosaic-only --machine "$label" --scan-id "$random_id")
+  else
+    run_cmd=("${SCRAPER[@]}" --machine "$label" --scan-id "$random_id")
+  fi
+  if [[ -n "${DRY_RUN:-}" ]]; then
+    run_cmd+=(--dry-run)
+  fi
+  if "${run_cmd[@]}"; then
+    log " status: OK — finished this machine (exit 0)"
+    PROCESSED=$((PROCESSED + 1))
+  else
+    rc=$?
+    log " status: FAILED — scraper exit code $rc (stopping; fix or remove this machine and re-run)"
+    exit "$rc"
+  fi
+  log "────────────────────────────────────────"
+done < "$MACHINES_FILE"
+
+log "done. summary: $PROCESSED machine(s) with sampled scan download completed, $SKIPPED skipped, $IDX line(s) processed out of $TOTAL_MACHINES in file."
+exit 0