Compare commits
2 Commits
| Author | SHA1 | Date |
|---|---|---|
| | 8593808cf3 | |
| | 6390f5d529 | |
@@ -1,50 +0,0 @@
# `sample_random_scans.sh` run progress (checkpoint)

Snapshot from terminal session **9** (repo: `/Users/igt/Documents/spruce_scraper`), taken just before the machine was restarted. **Date:** 2026-04-26.

## Active run (incomplete)

A **full scan** was in progress: **mosaic + all tiles** (worker count from `config.yaml`), with scan listing using **`--list-scans-first-page-only`** (one page of up to 320 scan IDs, with one ID chosen uniformly at random from that page).
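
The selection rule described above (one page of at most 320 IDs, then a uniform random pick) can be sketched as follows. This is only an illustration: `pick_scan_id` and `MAX_PAGE_SIZE` are hypothetical names, and the real logic inside `scraper.py` may differ.

```python
import random

# Page-size cap taken from the note above; the constant name is an assumption.
MAX_PAGE_SIZE = 320


def pick_scan_id(first_page_ids, rng=random):
    """Pick one scan ID uniformly at random from the first page of results."""
    candidates = list(first_page_ids)[:MAX_PAGE_SIZE]
    if not candidates:
        # Mirrors the failure case where no ID could be picked.
        raise ValueError("no scan IDs on the first page")
    return rng.choice(candidates)


if __name__ == "__main__":
    # Scan IDs that appear in this checkpoint, used here as a stand-in page.
    page = [153772, 153019, 153493]
    print(pick_scan_id(page, random.Random(0)))
```

Passing a seeded `random.Random` instance makes the pick reproducible, which can help when comparing two runs.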

| Item | Value |
|------|--------|
| Script | `./scripts/sample_random_scans.sh` |
| Machines file | `machines.txt` (12 machines) |
| Config | `config.yaml` |
| State files | `archives/scans.csv`, `archives/tiles.csv`, `archives/.progress.json` |

### Where it stopped

The run was on **step [9/12]**, machine **BW3-17 [AMR-20]**, **scan ID 153772**.

- **Mosaic:** HTTP **404** for `…/RootView_Database/153772/mosaic.jpg` (same pattern as other scans: the mosaic is missing but the tiles are still available).
- **Tiles:** **33784** total; the progress bar showed roughly **5%** completed. The last log line observed was around **1736 / 33784** tiles (the count advances continuously; check `archives/.progress.json` or resume the run to see the current value).
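
As a quick check of the tile figures above, the percentage and the remaining count can be worked out like this. The helper name `tile_progress` is hypothetical and is not part of the scraper.

```python
def tile_progress(done, total):
    """Return (percent complete, tiles remaining) for a tile download."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= done <= total:
        raise ValueError("done must be between 0 and total")
    return done / total * 100, total - done


if __name__ == "__main__":
    # Figures from the last observed log line for scan 153772.
    percent, remaining = tile_progress(1736, 33784)
    print(f"{percent:.1f}% done, {remaining} tiles remaining")
    # prints "5.1% done, 32048 tiles remaining"
```

This agrees with the roughly 5% shown by the progress bar.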

**Not yet started** in this full-scan pass are steps **[10/12]–[12/12]**: **BW3-19 [AMR-21]**, **BW3-20 [AMR-26]**, **BW3-21 [AMR-17]** (lines 12–14 of `machines.txt`).

### Skipped machine in this pass

- **[4/12] BW2-8 [AMR-25]:** `SKIPPED`. `scraper.py --list-scans --list-scans-first-page-only` exited with **code 1** (it could not get the scan list or pick an ID). The script continued with the next machine.

### Completed machines in this full-scan pass (steps 1–3, 5–8)

| Step | Machine | Scan ID | Mosaic | Tiles downloaded |
|------|---------|---------|--------|------------------|
| 1 | BW1-4 [AMR-15] | 71478 | 404 | 56 |
| 2 | BW1-6 [AMR-19] | 156875 | saved | 72 |
| 3 | BW1-7 [AMR-18] | 10837 | 404 | 1170 |
| 4 | BW2-8 [AMR-25] | n/a | n/a | skipped |
| 5 | BW2-10 [AMR-22] | 146368 | saved | 156 |
| 6 | BW2-11 [AMR-23] | 160022 | saved | 529 |
| 7 | BW2-13 [AMR-24] | 156957 | saved | 143 |
| 8 | BW3-16 [AMR-16] | 77300 | 404 | 400 |

## After restart

1. `cd` to the repo and activate the same venv as before.
2. Re-run **`./scripts/sample_random_scans.sh`** in the **same mode** (full scan; this is the default, if that is what you used). The scraper **resumes** from `archives/.progress.json` and will continue **BW3-17** scan **153772** (remaining tiles) before moving on to later machines, unless you change options or data manually.
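
Before re-running, it may be worth looking at what the state file actually holds. The sketch below only assumes that `archives/.progress.json` is valid JSON; its schema is not documented in this note, so the code lists top-level keys and their types rather than reading specific fields. `summarize_progress` is a hypothetical helper name.

```python
import json
from pathlib import Path


def summarize_progress(path):
    """Map each top-level key of a JSON file to the type name of its value."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if isinstance(data, dict):
        return {key: type(value).__name__ for key, value in data.items()}
    # Non-object roots (list, number, ...) are reported under a placeholder key.
    return {"<root>": type(data).__name__}


if __name__ == "__main__":
    # Path from the state-files table; run this from the repo root.
    progress_file = Path("archives/.progress.json")
    if progress_file.exists():
        for key, kind in summarize_progress(progress_file).items():
            print(f"{key}: {kind}")
    else:
        print(f"{progress_file} not found; run from the repo root")
```

The script is read-only, so it should not interfere with the resume step.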

## Other runs in the same log (for context)

- Earlier invocations of the script failed with **`DRY_FLAG[@]: unbound variable`**; this was fixed in later invocations.
- A **mosaic-only** pass over all 12 machines completed with the banner *12 machine(s) with mosaic step completed, 0 skipped* (one random scan per machine from the first page of IDs). That is a **separate**, completed run, distinct from the **in-progress full scan** above.
@@ -1,301 +0,0 @@
bucket,machine,scan_id,scan_dir
zero,BW3-19 [AMR-21],141127,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2023-10-12\141127
zero,BW2-8 [AMR-25],22778,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-8__AMR-25\2019-11-10\22778
zero,BW1-6 [AMR-19],93870,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-6__AMR-19\2021-11-23\93870
zero,BW3-19 [AMR-21],140121,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2023-09-27\140121
zero,BW3-19 [AMR-21],144191,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2023-11-21\144191
zero,BW3-19 [AMR-21],144426,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2023-11-23\144426
zero,BW3-19 [AMR-21],144659,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2023-11-26\144659
zero,BW2-13 [AMR-24],120923,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-13__AMR-24\2022-12-10\120923
zero,BW3-19 [AMR-21],140154,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2023-09-27\140154
zero,BW2-8 [AMR-25],23645,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-8__AMR-25\2019-11-17\23645
zero,BW3-19 [AMR-21],140792,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2023-10-07\140792
zero,BW3-19 [AMR-21],140125,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2023-09-27\140125
zero,BW3-19 [AMR-21],141927,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2023-10-22\141927
zero,BW2-8 [AMR-25],118438,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-8__AMR-25\2022-10-30\118438
zero,BW3-19 [AMR-21],141575,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2023-10-18\141575
zero,BW3-19 [AMR-21],142951,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2023-11-04\142951
zero,BW1-6 [AMR-19],90874,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-6__AMR-19\2021-10-21\90874
zero,BW1-6 [AMR-19],91489,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-6__AMR-19\2021-10-27\91489
zero,BW2-8 [AMR-25],44836,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-8__AMR-25\unknown\44836
zero,BW3-19 [AMR-21],144692,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2023-11-26\144692
zero,BW3-19 [AMR-21],144584,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2023-11-25\144584
zero,BW3-19 [AMR-21],142238,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2023-10-26\142238
zero,BW3-19 [AMR-21],141485,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2023-10-17\141485
zero,BW1-6 [AMR-19],92123,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-6__AMR-19\2021-11-02\92123
zero,BW3-19 [AMR-21],141805,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2023-10-20\141805
zero,BW3-19 [AMR-21],144856,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2023-11-29\144856
zero,BW3-19 [AMR-21],140325,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2023-09-29\140325
zero,BW3-19 [AMR-21],141026,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2023-10-11\141026
zero,BW3-19 [AMR-21],140419,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2023-09-30\140419
zero,BW3-19 [AMR-21],142969,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2023-11-04\142969
zero,BW3-19 [AMR-21],144681,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2023-11-26\144681
zero,BW3-19 [AMR-21],142677,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2023-11-01\142677
zero,BW3-19 [AMR-21],141584,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2023-10-18\141584
zero,BW3-19 [AMR-21],144159,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2023-11-19\144159
zero,BW3-19 [AMR-21],139494,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2023-09-19\139494
zero,BW1-6 [AMR-19],99248,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-6__AMR-19\2022-02-03\99248
zero,BW3-19 [AMR-21],139969,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2023-09-24\139969
zero,BW3-19 [AMR-21],139511,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2023-09-19\139511
zero,BW3-17 [AMR-20],153019,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-17__AMR-20\2024-03-25\153019
zero,BW3-19 [AMR-21],140463,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2023-10-01\140463
zero,BW3-19 [AMR-21],143587,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2023-11-12\143587
zero,BW3-17 [AMR-20],153493,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-17__AMR-20\2024-04-01\153493
zero,BW3-19 [AMR-21],144727,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2023-11-28\144727
zero,BW3-19 [AMR-21],139946,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2023-09-24\139946
zero,BW3-19 [AMR-21],143612,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2023-11-12\143612
zero,BW2-8 [AMR-25],83393,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-8__AMR-25\2021-07-18\83393
zero,BW3-19 [AMR-21],143288,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2023-11-09\143288
zero,BW2-8 [AMR-25],23902,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-8__AMR-25\2019-11-19\23902
zero,BW3-19 [AMR-21],143445,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2023-11-10\143445
zero,BW3-19 [AMR-21],140154,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2023-09-27\140154
tiny,BW2-13 [AMR-24],26852,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-13__AMR-24\2019-12-15\26852
tiny,BW2-13 [AMR-24],140181,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-13__AMR-24\2023-09-28\140181
tiny,BW1-6 [AMR-19],114819,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-6__AMR-19\2022-09-16\114819
tiny,BW3-21 [AMR-17],97824,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-21__AMR-17\2022-01-15\97824
tiny,BW3-21 [AMR-17],52014,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-21__AMR-17\2020-08-27\52014
tiny,BW2-8 [AMR-25],127445,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-8__AMR-25\2023-03-30\127445
tiny,BW3-19 [AMR-21],48940,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2020-07-24\48940
tiny,BW1-6 [AMR-19],87810,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-6__AMR-19\2021-09-19\87810
tiny,BW3-21 [AMR-17],43092,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-21__AMR-17\2020-05-14\43092
tiny,BW2-13 [AMR-24],113334,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-13__AMR-24\2022-08-18\113334
tiny,BW3-19 [AMR-21],59127,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2020-11-12\59127
tiny,BW3-21 [AMR-17],25737,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-21__AMR-17\2019-12-05\25737
tiny,BW2-10 [AMR-22],61950,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-10__AMR-22\2020-12-10\61950
tiny,BW1-6 [AMR-19],93265,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-6__AMR-19\2021-11-13\93265
tiny,BW1-6 [AMR-19],113849,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-6__AMR-19\2022-09-02\113849
tiny,BW2-11 [AMR-23],124373,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-11__AMR-23\2023-02-21\124373
tiny,BW2-13 [AMR-24],120371,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-13__AMR-24\2022-11-29\120371
tiny,BW1-6 [AMR-19],87277,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-6__AMR-19\2021-09-14\87277
tiny,BW2-11 [AMR-23],122855,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-11__AMR-23\2023-02-03\122855
tiny,BW1-6 [AMR-19],69086,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-6__AMR-19\2021-02-21\69086
tiny,BW3-19 [AMR-21],47993,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2020-07-15\47993
tiny,BW2-13 [AMR-24],125103,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-13__AMR-24\2023-03-02\125103
tiny,BW3-21 [AMR-17],103344,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-21__AMR-17\2022-03-25\103344
tiny,BW3-19 [AMR-21],57723,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2020-10-23\57723
tiny,BW2-8 [AMR-25],79195,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-8__AMR-25\2021-06-06\79195
tiny,BW3-19 [AMR-21],54692,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2020-09-19\54692
tiny,BW3-16 [AMR-16],30599,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-16__AMR-16\2020-01-19\30599
tiny,BW2-11 [AMR-23],130942,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-11__AMR-23\2023-05-19\130942
tiny,BW2-13 [AMR-24],138601,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-13__AMR-24\2023-09-07\138601
tiny,BW1-6 [AMR-19],92258,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-6__AMR-19\2021-11-03\92258
tiny,BW2-8 [AMR-25],23181,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-8__AMR-25\2019-11-13\23181
tiny,BW3-21 [AMR-17],53547,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-21__AMR-17\2020-09-09\53547
tiny,BW2-13 [AMR-24],155307,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-13__AMR-24\2024-04-28\155307
tiny,BW2-8 [AMR-25],72356,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-8__AMR-25\2021-03-27\72356
tiny,BW3-21 [AMR-17],95618,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-21__AMR-17\2021-12-16\95618
tiny,BW3-19 [AMR-21],48393,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2020-07-18\48393
tiny,BW2-13 [AMR-24],130075,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-13__AMR-24\2023-05-04\130075
tiny,BW3-21 [AMR-17],39758,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-21__AMR-17\2020-04-14\39758
tiny,BW2-11 [AMR-23],126894,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-11__AMR-23\2023-03-23\126894
tiny,BW2-13 [AMR-24],82264,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-13__AMR-24\2021-07-07\82264
tiny,BW1-6 [AMR-19],99228,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-6__AMR-19\2022-02-03\99228
tiny,BW2-11 [AMR-23],124000,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-11__AMR-23\2023-02-17\124000
tiny,BW1-4 [AMR-15],46063,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-4__AMR-15\2020-06-18\46063
tiny,BW2-13 [AMR-24],93211,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-13__AMR-24\2021-11-13\93211
tiny,BW3-20 [AMR-26],87312,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-20__AMR-26\2021-09-14\87312
tiny,BW2-13 [AMR-24],131348,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-13__AMR-24\2023-05-25\131348
tiny,BW1-6 [AMR-19],94711,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-6__AMR-19\2021-12-03\94711
tiny,BW2-11 [AMR-23],129519,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-11__AMR-23\2023-04-23\129519
tiny,BW3-21 [AMR-17],32767,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-21__AMR-17\2020-02-08\32767
tiny,BW2-13 [AMR-24],93571,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-13__AMR-24\2021-11-19\93571
small,BW2-11 [AMR-23],158199,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-11__AMR-23\2024-07-21\158199
small,BW3-19 [AMR-21],96770,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2021-12-31\96770
small,BW2-13 [AMR-24],47488,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-13__AMR-24\2020-07-09\47488
small,BW3-19 [AMR-21],152767,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2024-03-21\152767
small,BW2-10 [AMR-22],129800,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-10__AMR-22\2023-04-27\129800
small,BW2-11 [AMR-23],114702,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-11__AMR-23\2022-09-15\114702
small,BW2-10 [AMR-22],135212,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-10__AMR-22\2023-07-21\135212
small,BW2-11 [AMR-23],136572,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-11__AMR-23\2023-08-09\136572
small,BW3-20 [AMR-26],145231,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-20__AMR-26\2023-12-03\145231
small,BW3-19 [AMR-21],32001,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2020-02-01\32001
small,BW2-11 [AMR-23],116124,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-11__AMR-23\2022-10-01\116124
small,BW3-20 [AMR-26],120928,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-20__AMR-26\2022-12-10\120928
small,BW3-16 [AMR-16],56581,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-16__AMR-16\2020-10-08\56581
small,BW3-20 [AMR-26],123441,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-20__AMR-26\2023-02-10\123441
small,BW2-13 [AMR-24],41468,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-13__AMR-24\2020-04-29\41468
small,BW2-11 [AMR-23],19698,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-11__AMR-23\2019-10-13\19698
small,BW2-11 [AMR-23],154592,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-11__AMR-23\2024-04-18\154592
small,BW2-10 [AMR-22],137156,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-10__AMR-22\2023-08-16\137156
small,BW3-19 [AMR-21],85449,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2021-08-20\85449
small,BW3-19 [AMR-21],102824,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2022-03-19\102824
small,BW1-6 [AMR-19],54986,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-6__AMR-19\2020-09-22\54986
small,BW1-6 [AMR-19],135364,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-6__AMR-19\2023-07-23\135364
small,BW1-6 [AMR-19],28609,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-6__AMR-19\2020-01-01\28609
small,BW2-10 [AMR-22],115991,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-10__AMR-22\2022-09-29\115991
small,BW3-20 [AMR-26],28596,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-20__AMR-26\2020-01-01\28596
small,BW2-10 [AMR-22],106310,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-10__AMR-22\2022-04-28\106310
small,BW3-16 [AMR-16],65871,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-16__AMR-16\2021-01-19\65871
small,BW3-20 [AMR-26],103751,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-20__AMR-26\2022-03-29\103751
small,BW1-6 [AMR-19],118031,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-6__AMR-19\2022-10-26\118031
small,BW2-13 [AMR-24],112247,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-13__AMR-24\2022-07-20\112247
small,BW2-13 [AMR-24],118274,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-13__AMR-24\2022-10-28\118274
small,BW3-20 [AMR-26],104298,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-20__AMR-26\2022-04-03\104298
small,BW3-19 [AMR-21],130200,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2023-05-09\130200
small,BW3-19 [AMR-21],59385,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2020-11-15\59385
small,BW2-11 [AMR-23],132767,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-11__AMR-23\2023-06-14\132767
small,BW3-20 [AMR-26],152753,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-20__AMR-26\2024-03-21\152753
small,BW1-4 [AMR-15],31573,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-4__AMR-15\2020-01-28\31573
small,BW1-6 [AMR-19],21993,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-6__AMR-19\2019-11-03\21993
small,BW3-19 [AMR-21],34801,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2020-02-27\34801
small,BW2-11 [AMR-23],108563,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-11__AMR-23\2022-05-22\108563
small,BW3-21 [AMR-17],15863,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-21__AMR-17\2019-09-08\15863
small,BW2-11 [AMR-23],38719,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-11__AMR-23\2020-04-03\38719
small,BW1-6 [AMR-19],26196,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-6__AMR-19\2019-12-10\26196
small,BW2-11 [AMR-23],90722,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-11__AMR-23\2021-10-19\90722
small,BW3-16 [AMR-16],47187,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-16__AMR-16\2020-07-04\47187
small,BW2-10 [AMR-22],110531,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-10__AMR-22\2022-06-16\110531
small,BW3-16 [AMR-16],11297,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-16__AMR-16\2015-11-01\11297
small,BW1-4 [AMR-15],43503,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-4__AMR-15\2020-05-17\43503
small,BW2-11 [AMR-23],115701,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-11__AMR-23\2022-09-25\115701
small,BW3-19 [AMR-21],95504,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2021-12-14\95504
medium,BW2-10 [AMR-22],86562,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-10__AMR-22\2021-09-04\86562
medium,BW3-20 [AMR-26],38929,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-20__AMR-26\2020-04-05\38929
medium,BW2-10 [AMR-22],125087,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-10__AMR-22\2023-03-02\125087
medium,BW2-13 [AMR-24],119980,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-13__AMR-24\2022-11-20\119980
medium,BW2-10 [AMR-22],74116,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-10__AMR-22\2021-04-14\74116
medium,BW2-10 [AMR-22],101557,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-10__AMR-22\2022-03-05\101557
medium,BW3-20 [AMR-26],148093,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-20__AMR-26\2024-01-13\148093
medium,BW2-10 [AMR-22],97238,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-10__AMR-22\2022-01-07\97238
medium,BW3-20 [AMR-26],23171,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-20__AMR-26\2019-11-13\23171
medium,BW3-20 [AMR-26],28007,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-20__AMR-26\2019-12-26\28007
medium,BW1-4 [AMR-15],52288,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-4__AMR-15\2020-08-29\52288
medium,BW3-16 [AMR-16],66638,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-16__AMR-16\2021-01-28\66638
medium,BW3-20 [AMR-26],54374,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-20__AMR-26\2020-09-16\54374
medium,BW2-8 [AMR-25],158079,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-8__AMR-25\2024-07-16\158079
medium,BW2-10 [AMR-22],122216,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-10__AMR-22\2023-01-18\122216
medium,BW3-20 [AMR-26],151922,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-20__AMR-26\2024-03-08\151922
medium,BW2-13 [AMR-24],47678,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-13__AMR-24\2020-07-11\47678
medium,BW2-10 [AMR-22],32062,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-10__AMR-22\2020-02-01\32062
medium,BW2-8 [AMR-25],60826,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-8__AMR-25\2020-11-29\60826
medium,BW2-10 [AMR-22],31095,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-10__AMR-22\2020-01-24\31095
medium,BW2-10 [AMR-22],144344,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-10__AMR-22\2023-11-22\144344
medium,BW2-10 [AMR-22],140013,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-10__AMR-22\2023-09-26\140013
medium,BW3-20 [AMR-26],55608,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-20__AMR-26\2020-09-27\55608
medium,BW2-8 [AMR-25],17697,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-8__AMR-25\2019-09-25\17697
medium,BW3-20 [AMR-26],26794,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-20__AMR-26\2019-12-14\26794
medium,BW2-10 [AMR-22],114464,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-10__AMR-22\2022-09-11\114464
medium,BW2-10 [AMR-22],113595,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-10__AMR-22\2022-08-26\113595
medium,BW3-20 [AMR-26],59494,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-20__AMR-26\2020-11-17\59494
medium,BW3-20 [AMR-26],17595,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-20__AMR-26\2019-09-24\17595
medium,BW2-10 [AMR-22],95535,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-10__AMR-22\2021-12-15\95535
medium,BW2-11 [AMR-23],159024,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-11__AMR-23\2024-11-14\159024
medium,BW3-20 [AMR-26],29326,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-20__AMR-26\2020-01-08\29326
medium,BW3-20 [AMR-26],129738,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-20__AMR-26\2023-04-27\129738
medium,BW2-10 [AMR-22],49731,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-10__AMR-22\2020-08-01\49731
medium,BW3-20 [AMR-26],23196,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-20__AMR-26\2019-11-14\23196
medium,BW2-10 [AMR-22],72647,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-10__AMR-22\2021-03-30\72647
medium,BW2-13 [AMR-24],39157,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-13__AMR-24\2020-04-08\39157
medium,BW3-20 [AMR-26],138785,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-20__AMR-26\2023-09-09\138785
medium,BW3-20 [AMR-26],148250,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-20__AMR-26\2024-01-16\148250
medium,BW3-20 [AMR-26],119471,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-20__AMR-26\2022-11-12\119471
medium,BW3-20 [AMR-26],34470,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-20__AMR-26\2020-02-23\34470
medium,BW3-20 [AMR-26],109734,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-20__AMR-26\2022-06-07\109734
medium,BW2-10 [AMR-22],116997,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-10__AMR-22\2022-10-13\116997
medium,BW2-10 [AMR-22],26076,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-10__AMR-22\2019-12-08\26076
medium,BW2-10 [AMR-22],42501,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-10__AMR-22\2020-05-08\42501
medium,BW2-8 [AMR-25],52036,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-8__AMR-25\2020-08-27\52036
medium,BW3-16 [AMR-16],37365,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-16__AMR-16\2020-03-22\37365
medium,BW2-8 [AMR-25],157670,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-8__AMR-25\2024-06-25\157670
medium,BW3-20 [AMR-26],15419,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-20__AMR-26\2019-09-04\15419
medium,BW2-10 [AMR-22],38651,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-10__AMR-22\2020-04-03\38651
large,BW3-20 [AMR-26],63054,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-20__AMR-26\2020-12-21\63054
large,BW2-10 [AMR-22],12990,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-10__AMR-22\2018-10-15\12990
large,BW1-4 [AMR-15],71109,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-4__AMR-15\2021-03-15\71109
large,BW1-4 [AMR-15],10715,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-4__AMR-15\2015-05-01\10715
large,BW3-21 [AMR-17],12185,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-21__AMR-17\2017-03-27\12185
large,BW1-4 [AMR-15],10907,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-4__AMR-15\2015-06-07\10907
large,BW2-10 [AMR-22],12693,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-10__AMR-22\2018-03-12\12693
large,BW1-4 [AMR-15],10898,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-4__AMR-15\2015-06-05\10898
large,BW1-4 [AMR-15],49214,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-4__AMR-15\2020-07-27\49214
large,BW2-11 [AMR-23],12552,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-11__AMR-23\2017-12-04\12552
large,BW3-17 [AMR-20],10937,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-17__AMR-20\2015-06-18\10937
large,BW2-10 [AMR-22],12353,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-10__AMR-22\2017-08-11\12353
large,BW2-11 [AMR-23],142004,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-11__AMR-23\2023-10-23\142004
large,BW3-17 [AMR-20],10100,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-17__AMR-20\2014-06-16\10100
large,BW3-17 [AMR-20],89168,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-17__AMR-20\2021-10-04\89168
large,BW2-8 [AMR-25],10377,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-8__AMR-25\2014-11-24\10377
large,BW3-19 [AMR-21],13055,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2018-11-19\13055
large,BW1-6 [AMR-19],10620,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-6__AMR-19\2015-03-25\10620
large,BW3-20 [AMR-26],75333,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-20__AMR-26\2021-04-26\75333
large,BW3-20 [AMR-26],71107,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-20__AMR-26\2021-03-15\71107
large,BW3-17 [AMR-20],157907,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-17__AMR-20\2024-07-05\157907
large,BW2-10 [AMR-22],10925,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-10__AMR-22\2015-06-15\10925
large,BW2-13 [AMR-24],13017,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-13__AMR-24\2018-10-30\13017
large,BW2-8 [AMR-25],152547,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-8__AMR-25\2024-03-18\152547
large,BW1-6 [AMR-19],13004,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-6__AMR-19\2018-10-21\13004
large,BW1-6 [AMR-19],12934,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-6__AMR-19\2018-08-20\12934
large,BW2-13 [AMR-24],150086,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-13__AMR-24\2024-02-12\150086
|
|
||||||
large,BW3-16 [AMR-16],29192,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-16__AMR-16\2020-01-06\29192
|
|
||||||
large,BW2-13 [AMR-24],150620,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-13__AMR-24\2024-02-19\150620
|
|
||||||
large,BW2-13 [AMR-24],10137,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-13__AMR-24\2014-07-07\10137
|
|
||||||
large,BW2-13 [AMR-24],12969,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-13__AMR-24\2018-09-10\12969
|
|
||||||
large,BW3-16 [AMR-16],10129,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-16__AMR-16\2014-06-30\10129
|
|
||||||
large,BW1-4 [AMR-15],10930,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-4__AMR-15\2015-06-16\10930
|
|
||||||
large,BW1-4 [AMR-15],60897,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-4__AMR-15\2020-11-30\60897
|
|
||||||
large,BW3-16 [AMR-16],13042,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-16__AMR-16\2018-11-12\13042
|
|
||||||
large,BW1-4 [AMR-15],54939,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-4__AMR-15\2020-09-21\54939
|
|
||||||
large,BW1-6 [AMR-19],12922,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-6__AMR-19\2018-08-13\12922
|
|
||||||
large,BW1-4 [AMR-15],10905,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-4__AMR-15\2015-06-06\10905
|
|
||||||
large,BW2-13 [AMR-24],13104,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-13__AMR-24\2018-12-31\13104
|
|
||||||
large,BW2-11 [AMR-23],10177,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-11__AMR-23\2014-07-24\10177
|
|
||||||
large,BW1-6 [AMR-19],12492,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-6__AMR-19\2017-10-23\12492
|
|
||||||
large,BW2-10 [AMR-22],10647,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-10__AMR-22\2015-03-30\10647
|
|
||||||
large,BW2-8 [AMR-25],65492,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-8__AMR-25\2021-01-14\65492
|
|
||||||
large,BW3-19 [AMR-21],13259,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2019-04-08\13259
|
|
||||||
large,BW3-16 [AMR-16],13105,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-16__AMR-16\2018-12-31\13105
|
|
||||||
large,BW1-6 [AMR-19],10002,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-6__AMR-19\2014-02-10\10002
|
|
||||||
large,BW2-13 [AMR-24],10176,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-13__AMR-24\2014-07-24\10176
|
|
||||||
large,BW1-7 [AMR-18],10312,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-7__AMR-18\2014-10-27\10312
|
|
||||||
large,BW3-16 [AMR-16],11143,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-16__AMR-16\2015-08-04\11143
|
|
||||||
large,BW2-10 [AMR-22],10302,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-10__AMR-22\2014-10-10\10302
|
|
||||||
xlarge,BW1-6 [AMR-19],157995,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-6__AMR-19\2024-07-12\157995
|
|
||||||
xlarge,BW2-13 [AMR-24],157337,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-13__AMR-24\2024-06-10\157337
|
|
||||||
xlarge,BW3-21 [AMR-17],12676,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-21__AMR-17\2018-02-26\12676
|
|
||||||
xlarge,BW3-16 [AMR-16],10666,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-16__AMR-16\2015-04-14\10666
|
|
||||||
xlarge,BW1-6 [AMR-19],74657,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-6__AMR-19\2021-04-19\74657
|
|
||||||
xlarge,BW1-4 [AMR-15],10921,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-4__AMR-15\2015-06-15\10921
|
|
||||||
xlarge,BW3-19 [AMR-21],43555,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2020-05-18\43555
|
|
||||||
xlarge,BW3-16 [AMR-16],11988,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-16__AMR-16\2016-12-07\11988
|
|
||||||
xlarge,BW3-21 [AMR-17],12906,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-21__AMR-17\2018-07-17\12906
|
|
||||||
xlarge,BW1-4 [AMR-15],13280,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-4__AMR-15\2019-04-22\13280
|
|
||||||
xlarge,BW1-6 [AMR-19],111563,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-6__AMR-19\2022-07-04\111563
|
|
||||||
xlarge,BW3-20 [AMR-26],12941,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-20__AMR-26\2018-08-20\12941
|
|
||||||
xlarge,BW2-8 [AMR-25],13126,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-8__AMR-25\2019-01-14\13126
|
|
||||||
xlarge,BW1-7 [AMR-18],112645,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-7__AMR-18\2022-07-29\112645
|
|
||||||
xlarge,BW2-11 [AMR-23],12581,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-11__AMR-23\2017-12-27\12581
|
|
||||||
xlarge,BW2-13 [AMR-24],12034,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-13__AMR-24\2017-01-03\12034
|
|
||||||
xlarge,BW2-13 [AMR-24],12260,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-13__AMR-24\2017-06-05\12260
|
|
||||||
xlarge,BW1-7 [AMR-18],10065,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-7__AMR-18\2014-05-05\10065
|
|
||||||
xlarge,BW2-11 [AMR-23],13229,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-11__AMR-23\2019-03-25\13229
|
|
||||||
xlarge,BW1-4 [AMR-15],10196,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-4__AMR-15\2014-08-04\10196
|
|
||||||
xlarge,BW1-7 [AMR-18],122844,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-7__AMR-18\2023-02-03\122844
|
|
||||||
xlarge,BW2-11 [AMR-23],83433,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-11__AMR-23\2021-07-19\83433
|
|
||||||
xlarge,BW1-4 [AMR-15],43558,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-4__AMR-15\2020-05-18\43558
|
|
||||||
xlarge,BW2-11 [AMR-23],38997,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-11__AMR-23\2020-04-06\38997
|
|
||||||
xlarge,BW2-8 [AMR-25],10325,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-8__AMR-25\2014-11-03\10325
|
|
||||||
xlarge,BW3-20 [AMR-26],10356,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-20__AMR-26\2014-11-17\10356
|
|
||||||
xlarge,BW3-20 [AMR-26],10306,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-20__AMR-26\2014-10-10\10306
|
|
||||||
xlarge,BW2-13 [AMR-24],47870,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-13__AMR-24\2020-07-13\47870
|
|
||||||
xlarge,BW2-10 [AMR-22],113242,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-10__AMR-22\2022-08-15\113242
|
|
||||||
xlarge,BW2-11 [AMR-23],11477,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-11__AMR-23\2016-02-15\11477
|
|
||||||
xlarge,BW3-19 [AMR-21],11185,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2015-08-24\11185
|
|
||||||
xlarge,BW3-20 [AMR-26],62336,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-20__AMR-26\2020-12-14\62336
|
|
||||||
xlarge,BW3-20 [AMR-26],10454,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-20__AMR-26\2015-01-05\10454
|
|
||||||
xlarge,BW3-16 [AMR-16],10329,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-16__AMR-16\2014-11-03\10329
|
|
||||||
xlarge,BW3-19 [AMR-21],13342,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-19__AMR-21\2019-05-28\13342
|
|
||||||
xlarge,BW3-20 [AMR-26],148596,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-20__AMR-26\2024-01-22\148596
|
|
||||||
xlarge,BW2-13 [AMR-24],11987,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-13__AMR-24\2016-12-07\11987
|
|
||||||
xlarge,BW1-7 [AMR-18],157743,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-7__AMR-18\2024-06-28\157743
|
|
||||||
xlarge,BW1-7 [AMR-18],11852,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-7__AMR-18\2016-08-30\11852
|
|
||||||
xlarge,BW2-10 [AMR-22],85215,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-10__AMR-22\2021-08-16\85215
|
|
||||||
xlarge,BW1-7 [AMR-18],8572,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-7__AMR-18\2014-01-06\8572
|
|
||||||
xlarge,BW1-4 [AMR-15],10206,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-4__AMR-15\2014-08-11\10206
|
|
||||||
xlarge,BW3-17 [AMR-20],13191,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-17__AMR-20\2019-02-25\13191
|
|
||||||
xlarge,BW3-20 [AMR-26],42786,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-20__AMR-26\2020-05-11\42786
|
|
||||||
xlarge,BW3-16 [AMR-16],11901,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-16__AMR-16\2016-10-03\11901
|
|
||||||
xlarge,BW1-4 [AMR-15],10073,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-4__AMR-15\2014-05-19\10073
|
|
||||||
xlarge,BW3-20 [AMR-26],13278,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW3-20__AMR-26\2019-04-16\13278
|
|
||||||
xlarge,BW2-10 [AMR-22],19711,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW2-10__AMR-22\2019-10-14\19711
|
|
||||||
xlarge,BW1-4 [AMR-15],10256,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-4__AMR-15\2014-09-15\10256
|
|
||||||
xlarge,BW1-4 [AMR-15],11035,\\192.168.1.192\projects\Code\spruce_scraper\archives\BW1-4__AMR-15\2015-06-29\11035
|
|
||||||
|
@@ -2,11 +2,13 @@
 """
 Report mosaic download progress from archives/scans.csv.

-Output is formatted as Markdown. Add --by-year for a per-machine ×
-per-year breakdown table.
+Output is Markdown. Use ``--by-year`` for a per-machine × per-year
+done/failed table. When the first mosaic pass is complete (no pending rows)
+but failures remain, a **Mosaic retry estimates** section is printed with
+queue counts and duration hints.

-Rate/ETA require two calls at least 60 s apart. Mean mosaic size is
-sampled from up to 100 already-downloaded files and cached for 1 hour.
+Rate/ETA use a 30-minute rolling window when snapshots show progress.
+Mean mosaic size is sampled from up to 100 downloads (1-hour cache).

 Usage:
     python scripts/mosaic_progress_report.py [--archive DIR] [--by-year]
@@ -30,6 +32,11 @@ _R_PRE19 = 1.00
 _R_PURGED = 0.00
 _R_RECENT = 0.82

+FIRST_PASS_FALLBACK_RATE_PER_HR = 1100.0
+RETRY_OPTIMISTIC_RATE_PER_HR = 1800.0
+RETRY_REALISTIC_RATE_PER_HR = 1100.0
+RETRY_PESSIMISTIC_RATE_PER_HR = 300.0
+

 # ---------------------------------------------------------------------------
 # Helpers
@@ -127,6 +134,12 @@ def _expected_remaining(pending_rows: list[dict]) -> float:
     return count


+def _retry_hours_from_rate(n_scans: int, rate_per_hr: float) -> str:
+    if n_scans <= 0 or rate_per_hr <= 0:
+        return "—"
+    return _fmt_duration(n_scans / rate_per_hr * 3600.0)
+
+
 # ---------------------------------------------------------------------------
 # Main
 # ---------------------------------------------------------------------------
@@ -213,15 +226,32 @@ def main() -> None:

     rate_per_sec: float | None = None
     rate_window_str = ""
+    snap_delta_proc = 0
     if recent:
         oldest = recent[0]
         dt = now.timestamp() - oldest["ts"]
         dp = processed - oldest["proc"]
+        snap_delta_proc = dp
         if dt >= 60 and dp > 0:
             rate_per_sec = dp / dt
             window_min = dt / 60
             rate_window_str = f"{window_min:.0f}-min avg"

+    # One-time baseline after initial mosaic crawl finished (no pending rows).
+    if pending == 0 and "first_pass_mean_rate_per_hr" not in cache:
+        cache["first_pass_completed_at"] = now.isoformat()
+        cache["first_pass_processed"] = total
+        cache["first_pass_mean_rate_per_hr"] = FIRST_PASS_FALLBACK_RATE_PER_HR
+
+    first_pass_rate_hr = float(
+        cache.get("first_pass_mean_rate_per_hr", FIRST_PASS_FALLBACK_RATE_PER_HR)
+    )
+    live_rate_hr = rate_per_sec * 3600 if rate_per_sec else None
+    # Active scrape shows progress in snapshots; idle archive shows dp == 0.
+    retry_estimate_rate_hr = (
+        live_rate_hr if live_rate_hr is not None else first_pass_rate_hr
+    )
+
     # --- Disk space ---
     mean_bytes: float | None = None
     size_note = ""
||||||
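The rate selection added in this hunk can be read in isolation. The following is a minimal sketch, not the script's actual code: `pick_retry_rate`, its parameter names, and the returned labels are hypothetical. Only the 60-second threshold, the "progress must be positive" condition, and the 1100 scans/hr fallback are taken from the diff.

```python
FIRST_PASS_FALLBACK_RATE_PER_HR = 1100.0  # value taken from the diff


def pick_retry_rate(
    dt_seconds: float,
    delta_processed: int,
    baseline_per_hr: float = FIRST_PASS_FALLBACK_RATE_PER_HR,
) -> tuple[float, str]:
    """Return (scans_per_hour, source_label). Hypothetical helper."""
    live_per_hr = None
    # A live rate is only trusted when the snapshot window spans at least
    # 60 s and at least one scan was processed inside it.
    if dt_seconds >= 60 and delta_processed > 0:
        live_per_hr = delta_processed / dt_seconds * 3600
    if live_per_hr is not None:
        return live_per_hr, "rolling window"
    # Idle archive or too-short window: fall back to the stored baseline.
    return baseline_per_hr, "first-pass baseline"
```

Returning the label together with the rate keeps the two from drifting apart, for example when progress was made but the window is still shorter than 60 s.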
@@ -325,6 +355,76 @@ def main() -> None:
         align=["l", "r", "r", "r", "r", "r"],
     ))

+    # -----------------------------------------------------------------------
+    # Retry estimates (first pass complete: pending == 0, failures remain)
+    # -----------------------------------------------------------------------
+    if failed > 0 and pending == 0:
+        failed_rows_list = [
+            r for r in latest.values()
+            if r.get("mosaic_download_status") == "failed"
+        ]
+        n_all = len(failed_rows_list)
+        n_2023 = sum(
+            1 for r in failed_rows_list
+            if (r.get("scan_time") or "")[:4] >= "2023"
+            and len((r.get("scan_time") or "")[:4]) == 4
+        )
+        n_200 = sum(
+            1 for r in failed_rows_list
+            if r.get("mosaic_error_code") == "200"
+        )
+        # Label must match the rate actually used: the live rate is only set
+        # when the snapshot window is >= 60 s and shows progress.
+        rate_note = (
+            "rolling 30-min window"
+            if live_rate_hr is not None
+            else f"first-pass baseline ({first_pass_rate_hr:,.0f}/hr)"
+        )
+        print()
+        print("### Mosaic retry estimates\n")
+        print(
+            f"*Suggested command after server fix:* "
+            f"`python scraper.py --retry-failed --workers 2` "
+            f"(filters: `--retry-since YEAR`, `--retry-error-code CODE`)\n"
+        )
+        print(
+            f"*ETA column uses **{retry_estimate_rate_hr:,.0f} scans/hr** "
+            f"({rate_note}). Fixed columns use scenario rates.*\n"
+        )
+        est_hdr = (
+            "Retry scope",
+            "Count",
+            f"@{RETRY_OPTIMISTIC_RATE_PER_HR:.0f}/hr",
+            f"@{RETRY_REALISTIC_RATE_PER_HR:.0f}/hr",
+            f"@{RETRY_PESSIMISTIC_RATE_PER_HR:.0f}/hr",
+            f"@{retry_estimate_rate_hr:.0f}/hr",
+        )
+        retry_tbl_rows = [
+            [
+                "HTTP 200 (empty body)",
+                f"{n_200:,}",
+                _retry_hours_from_rate(n_200, RETRY_OPTIMISTIC_RATE_PER_HR),
+                _retry_hours_from_rate(n_200, RETRY_REALISTIC_RATE_PER_HR),
+                _retry_hours_from_rate(n_200, RETRY_PESSIMISTIC_RATE_PER_HR),
+                _retry_hours_from_rate(n_200, retry_estimate_rate_hr),
+            ],
+            [
+                "Failed, scan_time ≥ 2023",
+                f"{n_2023:,}",
+                _retry_hours_from_rate(n_2023, RETRY_OPTIMISTIC_RATE_PER_HR),
+                _retry_hours_from_rate(n_2023, RETRY_REALISTIC_RATE_PER_HR),
+                _retry_hours_from_rate(n_2023, RETRY_PESSIMISTIC_RATE_PER_HR),
+                _retry_hours_from_rate(n_2023, retry_estimate_rate_hr),
+            ],
+            [
+                "**All failed**",
+                f"**{n_all:,}**",
+                _retry_hours_from_rate(n_all, RETRY_OPTIMISTIC_RATE_PER_HR),
+                _retry_hours_from_rate(n_all, RETRY_REALISTIC_RATE_PER_HR),
+                _retry_hours_from_rate(n_all, RETRY_PESSIMISTIC_RATE_PER_HR),
+                _retry_hours_from_rate(n_all, retry_estimate_rate_hr),
+            ],
+        ]
+        print(_md_table(list(est_hdr), retry_tbl_rows, align=["l", "r", "r", "r", "r", "r"]))
+
     # -----------------------------------------------------------------------
     # --by-year table
     # -----------------------------------------------------------------------
|
|||||||
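The retry table above is queue size divided by an assumed throughput, evaluated at three scenario rates. A minimal sketch of that arithmetic follows. The three rates are taken from the diff; `fmt_duration` is a hypothetical stand-in for `_fmt_duration`, whose body the diff does not show, so the exact output format here is an assumption.

```python
PLACEHOLDER = "\u2014"  # same dash the diff returns for an empty queue

# Scenario rates in scans per hour, as defined in the diff.
SCENARIO_RATES = {
    "optimistic": 1800.0,
    "realistic": 1100.0,
    "pessimistic": 300.0,
}


def fmt_duration(seconds: float) -> str:
    """Hypothetical formatter; the real _fmt_duration is not shown."""
    total_minutes = int(round(seconds / 60))
    hours, minutes = divmod(total_minutes, 60)
    return f"{hours}h {minutes:02d}m"


def retry_eta(n_scans: int, rate_per_hr: float) -> str:
    """Duration to retry n_scans at rate_per_hr, or a dash if undefined."""
    if n_scans <= 0 or rate_per_hr <= 0:
        return PLACEHOLDER
    return fmt_duration(n_scans / rate_per_hr * 3600.0)


def eta_row(n_scans: int) -> dict[str, str]:
    """One table row: an ETA per scenario for the same queue size."""
    return {name: retry_eta(n_scans, rate) for name, rate in SCENARIO_RATES.items()}
```

For a queue of 3,300 failed scans, `eta_row(3300)` gives 1h 50m, 3h 00m and 11h 00m for the optimistic, realistic and pessimistic scenarios.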
@@ -1,178 +0,0 @@
-#!/usr/bin/env bash
-# For each machine label in a text file, pick one random completed scan and download
-# it: by default the mosaic and all tiles (same as: --machine "…" --scan-id N).
-# For mosaic only (faster, no tile downloads), set: MOSAIC_ONLY=1
-#
-# Usage:
-# ./scripts/sample_random_scans.sh [PATH_TO_machines.txt]
-# Config path defaults to config.yaml in the repo root. Override with:
-# CONFIG=/path/to/config.yaml ./scripts/sample_random_scans.sh machines.txt
-# Dry-run the download step (listing still does real HTTP to fetch scan list):
-# DRY_RUN=1 ./scripts/sample_random_scans.sh machines.txt
-# Verbose / debug (extra per-step lines, scan counts from the list step):
-# DEBUG=1 ./scripts/sample_random_scans.sh machines.txt
-# By default, --list-scans fetches only the first page (one HTTP request, up to
-# 320 scans). To paginate the full archive for the random pick (slower when many
-# LIST_SCANS_ALL_PAGES=1 ./scripts/sample_random_scans.sh machines.txt
-#
-# machines.txt: one machine label per line (same as --machine and config machine names).
-# See scripts/machines.example.txt
-
-set -euo pipefail
-
-REPO_ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)"
-CONFIG="${CONFIG:-$REPO_ROOT/config.yaml}"
-MACHINES_FILE="${1:-$REPO_ROOT/machines.txt}"
-SCRAPER=(python3 "$REPO_ROOT/scraper.py" --config "$CONFIG")
-
-log() { echo "[sample_random_scans] $*" >&2; }
-log_debug() {
-  if [[ -n "${DEBUG:-}" ]]; then
-    echo "[sample_random_scans] debug: $*" >&2
-  fi
-}
-
-if [[ ! -f "$MACHINES_FILE" ]]; then
-  log "error: file not found: $MACHINES_FILE"
-  log "Create it with one machine label per line, or: cp scripts/machines.example.txt machines.txt"
-  exit 1
-fi
-
-if [[ ! -f "$CONFIG" ]]; then
-  log "error: config not found: $CONFIG"
-  exit 1
-fi
-
-# Non-empty, non-comment lines (same rules as the main loop)
-TOTAL_MACHINES="$(
-  grep -v '^[[:space:]]*#' "$MACHINES_FILE" | grep -c -v '^[[:space:]]*$' || true
-)"
-if [[ -z "$TOTAL_MACHINES" || "$TOTAL_MACHINES" -eq 0 ]]; then
-  log "error: no machine lines in: $MACHINES_FILE"
-  exit 1
-fi
-
-log "starting repo=$REPO_ROOT"
-log " config=$CONFIG"
-log " machines_file=$MACHINES_FILE (${TOTAL_MACHINES} machine(s) in file)"
-if [[ -n "${MOSAIC_ONLY:-}" ]]; then
-  if [[ -n "${DRY_RUN:-}" ]]; then
-    log " mode: MOSAIC_ONLY + DRY_RUN (mosaic only, --dry-run on download step)"
-  else
-    log " mode: MOSAIC_ONLY=1 (mosaics only, no tiles; use for a lighter sample)"
-  fi
-else
-  if [[ -n "${DRY_RUN:-}" ]]; then
-    log " mode: DRY_RUN (list + full scan download use --dry-run; no files written)"
-  else
-    log " mode: full scan — mosaic + all tiles (workers from config)"
-  fi
-fi
-if [[ -n "${DEBUG:-}" ]]; then
-  log " DEBUG=1 (extra diagnostics enabled)"
-fi
-if [[ -n "${LIST_SCANS_ALL_PAGES:-}" ]]; then
-  log " list step: list-scans = full archive (all pages, slower)"
-else
-  log " list step: list-scans --list-scans-first-page-only (one page, up to 320 IDs)"
-fi
-log "────────────────────────────────────────"
-
-export REPO_ROOT CONFIG
-[[ -n "${DEBUG:-}" ]] && export DEBUG
-[[ -n "${LIST_SCANS_ALL_PAGES:-}" ]] && export LIST_SCANS_ALL_PAGES
-
-PROCESSED=0
-SKIPPED=0
-IDX=0
-
-while IFS= read -r line || [[ -n "${line-}" ]]; do
-  # trim, strip CR, skip blanks / comments
-  line="${line//$'\r'/}"
-  label="${line#"${line%%[![:space:]]*}"}"
-  label="${label%"${label##*[![:space:]]}"}"
-  [[ -z "$label" || "$label" == \#* ]] && continue
-
-  IDX=$((IDX + 1))
-  log "[$IDX/$TOTAL_MACHINES] machine: $label"
-  log " status: listing scans (--list-scans) …"
-
-  random_id="$(
-    REPO_ROOT="$REPO_ROOT" CONFIG="$CONFIG" LABEL="$label" python3 - <<'PY'
-import os, random, subprocess, sys
-
-label = os.environ["LABEL"]
-repo = os.environ["REPO_ROOT"]
-cfg = os.environ["CONFIG"]
-debug = bool(os.environ.get("DEBUG"))
-full = bool(os.environ.get("LIST_SCANS_ALL_PAGES"))
-scraper = os.path.join(repo, "scraper.py")
-if debug:
-    print(
-        f"[sample_random_scans] debug: running list-scans for {label!r} "
-        f"({'all pages' if full else 'first page only'})",
-        file=sys.stderr,
-    )
-cmd = [sys.executable, scraper, "--list-scans", "--machine", label, "--config", cfg]
-if not full:
-    cmd.insert(3, "--list-scans-first-page-only")
-out = subprocess.check_output(
-    cmd,
-    text=True,
-    stderr=subprocess.STDOUT,
-)
-ids = []
-for line in out.splitlines():
-    line = line.rstrip()
-    if not line or line.startswith("---") or "Total" in line:
-        continue
-    parts = line.split()
-    if parts and parts[0].isdigit():
-        ids.append(parts[0])
-if not ids:
-    print(f"no scans parsed for {label!r} — check login and output", file=sys.stderr)
-    sys.exit(1)
-if debug:
-    print(
-        f"[sample_random_scans] debug: parsed {len(ids)} scan id(s) for {label!r}",
-        file=sys.stderr,
-    )
-print(random.choice(ids), end="")
-PY
-  )" || {
-    log " status: SKIPPED (could not get scan list or pick id)"
-    SKIPPED=$((SKIPPED + 1))
-    continue
-  }
-
-  log " status: picked random scan_id=$random_id (uniform among IDs from this list step — first page by default, see start banner)"
-  if [[ -n "${MOSAIC_ONLY:-}" ]]; then
-    log " status: running scraper: --mosaic-only --scan-id (mosaic only) …"
-  else
-    log " status: running scraper: --scan-id (mosaic + tiles) …"
-  fi
-  if [[ -n "${DRY_RUN:-}" ]]; then
-    log " status: (dry-run — no files written for this scan)"
-  fi
-
-  if [[ -n "${MOSAIC_ONLY:-}" ]]; then
-    run_cmd=("${SCRAPER[@]}" --mosaic-only --machine "$label" --scan-id "$random_id")
-  else
-    run_cmd=("${SCRAPER[@]}" --machine "$label" --scan-id "$random_id")
-  fi
-  if [[ -n "${DRY_RUN:-}" ]]; then
-    run_cmd+=(--dry-run)
-  fi
-  if "${run_cmd[@]}"; then
-    log " status: OK — finished this machine (exit 0)"
-    PROCESSED=$((PROCESSED + 1))
-  else
-    rc=$?
-    log " status: FAILED — scraper exit code $rc (stopping; fix or remove this machine and re-run)"
-    exit "$rc"
-  fi
-  log "────────────────────────────────────────"
-done < "$MACHINES_FILE"
-
-log "done. summary: $PROCESSED machine(s) with sampled scan download completed, $SKIPPED skipped, $IDX line(s) processed out of $TOTAL_MACHINES in file."
-exit 0
+47
-1
@@ -84,6 +84,33 @@ def parse_args() -> argparse.Namespace:
             "inventorying all scans across all machines."
         ),
     )
+    p.add_argument(
+        "--retry-failed",
+        action="store_true",
+        help=(
+            "Mosaic-only: re-attempt scans whose latest scans.csv row has "
+            "mosaic_download_status=failed (queue from CSV, not the server list). "
+            "Implies --mosaic-only."
+        ),
+    )
+    p.add_argument(
+        "--retry-since",
+        metavar="YEAR",
+        default=None,
+        help=(
+            "With --retry-failed: only scans with scan_time year >= YEAR "
+            "(e.g. 2023)."
+        ),
+    )
+    p.add_argument(
+        "--retry-error-code",
+        metavar="CODE",
+        default=None,
+        help=(
+            "With --retry-failed: filter by mosaic_error_code "
+            "(e.g. 200 for empty-body failures)."
+        ),
+    )
     p.add_argument(
         "--dry-run",
         action="store_true",
@@ -159,6 +186,16 @@ def main() -> None:
     if args.scan_id is not None and args.scan_id <= 0:
         sys.exit("--scan-id must be a positive integer")

+    if args.retry_since and not args.retry_failed:
+        sys.exit("--retry-since requires --retry-failed.")
+    if args.retry_error_code and not args.retry_failed:
+        sys.exit("--retry-error-code requires --retry-failed.")
+
+    if args.retry_failed:
+        if args.metadata_only:
+            sys.exit("--retry-failed cannot be used with --metadata-only.")
+        args.mosaic_only = True  # implied
+
     # --list-machines doesn't need credentials
     if args.list_machines:
         base_url = "http://205.149.147.131:8010/"
@@ -261,7 +298,13 @@ def main() -> None:
     if args.metadata_only:
         log.info("Mode: metadata only (mosaics and tiles skipped)")
     elif args.mosaic_only:
-        log.info("Mode: mosaics only (individual tiles skipped)")
+        if args.retry_failed:
+            log.info(
+                "Mode: mosaic retry (failed scans from %s)",
+                SCANS_CSV_FILENAME,
+            )
+        else:
+            log.info("Mode: mosaics only (individual tiles skipped)")
     if args.dry_run:
         log.info("Mode: dry-run (no files will be written)")

@@ -285,6 +328,9 @@ def main() -> None:
                 metadata_only=args.metadata_only,
                 scan_id_filter=args.scan_id,
                 max_tiles=args.max_tiles,
+                retry_failed=args.retry_failed,
+                retry_since_year=args.retry_since,
+                retry_error_code=args.retry_error_code,
             )
             totals.merge(stats)
         finally:
|
|||||||
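The flag rules in this file (the two filter flags require `--retry-failed`, which in turn implies `--mosaic-only` and conflicts with `--metadata-only`) can be exercised on their own. The sketch below is an illustration under assumptions: `parse_retry_args` is a hypothetical function, and it reports errors through `ArgumentParser.error` (exit status 2 plus a usage message) where the diff calls `sys.exit(...)` with a plain message.

```python
import argparse


def parse_retry_args(argv: list[str]) -> argparse.Namespace:
    """Parse only the retry-related flags and enforce their dependencies."""
    p = argparse.ArgumentParser(prog="scraper.py")
    p.add_argument("--retry-failed", action="store_true")
    p.add_argument("--retry-since", metavar="YEAR", default=None)
    p.add_argument("--retry-error-code", metavar="CODE", default=None)
    p.add_argument("--mosaic-only", action="store_true")
    p.add_argument("--metadata-only", action="store_true")
    args = p.parse_args(argv)

    # The filter flags are meaningless without the retry queue.
    if args.retry_since and not args.retry_failed:
        p.error("--retry-since requires --retry-failed.")
    if args.retry_error_code and not args.retry_failed:
        p.error("--retry-error-code requires --retry-failed.")

    if args.retry_failed:
        if args.metadata_only:
            p.error("--retry-failed cannot be used with --metadata-only.")
        args.mosaic_only = True  # implied, as in the diff
    return args
```

Because `--retry-since` is kept as a string, the later year filter compares strings, so a four-digit year is assumed.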
+114
-22
@@ -2,14 +2,22 @@
 High-level scrape orchestration: drives the per-machine and per-scan loops.
 """

+import csv
 import json
 import logging
+import time
 from concurrent.futures import ThreadPoolExecutor, as_completed
 from dataclasses import dataclass
 from pathlib import Path
 from typing import Any

+from tqdm import tqdm
+
 from spruce.download_result import PERMANENT_MISSING, UNKNOWN, error_code_str
+from spruce.exif import write_mosaic_exif
+from spruce.paths import machine_dir_name, tile_dest, mosaic_dest, _extract_date
+from spruce.progress import ProgressTracker, CsvWriter
+from spruce.session import MachineSession

 # RootView returns ~43-byte 1×1 JPEG placeholders for empty cells; stay well
 # below smallest observed real tile (~7 KiB in production samples).
@@ -49,16 +57,64 @@ class RunStats:
         self.scans_probe_skipped += other.scans_probe_skipped
         self.scans_disk_space_skipped += other.scans_disk_space_skipped

-from tqdm import tqdm
-
-from spruce.exif import write_mosaic_exif
-from spruce.paths import machine_dir_name, tile_dest, mosaic_dest, _extract_date
-from spruce.progress import ProgressTracker, CsvWriter
-from spruce.session import MachineSession
-
 log = logging.getLogger(__name__)


+def _read_scans_csv_latest(scans_csv_path: Path) -> dict[tuple[str, str], dict[str, str]]:
+    """Last row wins per (machine, scan_id)."""
+    latest: dict[tuple[str, str], dict[str, str]] = {}
+    if not scans_csv_path.exists():
+        return latest
+    with open(scans_csv_path, newline="", encoding="utf-8") as fh:
+        for row in csv.DictReader(fh):
+            key = (row.get("machine", ""), row.get("scan_id", ""))
+            latest[key] = row
+    return latest
+
+
+def load_failed_scans_from_csv(
+    scans_csv_path: Path,
+    machine_label: str,
+    *,
+    since_year: str | None = None,
+    error_code: str | None = None,
+) -> list[dict[str, Any]]:
+    """
+    Dedupe scans.csv by (machine, scan_id); return failed mosaic rows for one machine.
+
+    Each dict is suitable as the ``scan`` argument to ``process_scan`` (scan_id,
+    scan_time, name, status).
+    """
+    latest = _read_scans_csv_latest(scans_csv_path)
+    out: list[dict[str, Any]] = []
+    for (_m, _sid), row in latest.items():
+        if row.get("machine") != machine_label:
+            continue
+        if row.get("mosaic_download_status") != "failed":
+            continue
+        if error_code is not None and row.get("mosaic_error_code", "") != error_code:
+            continue
+        st = row.get("scan_time", "") or ""
+        if since_year is not None:
+            yr = st[:4]
+            if len(yr) < 4 or yr < since_year:
+                continue
+        sid = int(row["scan_id"])
+        out.append(
+            {
+                "scan_id": sid,
+                "scan_time": st,
+                "name": row.get("name", ""),
+                "status": row.get("status", "") or "Completed",
+                "user": row.get("user", ""),
+                "scan_lines": row.get("scan_lines", ""),
+                "scan_mode": row.get("scan_mode", ""),
+            }
+        )
+    out.sort(key=lambda s: s["scan_id"])
+    return out
+
+
 # ---------------------------------------------------------------------------
 # Per-scan helpers
 # ---------------------------------------------------------------------------
||||||
@@ -499,6 +555,9 @@ def scrape_machine(
|
|||||||
metadata_only: bool = False,
|
metadata_only: bool = False,
|
||||||
scan_id_filter: int | None = None,
|
scan_id_filter: int | None = None,
|
||||||
max_tiles: int | None = None,
|
max_tiles: int | None = None,
|
||||||
|
retry_failed: bool = False,
|
||||||
|
retry_since_year: str | None = None,
|
||||||
|
retry_error_code: str | None = None,
|
||||||
) -> RunStats:
|
) -> RunStats:
|
||||||
"""Login, fetch scans, and download all content for one machine."""
|
"""Login, fetch scans, and download all content for one machine."""
|
||||||
sess = MachineSession(machine, config)
|
sess = MachineSession(machine, config)
|
||||||
@@ -518,8 +577,37 @@ def scrape_machine(
|
|||||||
log.error("[%s] Login failed after 3 attempts — skipping machine.", machine["label"])
|
log.error("[%s] Login failed after 3 attempts — skipping machine.", machine["label"])
|
||||||
return RunStats()
|
return RunStats()
|
||||||
|
|
||||||
if scan_id_filter is not None:
|
if retry_failed:
|
||||||
scans: list[dict[str, Any]] = [
|
scans = load_failed_scans_from_csv(
|
||||||
|
scans_csv.path,
|
||||||
|
machine["label"],
|
||||||
|
since_year=retry_since_year,
|
||||||
|
error_code=retry_error_code,
|
||||||
|
)
|
||||||
|
if scan_id_filter is not None:
|
||||||
|
scans = [s for s in scans if s["scan_id"] == scan_id_filter]
|
||||||
|
if not scans:
|
||||||
|
log.warning(
|
||||||
|
"[%s] Retry: scan_id %d not among failed rows for this machine.",
|
||||||
|
machine["label"],
|
||||||
|
scan_id_filter,
|
||||||
|
)
|
||||||
|
return RunStats()
|
||||||
|
log.info("[%s] Mosaic retry: single scan %d.", machine["label"], scan_id_filter)
|
||||||
|
elif not scans:
|
||||||
|
log.warning(
|
||||||
|
"[%s] No failed mosaic rows in scans.csv match retry filters.",
|
||||||
|
machine["label"],
|
||||||
|
)
|
||||||
|
return RunStats()
|
||||||
|
else:
|
||||||
|
log.info(
|
||||||
|
"[%s] Mosaic retry: %d failed scan(s) from scans.csv.",
|
||||||
|
machine["label"],
|
||||||
|
len(scans),
|
||||||
|
)
|
||||||
|
elif scan_id_filter is not None:
|
||||||
|
scans = [
|
||||||
{"scan_id": scan_id_filter, "status": "Completed"}
|
{"scan_id": scan_id_filter, "status": "Completed"}
|
||||||
]
|
]
|
||||||
log.info("[%s] Targeting scan ID %d.", machine["label"], scan_id_filter)
|
log.info("[%s] Targeting scan ID %d.", machine["label"], scan_id_filter)
|
||||||
@@ -529,21 +617,25 @@ def scrape_machine(
|
|||||||
log.warning("[%s] No scans found.", machine["label"])
|
log.warning("[%s] No scans found.", machine["label"])
|
||||||
return RunStats()
|
return RunStats()
|
||||||
|
|
||||||
# Build a set of scan_ids already fully processed in a prior run so we can
|
# Build existing_ids: scan_ids to skip entirely (no metadata fetch, no HTTP).
|
||||||
# skip them entirely (no metadata fetch, no mosaic request).
|
# In normal mode: skip anything with a definitive non-pending status.
|
||||||
# Only scans with a definitive non-pending status count; skipped_metadata_only
|
# In retry mode: only skip scans that are already downloaded or skipped for
|
||||||
# rows still need to be processed in mosaic mode.
|
# disk-space reasons — failed scans must be re-attempted.
|
||||||
PENDING_STATUSES = {"skipped_metadata_only", ""}
|
PENDING_STATUSES = {"skipped_metadata_only", ""}
|
||||||
|
BLOCK_AFTER_RETRY_STATUSES = {"downloaded", "skipped_zero_disk_space"}
|
||||||
existing_ids: set[int] = set()
|
existing_ids: set[int] = set()
|
||||||
if not metadata_only and scans_csv._fh.name:
|
if not metadata_only:
|
||||||
existing_path = Path(scans_csv._fh.name)
|
latest_rows = _read_scans_csv_latest(scans_csv.path)
|
||||||
if existing_path.exists():
|
for (_mlabel, _sid), _row in latest_rows.items():
|
||||||
import csv as _csv
|
if _mlabel != machine["label"]:
|
||||||
with open(existing_path, newline="", encoding="utf-8") as _f:
|
continue
|
||||||
for _row in _csv.DictReader(_f):
|
st = _row.get("mosaic_download_status", "")
|
||||||
if _row.get("machine") == machine["label"]:
|
if retry_failed:
|
||||||
if _row.get("mosaic_download_status", "") not in PENDING_STATUSES:
|
if st in BLOCK_AFTER_RETRY_STATUSES:
|
||||||
existing_ids.add(int(_row["scan_id"]))
|
existing_ids.add(int(_row["scan_id"]))
|
||||||
|
else:
|
||||||
|
if st not in PENDING_STATUSES:
|
||||||
|
existing_ids.add(int(_row["scan_id"]))
|
||||||
|
|
||||||
stats = RunStats()
|
stats = RunStats()
|
||||||
for scan in scans:
|
for scan in scans:
|
||||||
|
|||||||
@@ -77,6 +77,7 @@ class CsvWriter:
|
|||||||
def __init__(self, path: Path, fields: list[str]) -> None:
|
def __init__(self, path: Path, fields: list[str]) -> None:
|
||||||
is_new = not path.exists()
|
is_new = not path.exists()
|
||||||
path.parent.mkdir(parents=True, exist_ok=True)
|
path.parent.mkdir(parents=True, exist_ok=True)
|
||||||
|
self.path = path
|
||||||
self._fh = open(path, "a", newline="", encoding="utf-8")
|
self._fh = open(path, "a", newline="", encoding="utf-8")
|
||||||
self._writer = csv.DictWriter(self._fh, fieldnames=fields)
|
self._writer = csv.DictWriter(self._fh, fieldnames=fields)
|
||||||
if is_new:
|
if is_new:
|
||||||
|
|||||||
@@ -0,0 +1,133 @@
|
|||||||
"""Retry queue loading from scans.csv (mosaic_download_status=failed)."""

import csv
from pathlib import Path

import pytest

from spruce.orchestrator import load_failed_scans_from_csv
from spruce.settings import SCANS_CSV_FIELDS


def _blank_row(**kwargs: str) -> dict[str, str]:
    row = {k: "" for k in SCANS_CSV_FIELDS}
    row.update(kwargs)
    return row


def _write_scans_csv(path: Path, rows: list[dict[str, str]]) -> None:
    with open(path, "w", newline="", encoding="utf-8") as fh:
        w = csv.DictWriter(fh, fieldnames=SCANS_CSV_FIELDS)
        w.writeheader()
        for r in rows:
            w.writerow({k: r.get(k, "") for k in SCANS_CSV_FIELDS})


def test_load_failed_scans_dedup_keeps_last_row(tmp_path: Path) -> None:
    path = tmp_path / "scans.csv"
    common = {
        "machine": "BW1 [X]",
        "machine_id": "1",
        "scan_id": "100",
        "mosaic_url": "http://x/m.jpg",
        "mosaic_local_path": "",
        "mosaic_on_disk": "False",
    }
    _write_scans_csv(
        path,
        [
            _blank_row(
                **common,
                mosaic_download_status="failed",
                mosaic_error_code="404",
                scan_time="2020-01-01",
            ),
            _blank_row(
                **common,
                mosaic_download_status="failed",
                mosaic_error_code="404",
                scan_time="2020-06-01",
            ),
        ],
    )
    out = load_failed_scans_from_csv(path, "BW1 [X]")
    assert len(out) == 1
    assert out[0]["scan_id"] == 100
    assert out[0]["scan_time"] == "2020-06-01"


def test_load_failed_scans_since_year(tmp_path: Path) -> None:
    path = tmp_path / "scans.csv"
    base = {
        "machine": "M",
        "machine_id": "1",
        "mosaic_url": "",
        "mosaic_local_path": "",
        "mosaic_on_disk": "",
        "mosaic_download_status": "failed",
        "mosaic_error_code": "404",
    }
    _write_scans_csv(
        path,
        [
            _blank_row(**base, scan_id="1", scan_time="2022-12-31"),
            _blank_row(**base, scan_id="2", scan_time="2023-01-01"),
            _blank_row(**base, scan_id="3", scan_time=""),
        ],
    )
    out = load_failed_scans_from_csv(path, "M", since_year="2023")
    ids = {s["scan_id"] for s in out}
    assert ids == {2}


def test_load_failed_scans_error_code(tmp_path: Path) -> None:
    path = tmp_path / "scans.csv"
    base = {
        "machine": "M",
        "machine_id": "1",
        "scan_time": "2024-01-01",
        "mosaic_url": "",
        "mosaic_local_path": "",
        "mosaic_on_disk": "",
        "mosaic_download_status": "failed",
    }
    _write_scans_csv(
        path,
        [
            _blank_row(**base, scan_id="10", mosaic_error_code="404"),
            _blank_row(**base, scan_id="11", mosaic_error_code="200"),
        ],
    )
    out = load_failed_scans_from_csv(path, "M", error_code="200")
    assert [s["scan_id"] for s in out] == [11]


def test_load_failed_scans_excludes_downloaded(tmp_path: Path) -> None:
    path = tmp_path / "scans.csv"
    base = {
        "machine": "M",
        "machine_id": "1",
        "scan_time": "2024-01-01",
        "mosaic_url": "",
        "mosaic_local_path": "",
        "mosaic_on_disk": "True",
    }
    _write_scans_csv(
        path,
        [
            _blank_row(
                **base,
                scan_id="5",
                mosaic_download_status="downloaded",
                mosaic_error_code="",
            ),
            _blank_row(
                **base,
                scan_id="6",
                mosaic_download_status="failed",
                mosaic_error_code="404",
            ),
        ],
    )
    out = load_failed_scans_from_csv(path, "M")
    assert [s["scan_id"] for s in out] == [6]
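
The skip-set rule described in the `scrape_machine` comments (normal mode skips every scan with a definitive non-pending status; retry mode skips only scans already downloaded or skipped for disk-space reasons) can be sketched in isolation. This is a minimal illustration, not the code from the diff: `build_skip_ids` is a hypothetical helper name, and the sample rows are invented for the example. The two status sets are copied from the diff.

```python
# Status sets as they appear in the diff.
PENDING_STATUSES = {"skipped_metadata_only", ""}
BLOCK_AFTER_RETRY_STATUSES = {"downloaded", "skipped_zero_disk_space"}


def build_skip_ids(latest_rows, machine_label, retry_failed):
    """Hypothetical helper: return the scan_ids to skip for one machine.

    latest_rows maps (machine, scan_id) -> latest CSV row, i.e. the shape
    returned by _read_scans_csv_latest in the diff.
    """
    skip = set()
    for (label, _sid), row in latest_rows.items():
        if label != machine_label:
            continue
        status = row.get("mosaic_download_status", "")
        if retry_failed:
            # Retry mode: failed scans must be re-attempted, so only
            # finished or disk-space-skipped scans are blocked.
            blocked = status in BLOCK_AFTER_RETRY_STATUSES
        else:
            # Normal mode: anything with a definitive status is blocked.
            blocked = status not in PENDING_STATUSES
        if blocked:
            skip.add(int(row["scan_id"]))
    return skip


# Invented sample data for illustration only.
rows = {
    ("M", "1"): {"scan_id": "1", "mosaic_download_status": "downloaded"},
    ("M", "2"): {"scan_id": "2", "mosaic_download_status": "failed"},
    ("M", "3"): {"scan_id": "3", "mosaic_download_status": "skipped_metadata_only"},
    ("N", "4"): {"scan_id": "4", "mosaic_download_status": "downloaded"},
}
normal_skip = build_skip_ids(rows, "M", retry_failed=False)
retry_skip = build_skip_ids(rows, "M", retry_failed=True)
```

With these rows, normal mode skips scans 1 and 2, while retry mode skips only scan 1, so the failed scan 2 stays eligible for another attempt.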