Add sample_random_scans script and first-page list-scans option

- scripts/sample_random_scans.sh: pick a random scan per machine (default: from the first list page) and download its mosaic and/or tiles
- --list-scans-first-page-only: one HTTP request for the scan list (up to 320 IDs)
- scripts/machines.example.txt: example machine list; the local machines.txt (copied from the example) is added to .gitignore
- README: document usage
@@ -81,6 +81,9 @@ python scraper.py --list-machines
# List all scans for a machine
python scraper.py --list-scans --machine "BW3-20 [AMR-26]"
# List only the first table page (one HTTP call; up to 320 scans, whichever the server orders first)
python scraper.py --list-scans --list-scans-first-page-only --machine "BW3-20 [AMR-26]"
# Preview what would be downloaded (dry run)
python scraper.py --machine "BW3-20 [AMR-26]" --dry-run
@@ -94,6 +97,11 @@ python scraper.py --machine "BW3-20 [AMR-26]" --mosaic-only
# Download mosaics for all machines
python scraper.py --mosaic-only
# One random completed scan per machine: mosaic + all tiles (from machines.txt; uses --list-scans + --scan-id)
# MOSAIC_ONLY=1 ./scripts/sample_random_scans.sh machines.txt # optional: mosaics only, no tiles
# cp scripts/machines.example.txt machines.txt # then edit: one label per line
# ./scripts/sample_random_scans.sh machines.txt
# Download all tiles for a specific scan
python scraper.py --machine "BW3-20 [AMR-26]" --scan-id 158374 --workers 4
@@ -115,6 +123,7 @@ python scraper.py --machine "BW3-20 [AMR-26]" --scan-id 158374 --workers 4
| `--recheck` | Scan archive for zero-byte/missing tiles and mosaics; remove bad entries from `.progress.json` so they re-download on next run |
| `--list-machines` | Print all machines and exit |
| `--list-scans` | Print all scans for `--machine` and exit |
| `--list-scans-first-page-only` | With `--list-scans`: a single list request (up to 320 scans) instead of paginating the full history |
| `--verbose` / `-v` | Debug logging |
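
The random-sampling flow described above (list the first page, pick one scan ID, download it) might look roughly like the following. This is a sketch under stated assumptions, not the contents of `scripts/sample_random_scans.sh`: the function names are hypothetical, the `--list-scans` output is assumed to start each scan row with a numeric ID, and combining `--mosaic-only` with `--scan-id` is assumed to be accepted by `scraper.py`.

```shell
#!/usr/bin/env bash
# Sketch only; NOT the real scripts/sample_random_scans.sh.
set -u

# Print one line chosen uniformly at random from stdin (prints nothing if stdin is empty).
pick_random_line() {
  awk 'BEGIN { srand() } { lines[NR] = $0 } END { if (NR > 0) print lines[int(rand() * NR) + 1] }'
}

# ASSUMPTION: each scan row printed by --list-scans begins with the numeric scan ID.
extract_scan_ids() {
  awk '$1 ~ /^[0-9]+$/ { print $1 }'
}

# Hypothetical helper: sample one scan for a single machine label and download it.
sample_machine() {
  local machine="$1" scan_id
  # stdin is redirected so the scraper cannot consume the machines file read by the caller.
  scan_id="$(python scraper.py --list-scans --list-scans-first-page-only \
    --machine "$machine" < /dev/null | extract_scan_ids | pick_random_line)"
  if [ -z "$scan_id" ]; then
    echo "no scans found for $machine" >&2
    return 0
  fi
  if [ "${MOSAIC_ONLY:-0}" = "1" ]; then
    # ASSUMPTION: --mosaic-only can be combined with --scan-id.
    python scraper.py --machine "$machine" --scan-id "$scan_id" --mosaic-only < /dev/null
  else
    python scraper.py --machine "$machine" --scan-id "$scan_id" < /dev/null
  fi
}

# Run only when a machines file is passed, so the functions can also be sourced on their own.
if [ "$#" -ge 1 ]; then
  while IFS= read -r machine || [ -n "$machine" ]; do
    [ -z "$machine" ] && continue   # skip blank lines
    sample_machine "$machine"
  done < "$1"
fi
```

The selection is done in `awk` rather than with `shuf` so the sketch does not depend on GNU coreutils being installed; either approach would serve if the real list output matches the assumed shape.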
### `config.yaml` (optional keys)