Scripts Overview
Use this command prefix for all script commands:
./
All commands below assume you are in dotlineform-site/.
Local environment variables (required for media/generation scripts):
export DOTLINEFORM_PROJECTS_BASE_DIR="/path/to/dotlineform"
export DOTLINEFORM_MEDIA_BASE_DIR="/path/to/dotlineform-icloud"
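A quick pre-flight check can confirm that both variables are set before running the media/generation scripts. This is a minimal sketch; the helper name `missing_env_vars` is hypothetical and is not part of the repo's scripts.

```python
import os

# Variable names come from the exports above; the helper itself is hypothetical.
REQUIRED_VARS = ("DOTLINEFORM_PROJECTS_BASE_DIR", "DOTLINEFORM_MEDIA_BASE_DIR")


def missing_env_vars(env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

Calling `missing_env_vars()` with no argument inspects the current process environment; passing a dict makes the check easy to exercise in isolation.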
Sorting behavior and consistency contract:
docs/sorting-architecture.md
Deferred improvements and follow-up items:
docs/backlog.md
docs/css-audit-spec.md
docs/css-audit-latest.md
Main Pipeline
Run everything (copy -> srcset -> page generation):
./scripts/run_draft_pipeline.py --dry-run
./scripts/run_draft_pipeline.py
Useful flags:
- --dry-run: preview only (no workbook writes/deletes)
- --force-generate: pass --force through to generate_work_pages.py
- --jobs N: srcset parallel jobs (default: 4, or MAKE_SRCSET_JOBS env var)
- --mode work|work_details|moment: run only selected flow(s). Repeat flag to run multiple.
- --work-ids, --work-ids-file: limit work + work_details scope
- --series-ids, --series-ids-file: pass series scope to generation
- --moment-ids, --moment-ids-file: limit moment scope
- --xlsx PATH: workbook path override
- --input-dir, --output-dir: works source/derivative dirs
- --detail-input-dir, --detail-output-dir: work_details source/derivative dirs
- --moment-input-dir, --moment-output-dir: moments source/derivative dirs
Mode examples:
./scripts/run_draft_pipeline.py --mode moment --dry-run
./scripts/run_draft_pipeline.py --mode work --mode work_details --dry-run
./scripts/run_draft_pipeline.py --mode moment --moment-ids blue-sky,compiled --dry-run
./scripts/run_draft_pipeline.py --mode work --work-ids 00456 --dry-run
Note: when --mode work is used and no --series-ids* are provided, draft series are auto-included in generation.
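The repeatable `--mode` flag and the comma-separated ID lists in the examples above map naturally onto `argparse`. The sketch below covers only a subset of the flags and is not the actual parser in run_draft_pipeline.py; in particular, treating an omitted `--mode` as "run every flow" is an assumption.

```python
import argparse

MODES = ("work", "work_details", "moment")


def split_ids(value):
    """Split a comma-separated ID string, dropping blanks."""
    return [part.strip() for part in value.split(",") if part.strip()]


def build_parser():
    parser = argparse.ArgumentParser(description="Illustrative subset of the pipeline flags")
    # action="append" collects every repeated --mode into one list.
    parser.add_argument("--mode", action="append", choices=MODES)
    parser.add_argument("--work-ids", type=split_ids, default=[])
    parser.add_argument("--moment-ids", type=split_ids, default=[])
    parser.add_argument("--dry-run", action="store_true")
    return parser


def parse(argv):
    args = build_parser().parse_args(argv)
    # Assumption: omitting --mode runs every flow.
    args.mode = args.mode or list(MODES)
    return args
```

For example, `parse(["--mode", "moment", "--moment-ids", "blue-sky,compiled"])` yields `mode == ["moment"]` and `moment_ids == ["blue-sky", "compiled"]`.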
Individual Scripts
1) Copy draft source images from workbook
Unified script with mode flags:
./scripts/copy_draft_media_files.py --mode work --ids-file /tmp/work_ids.txt --copied-ids-file /tmp/copied_work_ids.txt --write
./scripts/copy_draft_media_files.py --mode work_details --ids-file /tmp/detail_uids.txt --copied-ids-file /tmp/copied_detail_uids.txt --write
./scripts/copy_draft_media_files.py --mode moment --ids-file /tmp/moment_ids.txt --copied-ids-file /tmp/copied_moment_ids.txt --write
Flags:
- --mode work|work_details|moment
- --ids-file: optional filter manifest (one ID per line)
- --copied-ids-file: optional output manifest of successfully copied IDs
- --write: perform copy (omit for dry-run)
- --keep-ext / --no-ext: keep/remove source extension in copied filename
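The ID manifests are plain text with one ID per line, so reading one back takes only a few lines. This is a sketch, not the script's own reader: skipping blank lines and dropping duplicate IDs are assumptions, and both function names are hypothetical.

```python
from pathlib import Path


def parse_id_manifest(text):
    """Parse manifest text into an ordered list of unique, non-empty IDs."""
    ids = []
    for line in text.splitlines():
        item = line.strip()
        # Assumption: blank lines and repeated IDs are ignored.
        if item and item not in ids:
            ids.append(item)
    return ids


def read_id_manifest(path):
    """Read a manifest file such as /tmp/work_ids.txt."""
    return parse_id_manifest(Path(path).read_text(encoding="utf-8"))
```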
2) Build srcset derivatives
MAKE_SRCSET_WORK_IDS_FILE=/tmp/copied_work_ids.txt \
MAKE_SRCSET_2400_IDS_FILE=/tmp/work_2400_ids.txt \
MAKE_SRCSET_SUCCESS_IDS_FILE=/tmp/work_success_ids.txt \
bash scripts/make_srcset_images.sh \
"$DOTLINEFORM_MEDIA_BASE_DIR/works/make_srcset_images" \
"$DOTLINEFORM_MEDIA_BASE_DIR/works/srcset_images" \
4
Moments example (no 2400):
: > /tmp/empty_2400_ids.txt
MAKE_SRCSET_WORK_IDS_FILE=/tmp/copied_moment_ids.txt \
MAKE_SRCSET_2400_IDS_FILE=/tmp/empty_2400_ids.txt \
MAKE_SRCSET_SUCCESS_IDS_FILE=/tmp/moment_success_ids.txt \
bash scripts/make_srcset_images.sh \
"$DOTLINEFORM_MEDIA_BASE_DIR/moments/make_srcset_images" \
"$DOTLINEFORM_MEDIA_BASE_DIR/moments/srcset_images" \
4
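When driving the srcset step from Python rather than the shell, the same invocation can be assembled as an argument list plus an environment mapping. The helper below is hypothetical and only builds the command; it mirrors the two shell examples above and does not run anything.

```python
import os


def build_srcset_invocation(media_base, kind, ids_file, ids_2400_file, success_file, jobs=4):
    """Return (argv, env) for make_srcset_images.sh; kind is e.g. "works" or "moments"."""
    env = dict(os.environ)
    env.update({
        "MAKE_SRCSET_WORK_IDS_FILE": ids_file,
        "MAKE_SRCSET_2400_IDS_FILE": ids_2400_file,
        "MAKE_SRCSET_SUCCESS_IDS_FILE": success_file,
    })
    argv = [
        "bash",
        "scripts/make_srcset_images.sh",
        f"{media_base}/{kind}/make_srcset_images",  # source dir
        f"{media_base}/{kind}/srcset_images",       # derivative dir
        str(jobs),                                  # parallel jobs
    ]
    return argv, env
```

To execute it, pass the result to `subprocess.run(argv, env=env, check=True)` from the repo root.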
3) Generate Jekyll pages from workbook
./scripts/generate_work_pages.py data/works.xlsx
./scripts/generate_work_pages.py data/works.xlsx --write
Common scoped runs:
./scripts/generate_work_pages.py data/works.xlsx --work-ids 00456 --write
./scripts/generate_work_pages.py data/works.xlsx --work-ids-file /tmp/work_ids.txt --write
./scripts/generate_work_pages.py data/works.xlsx --series-ids curve-poems,dots --write
Useful flags:
- --write: persist file/workbook changes
- --force: regenerate even when checksums match
- --work-ids, --work-ids-file
- --series-ids, --series-ids-file
- --moment-ids, --moment-ids-file
- --works-files-dir (default assets/works/files)
- --moments-sheet (default Moments)
- --moments-output-dir (default _moments)
- --moments-prose-dir (default _includes/moments_prose)
- --projects-base-dir: base path used for source-image dimension reads
  - default is taken from DOTLINEFORM_PROJECTS_BASE_DIR
- --series-index-json-path (default assets/data/series_index.json)
- --works-index-json-path (default assets/data/works_index.json)
- --work-details-index-json-path (default assets/data/work_details_index.json)
- --only: limit generation to selected artifacts
  - allowed: work-pages, works-curator-pages, work-files, series-pages, series-index-json, work-details-pages, work-json, works-index-json, work-details-index-json, moments
  - coupling: works-curator-pages runs only when explicitly included in --only
  - work-pages: writes _works/<work_id>.md as lightweight stubs (work_id, title, layout, checksum) plus optional prose include
  - series-pages: writes _series/<series_id>.md as lightweight stubs (series_id, title, layout, checksum) plus prose include
  - work-details-pages: writes _work_details/<detail_uid>.md as lightweight stubs (work_id, detail_id, detail_uid, title)
  - series-index-json: writes assets/data/series_index.json (full rebuild) with:
    - header: schema, deterministic content version, generated_at_utc, count
    - series map keyed by series_id
    - full series metadata used by generated series pages (layout, status, published_date, title, title_sort, sort_fields, series_type, year, year_display, primary_work_id, notes, project_folders, checksum)
    - ordered works (in canonical series sort order derived from sort_fields) and thumb selection
  - works-index-json: writes assets/data/works_index.json as a lightweight object keyed by work_id
    - each work keeps backward-compatible series_id as the first series and adds ordered series_ids
    - runtime thumb paths are derived from work_id, so no media/thumb payload is persisted here
    - always rebuilt as a full index (not scoped by --work-ids)
  - work-details-index-json: writes assets/data/work_details_index.json as a lightweight object keyed by detail_uid
    - always rebuilt as a full index (not scoped by --work-ids)
  - work-json: writes assets/works/index/<work_id>.json with header version/checksums, full work, and full sections[].details[]
    - work.series_id remains the first series and work.series_ids preserves the full ordered membership list from the workbook
    - work-driven: emits one file per selected work_id (uses sections: [] when a work has no details)
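The `--only` rules above (a fixed set of allowed artifacts, with works-curator-pages running only when named explicitly) can be sketched as a small resolver. This is an illustration under assumptions, not the logic in generate_work_pages.py: the function name is hypothetical, and the default of "everything except opt-in artifacts" when `--only` is absent is inferred from the coupling note.

```python
ALLOWED_ARTIFACTS = (
    "work-pages", "works-curator-pages", "work-files", "series-pages",
    "series-index-json", "work-details-pages", "work-json",
    "works-index-json", "work-details-index-json", "moments",
)
OPT_IN_ONLY = {"works-curator-pages"}


def select_artifacts(only=None):
    """Resolve which artifacts to generate from an optional --only list."""
    if not only:
        # Assumption: with no --only, everything runs except opt-in artifacts.
        return [a for a in ALLOWED_ARTIFACTS if a not in OPT_IN_ONLY]
    unknown = [a for a in only if a not in ALLOWED_ARTIFACTS]
    if unknown:
        raise ValueError("unknown --only value(s): " + ", ".join(unknown))
    requested = set(only)
    # Return in canonical order regardless of the order given on the CLI.
    return [a for a in ALLOWED_ARTIFACTS if a in requested]
```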
Runtime canonical data flow:
- /series/ and /series/<series_id>/ read assets/data/series_index.json.
- /series/<series_id>/ also reads assets/data/works_index.json for card metadata.
- /works/<work_id>/ series nav/counter/link visibility read assets/data/series_index.json.
- /work_details/ reads assets/data/work_details_index.json.
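To make the series-page flow concrete, here is a rough sketch of joining the two indexes to build card data. The JSON shapes are assumptions made for illustration (a top-level `series` map whose entries carry an ordered `works` list of work IDs, and a works index keyed directly by `work_id`), and Python is used only as a neutral notation, however the site performs these reads at runtime.

```python
def series_cards(series_index, works_index, series_id):
    """Return card metadata for a series, in the order stored in series_index."""
    # Assumption: series_index["series"][series_id]["works"] is an ordered list of work IDs.
    series = series_index["series"][series_id]
    cards = []
    for work_id in series.get("works", []):
        # Assumption: works_index maps work_id -> lightweight metadata.
        meta = works_index.get(work_id, {})
        cards.append({"work_id": work_id, "title": meta.get("title", "")})
    return cards
```

The ordering comes entirely from the series index, which matches the note above that works are stored in canonical series sort order.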
3b) Tag Studio local save server
Run:
python3 scripts/studio/tag_write_server.py
Optional flags:
- --port 8787: override port
- --repo-root /path/to/dotlineform-site: override root auto-detection (_config.yml parent search)
- --dry-run: validate and return response without writing files
Behavior:
- Exposes:
  - GET /health
  - POST /save-tags
  - POST /import-tag-registry
  - POST /import-tag-aliases
  - POST /delete-tag-alias
  - POST /mutate-tag-alias-preview
  - POST /mutate-tag-alias
  - POST /promote-tag-alias-preview
  - POST /promote-tag-alias
  - POST /import-tag-assignments-preview
  - POST /import-tag-assignments
  - POST /demote-tag-preview
  - POST /demote-tag
  - POST /mutate-tag-preview
  - POST /mutate-tag
- Tag Studio page probes /health and shows:
  - Save mode: Local server when available
  - Save mode: Offline session when unavailable or when a staged local row already exists for the current series
  - POST /save-tags expects assignment objects in tags:
    - series save payload: { "series_id": "<series>", "tags": [...] }
    - work override save payload: { "series_id": "<series>", "work_id": "<work_id>", "keep_work": true|false, "tags": [...] }
    - { "tag_id": "<group>:<slug>", "w_manual": 0.3|0.6|0.9, "alias"?: "<alias>" }
    - alias is optional historical data only; it records that the tag was chosen from an alias match and is not treated as canonical
  - save writes assets/studio/data/tag_assignments.json with object-only tag rows (no string tags)
  - save is diff-based in the Series Tag Editor: the UI compares current series/work state against the last loaded/saved baseline and sends one /save-tags request for the series row when needed plus one request per changed work row
  - when multiple work pills are selected in the Series Tag Editor, the active work’s current override set is used as the persisted state for all selected work pills
  - work override saves strip tags already inherited from series[*].tags
  - keep_work: false plus empty tags deletes series[*].works[work_id]
  - keep_work: true allows an explicit work row with tags: []
  - offline-session staging stores full normalized series rows in browser localStorage, including optional historical alias
  - Series Tags page can export that session as JSON or preview/apply it through the local server
  - assignment import preview/apply compares full normalized rows, including alias, and resolves conflicts per series via overwrite or skip
- Tag Registry page probes /health and shows:
  - Import mode: Local server when available
  - Import mode: Patch when unavailable (fallback to manual patch copy)
  - tag edit/delete requires local server mode
  - New tag button opens create modal:
    - group selected via group pills; slug + optional description entry
    - live duplicate check blocks existing <group>:<slug>
    - local mode uses POST /import-tag-registry with mode: add and a single tag payload
    - patch mode emits add-tag row snippet
  - Import modes supported by endpoint:
    - add (no overwrite)
    - merge (add + overwrite)
    - replace (replace entire registry)
  - successful import responses include summary_text (same format used by Tag Registry UI and server log)
  - import request may include import_filename; server logs basename only (no client path)
  - tag mutation endpoint behavior (POST /mutate-tag):
    - action: edit: update canonical tag description (tag id/name remains fixed in UI flow)
    - action: delete: remove tag
    - delete cascades update tag_assignments.json and tag_aliases.json
    - aliases that become 1:1 self-maps (alias == target slug) are removed automatically
  - preview endpoint (POST /mutate-tag-preview) returns the same impact stats without writing files
  - tag demotion behavior:
    - trigger via tag pill <- action in registry list
    - preview via POST /demote-tag-preview is required before confirm
    - apply via POST /demote-tag
    - demotion removes canonical tag from registry, creates alias (<slug> -> chosen canonical targets), and rewrites assignments/alias target refs
    - patch fallback emits ordered manual steps only
- Tag Aliases page probes /health and shows:
  - Import mode: Local server when available
  - Import mode: Patch when unavailable (fallback to manual patch copy)
  - alias pill × delete:
    - local mode uses POST /delete-tag-alias
    - patch mode generates manual snippet with aliases_to_remove
  - alias text click opens edit modal:
    - live alias-name uniqueness validation
    - editable description + selected tags
    - selected tags must be canonical and satisfy: max 4 tags, max 1 per group
    - local mode uses POST /mutate-tag-alias
    - patch mode emits ordered set_alias / remove_alias_key steps
  - New alias button opens create modal:
    - same alias/tag validation and tag-picker behavior as edit modal
    - local mode uses POST /import-tag-aliases with mode: add and a single alias payload
    - patch mode emits add-alias fragment snippet
  - alias pill → promote:
    - user chooses target group at action time
    - preview via POST /promote-tag-alias-preview is required before confirm
    - apply via POST /promote-tag-alias
    - canonical tag id is <group>:<alias-slug>; label auto-derived from slug
    - if canonical already exists, alias key is removed only
    - patch fallback emits ordered manual steps only
  - Import modes supported by endpoint:
    - add (no overwrite)
    - merge (add + overwrite)
    - replace (replace entire aliases)
  - successful import responses include summary_text and import_filename (basename only)
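The three import modes shared by the registry and alias endpoints (add, merge, replace) differ only in how existing entries are treated. The sketch below shows the intended semantics on plain dictionaries; modelling the stored data as a mapping keyed by tag id or alias is an assumption, and the real JSON files may be structured differently.

```python
def apply_import(existing, incoming, mode):
    """Combine two mappings (keyed by tag id or alias) under an import mode."""
    if mode == "add":
        # add: new keys only, never overwrite an existing entry.
        result = dict(existing)
        for key, value in incoming.items():
            result.setdefault(key, value)
        return result
    if mode == "merge":
        # merge: add new keys and overwrite existing ones.
        return {**existing, **incoming}
    if mode == "replace":
        # replace: the incoming data becomes the whole data set.
        return dict(incoming)
    raise ValueError(f"unsupported import mode: {mode}")
```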
Security constraints:
- Binds to loopback interface only (local machine only)
- CORS allows loopback origins only
- Write target is allowlisted to these files only:
  - assets/studio/data/tag_assignments.json
  - assets/studio/data/tag_registry.json
  - assets/studio/data/tag_aliases.json
- timestamped backups are created in var/studio/backups/:
  - tag_assignments.json.bak-YYYYMMDD-HHMMSS
  - tag_registry.json.bak-YYYYMMDD-HHMMSS
  - tag_aliases.json.bak-YYYYMMDD-HHMMSS
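The write allowlist and the backup naming pattern above could be expressed roughly as follows. The function names are hypothetical and this is not the server's code; whether the timestamp is local time or UTC is not stated here, so the caller supplies it.

```python
from datetime import datetime
from pathlib import PurePosixPath

ALLOWED_WRITE_TARGETS = {
    "assets/studio/data/tag_assignments.json",
    "assets/studio/data/tag_registry.json",
    "assets/studio/data/tag_aliases.json",
}
BACKUP_DIR = "var/studio/backups"


def is_allowed_target(rel_path):
    """True only for an exact, repo-relative match against the allowlist."""
    return PurePosixPath(rel_path).as_posix() in ALLOWED_WRITE_TARGETS


def backup_path(rel_path, now):
    """Return the backup path <name>.bak-YYYYMMDD-HHMMSS for an allowed target."""
    if not is_allowed_target(rel_path):
        raise ValueError(f"write target not allowlisted: {rel_path}")
    stamp = now.strftime("%Y%m%d-%H%M%S")
    return f"{BACKUP_DIR}/{PurePosixPath(rel_path).name}.bak-{stamp}"
```

Because the comparison is an exact string match, paths containing `..` segments are rejected rather than normalized.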
Script logging:
- Per-script logs are written to repo-root log directories (auto-created).
- Current pipeline/logged scripts include:
  - scripts/run_draft_pipeline.py -> logs/run_draft_pipeline.log
  - scripts/generate_work_pages.py -> logs/generate_work_pages.log
  - scripts/studio/tag_write_server.py -> var/studio/logs/tag_write_server.log
- Log format is JSON Lines (one JSON object per line).
- Retention policy:
  - keep entries from the last 30 days
  - if no entries fall inside the last 30 days, keep the latest 1 day’s worth (based on newest entry)
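The retention policy can be read as a two-step filter over the JSON Lines entries. The sketch below is one way to express it and is not the scripts' own pruning code; the timestamp field name `ts` and its ISO 8601 format with an explicit offset are assumptions.

```python
import json
from datetime import datetime, timedelta, timezone


def apply_retention(lines, now, keep_days=30, fallback_days=1, ts_key="ts"):
    """Return the log lines to keep under the 30-day / latest-1-day policy."""
    entries = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        # Assumption: each entry carries an ISO 8601 timestamp under ts_key.
        entries.append((datetime.fromisoformat(record[ts_key]), line))
    if not entries:
        return []
    cutoff = now - timedelta(days=keep_days)
    recent = [line for ts, line in entries if ts >= cutoff]
    if recent:
        return recent
    # Nothing inside the window: keep one day's worth, measured from the newest entry.
    newest = max(ts for ts, _ in entries)
    fallback_cutoff = newest - timedelta(days=fallback_days)
    return [line for ts, line in entries if ts >= fallback_cutoff]
```

Pass a timezone-aware `now` (for example `datetime.now(timezone.utc)`) so it compares cleanly with offset-bearing timestamps.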
3c) CSS token audit
Run:
python3 scripts/css_token_audit.py
Optional flags:
- --md-out docs/css-audit-latest.md: override Markdown output path
- assets/css/main.css assets/studio/css/studio.css: optional file list override
Behavior:
- scans CSS for font-size declarations and color literals
- reports repeated raw typography values and direct color literals
- writes the current snapshot to docs/css-audit-latest.md
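As a rough illustration of the kind of scan described above, the following sketch counts raw font-size values and color literals with regular expressions. It is not the implementation in css_token_audit.py; the patterns are simplified assumptions (for example, a short ID selector made only of hex characters would be miscounted as a color).

```python
import re
from collections import Counter

FONT_SIZE_RE = re.compile(r"font-size\s*:\s*([^;}]+)", re.IGNORECASE)
HEX_COLOR_RE = re.compile(r"#(?:[0-9a-fA-F]{3,4}|[0-9a-fA-F]{6}|[0-9a-fA-F]{8})\b")
FUNC_COLOR_RE = re.compile(r"\b(?:rgb|rgba|hsl|hsla)\([^)]*\)", re.IGNORECASE)


def audit_css(css_text):
    """Report repeated raw font-size values and every direct color literal."""
    sizes = Counter(
        value
        for value in (m.group(1).strip() for m in FONT_SIZE_RE.finditer(css_text))
        if "var(" not in value  # token references are not raw values
    )
    colors = Counter(m.group(0).lower() for m in HEX_COLOR_RE.finditer(css_text))
    colors.update(m.group(0).lower() for m in FUNC_COLOR_RE.finditer(css_text))
    repeated = {value: n for value, n in sizes.items() if n > 1}
    return {"repeated_font_sizes": repeated, "color_literals": dict(colors)}
```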
4) Audit site consistency (read-only)
Run an audit across generated pages and JSON:
./scripts/audit_site_consistency.py --strict
Scope and output options:
./scripts/audit_site_consistency.py \
--checks cross_refs,schema,json_schema,links,media,orphans \
--series-ids collected-1989-1998 \
--json-out /tmp/site-audit.json \
--md-out docs/audit-latest.md \
--strict
Run a single check with the convenience alias:
./scripts/audit_site_consistency.py \
--check-only schema \
--max-samples 10
Or run multiple checks with repeated --check-only:
./scripts/audit_site_consistency.py \
--check-only cross_refs \
--check-only json_schema \
--series-ids collected-1989-1998
Current checks:
- cross_refs: validates key references across _works, _series, _work_details, assets/data/series_index.json, and assets/data/work_details_index.json (including duplicate IDs)
- schema: validates required front matter fields by collection and format/consistency checks (work_id, detail_uid, slug-safe IDs, sort_fields token rules with work_id last sourced from canonical series_index.json with _series fallback, optional _works.series_id slug format, and detail_uid prefix matching work_id)
- json_schema: validates generated JSON structure/count consistency for:
  - assets/data/series_index.json
  - assets/data/works_index.json
  - assets/data/work_details_index.json
  - assets/works/index/*.json
- links: validates sitemap source/URL targets and query-parameter contract sanity across generated pages
- media: validates expected local thumbs/download files for published _works and _work_details (primaries are treated as remote-hosted and are not asserted locally)
- orphans: reports orphan pages/JSON; optionally include orphan media files with --orphans-media
--strict behavior and value:
- --strict exits non-zero when audit errors are found (errors > 0), so scripts/CI can fail fast.
- Warnings do not fail the run under --strict.
- Without --strict, the audit is informational and exits zero.
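The exit-status rules above reduce to a single condition. In this sketch the function name is hypothetical and the specific status 1 is an assumption; the text only says "non-zero".

```python
def audit_exit_code(errors, warnings, strict):
    """Warnings never affect the status; errors only matter under --strict."""
    if strict and errors > 0:
        return 1  # assumption: any non-zero value would satisfy the contract
    return 0
```

A caller would typically finish with `sys.exit(audit_exit_code(errors, warnings, strict))`.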
Query contract map used by links check:
| flow | produced query keys | destination accepts |
|---|---|---|
| series -> work | series, series_page | series, series_page, from, return_sort, return_dir, return_series, details_section, details_page |
| works index -> work | from, return_sort, return_dir, return_series | series, series_page, from, return_sort, return_dir, return_series, details_section, details_page |
| work -> work_details index | from_work, from_work_title, section, section_label, series, series_page | sort, dir, from_work, from_work_title, section, section_label, series, series_page |
| work -> work_details page | from_work, from_work_title, section, details_section, details_page, series, series_page | from_work, from_work_title, section, series, series_page, details_section, details_page, section_label |
| work_details page -> work | series, series_page, details_section, details_page | series, series_page, from, return_sort, return_dir, return_series, details_section, details_page |
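The contract in the table amounts to a subset check: every query key a flow produces must be accepted by its destination. The sketch below encodes the table as data and performs that check. The dictionary layout and function name are illustrative choices, not the audit script's internals.

```python
WORK_ACCEPTS = {
    "series", "series_page", "from", "return_sort", "return_dir",
    "return_series", "details_section", "details_page",
}
ACCEPTS = {
    "work": WORK_ACCEPTS,
    "work_details_index": {
        "sort", "dir", "from_work", "from_work_title", "section",
        "section_label", "series", "series_page",
    },
    "work_details_page": {
        "from_work", "from_work_title", "section", "series", "series_page",
        "details_section", "details_page", "section_label",
    },
}
# flow name -> (destination, produced query keys)
FLOWS = {
    "series -> work": ("work", {"series", "series_page"}),
    "works index -> work": ("work", {"from", "return_sort", "return_dir", "return_series"}),
    "work -> work_details index": (
        "work_details_index",
        {"from_work", "from_work_title", "section", "section_label", "series", "series_page"},
    ),
    "work -> work_details page": (
        "work_details_page",
        {"from_work", "from_work_title", "section", "details_section",
         "details_page", "series", "series_page"},
    ),
    "work_details page -> work": (
        "work", {"series", "series_page", "details_section", "details_page"},
    ),
}


def contract_violations(flows=FLOWS, accepts=ACCEPTS):
    """Return {flow: [keys the destination does not accept]} for failing flows."""
    problems = {}
    for name, (dest, produced) in flows.items():
        extra = produced - accepts[dest]
        if extra:
            problems[name] = sorted(extra)
    return problems
```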
Orphan media scan (optional):
./scripts/audit_site_consistency.py \
--check-only orphans \
--orphans-media
Markdown report is written by default to docs/audit-latest.md (overwrites each run).
To write to a different path:
./scripts/audit_site_consistency.py \
--md-out /tmp/site-audit.md
Known limits:
- media assumes primaries are remote-hosted and checks local thumbs/downloads only.
- json_schema validates structure/counts, not recomputed payload hash integrity.
- links query-contract checks are static sanity checks; they do not execute browser flows.
- Orphan checks currently focus on works/series/work_details artifacts.
Warning policy:
- Treat schema warnings as backlog by default.
- Current warning rule: _works.title_sort is only warned when title contains digits and title_sort is missing.
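The current warning rule is a two-part condition, sketched below. The function name is hypothetical, and treating a blank `title_sort` as missing is an assumption.

```python
def needs_title_sort_warning(title, title_sort):
    """Warn only when the title contains a digit and title_sort is missing."""
    has_digit = any(ch.isdigit() for ch in (title or ""))
    # Assumption: None and blank strings both count as "missing".
    missing = title_sort is None or str(title_sort).strip() == ""
    return has_digit and missing
```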
5) Autofix missing numeric title_sort on works
Dry-run:
./scripts/fix_missing_title_sort.py
Write changes:
./scripts/fix_missing_title_sort.py --write
Scope to selected IDs/ranges:
./scripts/fix_missing_title_sort.py \
--work-ids 66-74,38,40 \
--write
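A value such as `66-74,38,40` mixes inclusive ranges with single IDs. The sketch below shows one way to expand it; it is not the script's own parser, and the zero-padding to five digits is an assumption based on the `00456` example earlier in this document.

```python
def expand_id_ranges(spec):
    """Expand "66-74,38,40" into a list of integers, keeping first-seen order."""
    ids = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start > end:
                raise ValueError(f"descending range: {part}")
            ids.extend(range(start, end + 1))  # ranges are inclusive
        else:
            ids.append(int(part))
    # dict.fromkeys drops duplicates while preserving order.
    return list(dict.fromkeys(ids))


def format_work_id(number, width=5):
    """Assumption: work IDs are zero-padded to five digits, as in 00456."""
    return str(number).zfill(width)
```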
Works download files
If Works.download is set, generation also copies that file and links it on the work page.
- Source path: [projects-base-dir]/projects/[project_folder]/[download]
- Destination path: assets/works/files/[work_id]-[filename.ext]
- Work page link:
  - Label: download
  - Link text: filename.ext
  - Rendered before cat. <work_id>
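The source and destination patterns above can be assembled as shown in this sketch. The function name is hypothetical, and it assumes that `filename.ext` is the final path component of the `download` value.

```python
from pathlib import PurePosixPath


def download_paths(projects_base_dir, project_folder, download, work_id):
    """Return (source, destination) paths for a work's download file."""
    source = PurePosixPath(projects_base_dir) / "projects" / project_folder / download
    # Assumption: the copied file keeps its basename, prefixed with the work_id.
    destination = PurePosixPath("assets/works/files") / f"{work_id}-{source.name}"
    return str(source), str(destination)
```

For instance, a work `00456` in project folder `curve-poems` with download `notes.pdf` maps to `assets/works/files/00456-notes.pdf`.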