Security Harness
Synced from github.com/CoWork-OS/CoWork-OS/docs
The security harness is the first local implementation of the CoWork OS security-discovery loop:
prepare -> scan -> validate/debate -> dedup -> prove -> eval coverage
It is intentionally deterministic and CI-friendly. The harness does not try to replace human security
review or the security-auditor role. It gives changed high-risk files a repeatable first pass and
produces artifacts that Mission Control/Core Harness can surface.
Command
npm run qa:security:harness
Useful flags:
- --base <ref> and --head <ref> scan changed files in a git range. Defaults to HEAD~1...HEAD.
- --files <comma-separated> scans explicit paths.
- --all scans every tracked file.
- --out <path> writes the full JSON report. Default: artifacts/security-harness/security-harness-report.json.
- --mission-control-out <path> writes the Mission Control card payload. Default: artifacts/security-harness/mission-control-findings.json.
- --db <path> --profile-id <id> also writes a regression_eval core trace and deduped failure records for Mission Control.
- --confirmed-fix --fix-id <id> --fix-summary <text> creates or updates scripts/qa/eval-cases/security-harness-regressions.json.
- --fail-on-findings makes high/critical findings fail the process. The default is advisory so the harness does not make ordinary task verification or agent execution stricter.
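The advisory default and the --fail-on-findings behaviour can be sketched roughly as below. This is an illustration only: the Finding shape, the exitCodeFor name, and the use of exit code 1 are assumptions, not the harness's actual code or report schema.

```typescript
// Hypothetical types; the real report schema is not described in this document.
type Severity = "low" | "medium" | "high" | "critical";

interface Finding {
  id: string;
  severity: Severity;
}

// Returns the exit code the process would use.
// Advisory mode (failOnFindings = false) always returns 0.
// With failOnFindings set, any high or critical finding returns 1 (assumed value).
function exitCodeFor(findings: Finding[], failOnFindings: boolean): number {
  if (!failOnFindings) {
    return 0;
  }
  const blocking = findings.some(
    (f) => f.severity === "high" || f.severity === "critical",
  );
  return blocking ? 1 : 0;
}
```

Under these assumptions, a medium-severity finding never fails the run, and a critical finding fails it only when the flag is passed.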
Targeting
The prepare stage scans only those changed files that touch a high-risk boundary:
- tool policy and security manager code
- agent tools and runtime policy code
- sandbox and process execution code
- Browser Workbench automation surfaces
- Electron IPC/preload/main-process boundaries
- connector source code
- regression policy and the harness itself
This keeps routine documentation, renderer-only styling, and unrelated product changes out of the security queue unless they cross a sensitive boundary.
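A minimal sketch of this kind of path filter follows. The patterns are hypothetical stand-ins for the boundaries listed above; the harness's real configuration and directory layout are not shown in this document.

```typescript
// Hypothetical path patterns, roughly one per boundary in the list above.
// None of these directory names are taken from the actual repository.
const HIGH_RISK_PATTERNS: RegExp[] = [
  /(^|\/)security\//,
  /(^|\/)agent\/tools\//,
  /(^|\/)sandbox\//,
  /(^|\/)browser-workbench\//,
  /(^|\/)electron\/(ipc|preload|main)\b/,
  /(^|\/)connectors\//,
  /(^|\/)scripts\/qa\//,
];

// True when the path matches at least one high-risk pattern.
function isHighRisk(path: string): boolean {
  const normalized = path.replace(/\\/g, "/"); // tolerate Windows separators
  return HIGH_RISK_PATTERNS.some((pattern) => pattern.test(normalized));
}

// Keeps only the changed files that should enter the security queue.
function selectTargets(changedFiles: string[]): string[] {
  return changedFiles.filter(isHighRisk);
}
```

With these assumed patterns, a change to docs/readme.md or a renderer stylesheet is dropped, while a change under an electron preload or connectors path is kept.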
Validation And Debate
Every scanner candidate must pass a deterministic verifier/debater stage before it becomes a finding. A candidate is confirmed only when it has concrete line evidence and the file is in a configured high-risk boundary. The report records:
- verifier requirement and verdict
- debater counterargument
- proof requirement
- suggested proof or regression shape
That mirrors the rule for human review: a security issue should graduate with evidence and a path to proof, not just a suspicious pattern.
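The confirmation rule described above (concrete line evidence plus a configured high-risk boundary) could look something like this. The Candidate and Verdict shapes and the verify function are illustrative assumptions; the real report fields are not specified here.

```typescript
// Hypothetical candidate shape produced by a scanner.
interface Candidate {
  file: string;
  rule: string;
  // 1-based line numbers the scanner flagged; empty if it has no line evidence.
  evidenceLines: number[];
}

// Hypothetical verdict shape recorded in the report.
interface Verdict {
  confirmed: boolean;
  reason: string;
}

// Deterministic check: both conditions must hold for a candidate to be confirmed.
// inBoundary is supplied by the caller and decides whether a file is high-risk.
function verify(
  candidate: Candidate,
  inBoundary: (file: string) => boolean,
): Verdict {
  if (candidate.evidenceLines.length === 0) {
    return { confirmed: false, reason: "no concrete line evidence" };
  }
  if (!inBoundary(candidate.file)) {
    return { confirmed: false, reason: "file is outside configured high-risk boundaries" };
  }
  return { confirmed: true, reason: "line evidence inside a high-risk boundary" };
}
```

Because the check depends only on its inputs, the same candidate always yields the same verdict, which is what keeps the stage CI-friendly.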
Mission Control
The harness always writes a Mission Control payload:
artifacts/security-harness/mission-control-findings.json
When --db and --profile-id are provided, it also creates a Core Harness trace using
trace_kind = regression_eval and inserts deduped core_failure_records. The DB mode is optional so
CI and local development can run without needing an app profile.
The harness is not part of the ordinary task verifier path. It does not alter verified mode,
agent step completion, task-list verification, approval policy, or final-answer gates.
Eval Coverage
For confirmed security or production-policy fixes, run:
npm run qa:security:harness -- --confirmed-fix --fix-id <incident-or-pr-id> --fix-summary "Short fix summary"
This updates scripts/qa/eval-cases/security-harness-regressions.json with one category per confirmed
finding, or a production-policy placeholder when the fix removes the original finding. The existing
regression policy can then enforce that production/security fixes leave durable eval coverage.
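As a rough sketch of the "one category per confirmed finding, or a placeholder" rule: the RegressionCase shape, the field names, and the literal placeholder string below are assumptions, since the schema of security-harness-regressions.json is not documented here.

```typescript
// Hypothetical shape of a confirmed finding; only the category matters here.
interface ConfirmedFinding {
  category: string;
}

// Hypothetical shape of one entry in the regressions file.
interface RegressionCase {
  fixId: string;
  fixSummary: string;
  categories: string[];
}

// Assumed placeholder value for fixes that remove the original finding.
const PRODUCTION_POLICY_PLACEHOLDER = "production-policy";

// Builds the regression entry for a confirmed fix: one category per confirmed
// finding, or the placeholder when no findings remain after the fix.
function buildRegressionCase(
  fixId: string,
  fixSummary: string,
  confirmed: ConfirmedFinding[],
): RegressionCase {
  const categories = confirmed.map((finding) => finding.category);
  return {
    fixId,
    fixSummary,
    categories:
      categories.length > 0 ? categories : [PRODUCTION_POLICY_PLACEHOLDER],
  };
}
```

Either branch leaves a non-empty categories list, so a regression policy can check that every confirmed fix carries some eval coverage.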