Model-agnostic · Cloudflare-native · SARIF 2.1.0

Yes — a model-agnostic AI harness for scanning code for vulnerabilities.

AIHarness pairs deterministic SAST with evidence-based AI triage. Reproducible scanner rules give you stable, auditable coverage; a model-agnostic adapter (Claude · OpenAI · Gemini) adds context, dedupe, and plain-language remediation — confidence grounded in evidence, not a model's self-rating.

  • BYO-key · envelope-encrypted · shredded
  • Code treated as data — prompt-injection defense
  • Immutable audit log

The pipeline

An auditable scan, end to end

Watch a scan flow left to right. Hover or focus any stage to see what it does and which standards it satisfies.

Select a stage above to inspect what it does and which standards it satisfies.

The hybrid story

Deterministic floor, intelligent ceiling

Scanners give you reproducibility and stable rule IDs. The model gives you context and clarity. Neither is trusted blindly.

01

Deterministic SAST

Semgrep runs first: reproducible, auditable, with stable rule IDs you can pin and diff across runs. The same code yields the same findings — every time.

02

AI triage as a layer, not an oracle

The model adds context, deduplicates, writes plain-language explanations, and suppresses false positives. It fills a strict JSON schema only — code is passed as data, never as instructions.

03

Evidence-based confidence

Confidence is computed from evidence, not the model's self-rating: Semgrep ∩ LLM-confirmed = high; multi-model agreement raises it; LLM-only is labeled needs review.

04

Auditable output

Every scan emits SARIF 2.1.0 plus an immutable audit log — model id, version, prompt hash, ruleset versions — so results are reproducible and defensible.

BYO key is envelope-encrypted, used only for the scan, then shredded. Your code is never stored beyond the job's TTL and never used for training.

Live demo

Scan code right now

This calls the real API — no key required. Load the planted-vulnerability sample, hit run, and watch the harness catch it live in the terminal.

Use your own Anthropic API key (optional)

Optional. If provided, your key is envelope-encrypted, used only for this scan, then shredded. Your code is never stored beyond the job or used for training.

aiharness@cf:~$

Standards & best practices

Mapped to the frameworks your auditors already use

Every reference links to its authoritative source. We align on technical merit.

Executive Order 14110 was revoked in Jan 2025 — we align to SSDF / AI-RMF on technical merit, not the executive order.

Use cases

Where teams deploy AIHarness

CI / PR gate

Block merges over a severity threshold and post SARIF straight to the pull request. Deterministic rule IDs keep the gate stable across runs.

Pre-acquisition code audit

Scan a target's codebase before you sign. Get a defensible, standards-mapped findings report with an immutable audit trail.

OT / ICS repo review

Review operational-technology repositories with findings mapped to ISA/IEC 62443 — built for critical-infrastructure scrutiny.

Second opinion over existing SAST

Layer AIHarness over your current scanner to dedupe noise, explain findings in plain language, and surface evidence-graded confidence.

SBOM & supply-chain check

Generate a CycloneDX SBOM and check dependencies, aligning with SLSA and CISA guidance on software supply-chain integrity.

Trust & data governance

Built so an auditor can trust it

BYO key, shredded

Your Anthropic key is envelope-encrypted, used only for the job, then shredded. We never persist it.

Source kept only for the job

Code is retained only for the job's TTL, never used for training, then discarded.

Prompt-injection defense

Code is passed as data; the model fills a strict JSON schema and cannot be steered by content inside the scanned source.

Reproducible & audited

An immutable audit log records model id, version, prompt hash, and ruleset versions for every scan.

Model-agnostic by interface

A clean adapter abstracts Claude, OpenAI, and Gemini. Swap providers without changing the pipeline.

Honest accuracy posture

Evidence-based confidence. LLM-only findings are labeled “needs review,” never overclaimed as confirmed.

We scan ourselves

Our own self-scan, dated 2026-06-24

If we ask you to trust the harness, we run it on ourselves — and report it honestly.

0production-dependency vulnerabilities
148components in our CycloneDX SBOM
2.1.0SARIF output, schema-validated in CI
  • Manual review clean: BYO-key lifecycle, container path-traversal guard, parameterized SQL, XSS-safe DOM.
  • Our SARIF output is validated against the official SARIF 2.1.0 JSON schema in CI.
  • A CycloneDX SBOM (148 components) is generated as part of the build.
  • LLM-only findings are labeled “needs review” — we don't overclaim.