written for the CISO skim early access — founder-delivered in development — no certifications claimed

This is why you can hand agents real, money-making work.

An autonomous AI corporation only earns the right to do the work — touch budgets, call tools, run unattended — if it cannot be turned against you. Most agent projects never get there: Gartner predicts over 40% of agentic-AI projects will be canceled by the end of 2027, with only a small fraction in production today. The Agentic Corporation is built the way you'd build it after the incident — before the incident — so the autonomy is trustworthy by construction. Every claim on this page is tied to the mechanism that enforces it.

the one idea everything rests on

Humans hold the root of trust. Agents run below it.

The whole architecture is one shape: a higher, human-rooted tier holds the things that decide what is safe — the Charter, policy, signing keys, the evaluators that judge agents, and the kill-switch. The agents run in a strictly lower tier, sandboxed and deny-by-default. Control flows down; nothing reaches up. An agent can never raise its own budget, expand its own grants, edit policy, or touch the gate that judges it.

Trust-tier boundary: a higher human-rooted tier holds the charter, policy, signing keys, evaluators and the human-held kill-switch, changed only by two-person GitOps. Below a one-way dashed boundary sits the agent tier — sandboxed and deny-by-default — running missions under the rules above. Control flows down only; agents can never reach up across the boundary. HUMAN-ROOTED TIER · changed only by 2-person GitOps Charter · policy · signing keys · evaluators & graders Human-held kill-switch — agents cannot trip or untrip it one-way trust boundary — agents can never reach up control flows down no write path up AGENT TIER · sandboxed, deny-by-default Orchestrator · specialists · reviewer · security officer Run missions under the rules above — never able to change them
The product's core idea, made visual. Everything that defines "safe" lives in the human-rooted tier and changes only through a two-person, human-only pipeline. The agents that do the money-making work run beneath it, with no path to reach up — so handing them real authority stays defensible.

the three invariants

Three rules gate every change — enforced in architecture, not promised in a PDF

These are not guidelines and they are not per-component. They hold platform-wide. A change that violates any one of them does not ship — regardless of evals, benchmarks, or how convenient it is. They are the reason an autonomous corporation can be trusted with real work.

1 — Data is never instructions

Every ingress — agent-to-agent messages, RAG and memory retrievals, tool outputs, free-text fields in configuration, even the telemetry the security operations crew ingests — is typed as untrusted data. It is never concatenated into a privileged prompt as if it were a command. Externally-influenced strings are tainted by default and rendered as inert data all the way up to the operator's screen, so neither the model nor the human can be socially engineered by attacker text styled as system text.

mechanismspotlighting + typed, capability-scoped channels at every ingress; taint tracking on externally-influenced strings; a dual-LLM guardrail plane keeps untrusted readers privilege-free

2 — A hard, human-rooted trust tier

Evaluators and graders, the policy compiler, base policy, signing keys, the Charter, the kill-switch, and the security operations center sit in a strictly higher trust tier that is rooted in humans. The self-improvement loop, GitOps automation, and anything agent-authored can never reach across that boundary. An agent's identity cannot modify its own grants, its own budget, or the gate that judges it.

mechanismtwo-person, human-only GitOps for every higher-tier artifact; agents structurally lack write paths across the boundary; a capability ratchet means a self-proposed change can never expand a role beyond its Charter

3 — Safety is a hard constraint, not a score

Agent corporations compete and self-improve — an evolutionary optimizer that will breed the best rule-bender unless safety sits outside the optimized metric. So a safety violation disqualifies; it is never traded off against performance. A timeout, an error, or an ambiguous result means deny, never "proceed".

mechanismfail-closed policy engine; budget pre-flight that debits before every model and tool call and refuses at zero; disqualifying eval and red-team gates outside any optimized score

posture

The threat model comes first

The whole promise of the platform is that you can charter a corporation and let its agents do the day-to-day work that makes money — unattended, with real authority. That promise is only honest if the authority can't be hijacked. Agents read attacker-influenced content all day: web pages, documents, messages from other agents, tool output, retrieved memory. Any of it can carry instructions. Meanwhile the agents themselves hold delegated authority — tools, data access, budgets — which makes every agent a potential confused deputy and every self-improvement loop a potential privilege-escalation path. The platform's architecture is organized around exactly those two facts, and the threat model, architecture decision records, and assumptions register are maintained as living documents in the repository.

invariant 1 — in depth

Data is never instructions

Every ingress — agent-to-agent messages, RAG and memory retrievals, tool outputs, free-text fields in configuration, even the telemetry the security operations crew ingests — is typed as untrusted data. It is never concatenated into a privileged prompt as if it were a command. Externally-influenced strings are tainted by default and rendered as inert data all the way up to the operator's screen, so neither the model nor the human can be socially engineered by attacker text styled as system text.

mechanismspotlighting + typed, capability-scoped channels at every ingress; taint tracking on externally-influenced strings

mechanismdual-LLM guardrail plane: quarantined readers ingest untrusted content but hold no privileges; privileged actors never read untrusted content directly

mechanismintent capsules — signed, single-use mandates that bind a privileged action to a specific human-rooted intent

invariant 2 — in depth

A hard, human-rooted trust tier

Evaluators and graders, the policy compiler, base policy, signing keys, the Charter, the kill-switch, and the security operations center sit in a strictly higher trust tier that is rooted in humans. The self-improvement loop, GitOps automation, and anything agent-authored can never reach across that boundary. Anything in the higher tier changes only through a human-only, two-person pipeline — never by an agent. An agent's identity cannot modify its own grants, its own budget, or the gate that judges it.

mechanismtwo-person, human-only GitOps for every higher-tier artifact; agents structurally lack write paths across the boundary

mechanismcapability ratchet: a self-proposed change can never expand a role beyond its Charter

mechanismseparation of powers in every loop: actor, auditor, and approver are different agents — ideally different models — and no agent approves its own change

invariant 3 — in depth

Fail-closed everything

Deny-by-default is the resting state of the platform: tool access, network egress, name resolution, and capability grants all start at zero and are opened by explicit, reviewed grants. A timeout, an error, or an ambiguous result means deny — never "proceed". This is what makes handing agents a budget defensible: a corporation can run unattended because the worst case is a refusal, not a runaway. Safety and compliance are disqualifying constraints that sit outside any optimized metric, because a competitive self-improvement loop will otherwise breed the best rule-bender.

mechanismbudget pre-flight: spend is debited before every model and tool call; at zero the platform refuses — a runaway agent stops at the cap you signed

mechanismpolicy compiled from typed agent specifications into a deny-by-default engine; unresolvable references fail closed at admission

mechanismescalation gates default to reject on timeout; an unsigned Charter will not run

deception

Honeypots are alarms, never data

The platform seeds decoy credentials, decoy tools, and canary records through agent-reachable space. Marked artifacts have no legitimate use: any agent or process that follows, fetches, or authenticates with one trips an alarm — detect and alarm only, never act. Trap marking happens exclusively at the higher, agent-unreachable tier, so a compromised agent cannot enumerate which records are traps. Even the operator UI renders a tripped trap as an opaque alarm with zero affordances — no link, no content, nothing to follow.

Honeypot flow: any agent contact with a marked decoy hits a single detection chokepoint, which raises an assertive alarm carrying tenant, identity and access path — and explicitly never acts on, follows, or surfaces the trap. agent touches a marked decoy single detection chokepoint ALARM: tenant · id · path detect + alarm only — never follow, fetch, authenticate, or surface
A trap has exactly one job: be touched, then sound. The detection path is the only thing that ever reads a marked artifact — agents see nothing to act on, so the trap cannot be turned into a lure.

mechanismreserved trap markers and taints with a single detection chokepoint; tripping one raises an assertive alarm with tenant, identity, and access path

mechanismtraps are marked above the agent tier; the marking store is not readable from agent space

supply chain

Everything that loads is signed

Container images, agent skills, and model weights are pinned by immutable digest — never tags. A skill bundle is accepted only with a cosign signature, an SBOM, and provenance attestation; verification failure blocks the load. High-privilege skills require multiple signatures from distinct anchored keys, one of them human. Before promotion, bundles are detonated in a behavioral sandbox. Skill names resolve through an explicit allow-map — no fuzzy matching — which structurally defeats typosquatting and dependency confusion.

mechanismcosign signature + SBOM + provenance verified against a human-provisioned trust-anchor allow-list; zero anchors means every signature is rejected

mechanismthreshold signing for high-privilege skills: more than one distinct anchored key, including a human key — the "human" property comes from the anchor, never from the manifest

mechanismbehavioral detonation before promotion; transparency-log inclusion recorded; revocation honored — a revoked skill never loads

identity & isolation

Per-agent identity, brokered secrets, sandboxed compute

Every agent workload gets its own cryptographic identity (SPIFFE) and its own service account — least privilege per role, short-TTL sender-constrained tokens. Agents never hold long-lived secrets: the tool gateway attaches just-in-time credentials at the edge, and telemetry is redacted at the SDK layer so secrets never appear in agent context, logs, traces, or eval transcripts. Workloads run under sandbox tiers (gVisor, Firecracker, WASM) chosen per role, and tenant memory is encrypted and isolated per corporation. These mechanisms are enforced in the platform code and pass offline tests; production-scale validation on a live cluster is in progress.

mechanismSPIFFE identity scheme + attestation claims; one service account per workload

mechanismcredential brokering at the gateway edge — the agent's context never contains the secret

mechanismsandbox runtime classes per role; encrypted, tenant-scoped memory with an audited ledger

controls

Two-person controls where it counts

The artifacts that define safety — policy, graders, signing keys, budget caps, the kill-switch, the compiler that turns specifications into policy — change only via a two-person, human-only pipeline. In the operator UI, destructive actions follow a confirm-arm-fire pattern, and trust-tier-crossing actions require a second authenticated human. The kill-switch is per-corporation and human-held: named humans and the incident response team can trip it; agents cannot, and agents cannot untrip it.

mechanismtwo-person GitOps pipeline for higher-tier change; two-person gate in the control UI for trust-tier actions

mechanismper-corp kill-switch policy declared in the signed Charter, distinct from the global kill

evidence

Audit and observability you can verify

Every consequential event — tool call, grant decision, skill verification, kill, escalation — is appended to a hash-chained audit ledger: tamper-evident by construction, verifiable link by link. OpenTelemetry traces follow every mission end to end. An agentic security operations crew, paired with standing red- and blue-team crews, consumes that telemetry — and per invariant 1, treats it as untrusted data, because attackers write logs too.

Hash-chained audit ledger: each event block carries the hash of the block before it, so any tampering breaks the chain and surfaces as a first-class alarm. Verification runs link by link across blocks N through N plus 2. event N prev: …a1f hash: 9c4… event N+1 prev: 9c4… hash: 7be… event N+2 prev: 7be… hash: 2d0… each block carries the prior hash — break a link and the chain raises a first-class alarm
Tamper-evidence is structural: every event commits to the one before it. You don't trust the log because we say so — you verify it, link by link.

mechanismhash-chained audit-event schema; chain-linkage verification surfaces tampering as a first-class alarm

mechanismred-team breach-and-attack simulation as a standing suite, run against staging — a safety violation disqualifies a change outright

operator's view

The control surface humans actually use

All of the above shows up in one place: the operator control surface where named humans hold the root of trust — the Command Bridge, the security operations view, the hash-chained audit ledger, and tripped-honeypot alarms. It is a separate, internet-isolated operator UI, distinct from the customer portal, and it is where the kill-switch and two-person gates live.

Operator control surface

The dark governance console — Command Bridge · security operations · hash-chained audit ledger · honeypot alarms — is a separate operator UI. Real screenshots are added here as it ships; until then this slot stays an honest placeholder, never a staged image.

straight talk

What we do not claim

  • No compliance certifications are claimed today. When audits complete, we will say so plainly.
  • No production customers or uptime figures are claimed. The platform is in active development.
  • No "AI safety solved" claims. The invariants reduce blast radius and make violations disqualifying and visible; they do not make models infallible.

What you can verify instead: the architecture described here is the architecture in the repository — contracts, policy code, threat model, and the eval and red-team harnesses that gate every change. That is the point of this page. You bring zero AI expertise and pick the business; these invariants are how the agents can be trusted to do the work that makes it run. Early access is founder-delivered: we set up your corporation with you.

Questions? Email a human — a person replies: aipeteaipete@gmail.com