Why AI Governance Needs to Move from PDF Policies to Runtime Enforcement

If you work in IT, you already know what governance looks like. You have policies. You have compliance frameworks. You have audit trails. These exist for infrastructure, for data, for access control.

But when it comes to AI, most organizations are still governing with PDF documents and hope.

The problem is simple: an AI agent can pass every compliance review on paper and still drift from its intended behavior in production. That’s because most AI guardrail tools are stateless. They evaluate each request independently and move on. No memory. No trend analysis. No way to catch an agent that gives individually reasonable answers while gradually shifting away from its policies over time.

This is why I built SAFi, an open-source runtime governance engine for AI agents.

What SAFi Does

SAFi sits between the user and the LLM and enforces policies at runtime. Every response is evaluated, scored, and logged before it reaches the user. If a response violates policy, it gets blocked. If it passes, it still gets audited after the fact for compliance.

The architecture uses a separation of concerns that should feel familiar to anyone who has worked with governance frameworks:

Values act as the constitution. These are your policies, defined once and protected from modification at runtime.
Intellect is the LLM itself. It proposes a response based on the user’s query and the active policy.
Will is the gatekeeper. It evaluates the proposed response against the policy and either approves or rejects it. If it rejects, the response never reaches the user.
Conscience is the auditor. It runs after the response is delivered and scores it against each policy value.
Spirit is the long-term monitor. It tracks behavioral consistency over time using mathematical drift detection (cosine similarity against an exponential moving average; see the sketch after this list). No LLM is involved in this step. Pure math.
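To make the Spirit step concrete, here is a minimal sketch of that kind of drift check. It is not SAFi's actual implementation; the smoothing factor, threshold, and the choice of what goes into the behavior vector are assumptions for illustration.

```python
import numpy as np

ALPHA = 0.1             # EMA smoothing factor (assumed, not SAFi's actual value)
DRIFT_THRESHOLD = 0.85  # flag when similarity to the baseline drops below this (assumed)

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two behavior vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def update_baseline(baseline: np.ndarray, current: np.ndarray) -> np.ndarray:
    """Exponential moving average: the rolling behavioral baseline."""
    return (1 - ALPHA) * baseline + ALPHA * current

def check_drift(baseline: np.ndarray, current: np.ndarray) -> tuple[bool, float, np.ndarray]:
    """Compare the current behavior vector against the EMA baseline.

    Returns (drifting?, similarity score, updated baseline).
    No LLM call, just vector math, so it is cheap to run on every interaction.
    """
    similarity = cosine_similarity(baseline, current)
    drifting = similarity < DRIFT_THRESHOLD
    return drifting, similarity, update_baseline(baseline, current)
```

The behavior vector could be anything that summarizes a turn numerically, such as the per-value scores from the Conscience step. Because the baseline updates gradually, a slow shift shows up as a falling similarity trend rather than a single failed check.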

The result is a full audit trail for every AI interaction: what was proposed, whether it was approved, how it scored, and whether the agent is drifting from its baseline.
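As a rough illustration, a single entry in that audit trail might carry fields like the ones below. The field names and values are hypothetical, not SAFi's actual schema.

```python
from datetime import datetime, timezone

# Hypothetical shape of one audit-trail entry; SAFi's real schema may differ.
audit_record = {
    "timestamp": datetime.now(timezone.utc).isoformat(),
    "agent_id": "support-bot-01",        # which governed agent produced the turn
    "user_query": "Can you waive this fee?",
    "proposed_response": "...",          # what the Intellect proposed
    "will_decision": "approved",         # approved or rejected by the Will gate
    "conscience_scores": {               # post-hoc score per policy value
        "honesty": 0.96,
        "fairness": 0.91,
    },
    "drift_similarity": 0.979,           # Spirit: cosine similarity to the EMA baseline
    "drifting": False,
}
```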

Why This Matters for IT Governance

If your organization is deploying AI agents (chatbots, copilots, customer support bots, internal tools), you need the same governance rigor you apply to everything else.

Stateless guardrails are not enough. They are the equivalent of checking each financial transaction individually without ever reconciling the books. Each transaction might look fine, but the trend could be moving in the wrong direction.

SAFi is stateful. It remembers how the agent has behaved, compares every new response against a rolling baseline, and flags when something changes. In production, I’ve tracked over 1,600 interactions with one agent and maintained 97.9% long-term behavioral consistency with a 98.7% approval rate.

The system even predicted a weakness in one agent’s reasoning about justice before an adversary exploited it. That’s the kind of early warning you want in a governance tool.

How to Get Started

SAFi is completely free and open source. You can deploy it with Docker in minutes:

```bash
docker pull amayanelson/safi:v1.2

docker run -d -p 5000:5000 \
  -e DB_HOST=your_db_host \
  -e DB_USER=your_db_user \
  -e DB_PASSWORD=your_db_password \
  -e DB_NAME=safi \
  -e OPENAI_API_KEY=your_openai_key \
  --name safi amayanelson/safi:v1.2
```

You can also use it as a headless “Governance-as-a-Service” layer for existing applications. SAFi exposes an API that any external bot or agent framework (LangChain, custom bots, Microsoft Teams integrations) can call to get governed responses.
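Here is a sketch of what that integration could look like. The endpoint path, payload fields, and response shape are placeholders, not SAFi's documented API, so check the repository for the real contract.

```python
import requests

SAFI_URL = "http://localhost:5000"  # the port exposed in the Docker example above

def governed_reply(user_message: str) -> str:
    """Send a user message through SAFi and return the governed response.

    The '/chat' path and field names are illustrative placeholders;
    consult the SAFi repository for the actual endpoint and schema.
    """
    resp = requests.post(
        f"{SAFI_URL}/chat",
        json={"message": user_message},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json().get("response", "")

if __name__ == "__main__":
    print(governed_reply("Summarize our refund policy."))
```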

Check out the full source code on GitHub, and read the philosophical foundation behind the architecture.

Final Thought

We don’t let infrastructure run ungoverned. We don’t let databases run without audit logs. AI should be no different. The tools to govern AI at runtime exist. It’s time to start using them.
