AI & Machine Learning · 6 min read

How to Make Your AI Agents Follow the Rules (Every Single Time)

Greg (Zvi) Uretzky

Founder & Full-Stack Developer


You want to automate customer service, compliance checks, or financial approvals with AI agents.

You’ve built the workflow. You’ve written the prompts. You’ve told the AI the rules: “Only managers can approve refunds over $500,” or “Never share a customer’s full credit card number.”

But deep down, you’re nervous. What if it doesn’t listen? What if it shares data it shouldn’t? You’re right to worry.

Putting rules in a prompt doesn’t guarantee an AI will follow them. It’s like telling a new employee the company policies during orientation but having no way to check if they actually follow them later. Without enforcement, using AI for sensitive tasks is a ticking time bomb.

What Researchers Discovered: A Security Guard for Your AI Team

A team from the University of Wisconsin-Madison and VMware Research tackled this exact problem. They built a tool called PCAS, which stands for Policy Compiler for Secure Agentic Systems.

Their core discovery was simple but critical: You cannot trust natural language prompts for policy enforcement. AI agents are probabilistic—they might follow your rules today, but there’s no guarantee they will tomorrow. For business-critical tasks, you need deterministic enforcement. The rule either passes or it fails. Every time.

[Figure 1 from the PCAS paper]

Figure 1 shows how PCAS sits between your AI agents and your data, checking every action against a rulebook before it happens.

They also found that tracking information flow is non-negotiable. Business workflows often involve multiple AI agents passing data between them—like one agent checking a customer’s eligibility, then passing the case to another for approval. A simple chat history can’t track who told what to whom. It’s like trying to find a leak in a game of telephone by only listening to the last person. You miss the chain of custody.
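
To make "chain of custody" concrete, here is a minimal Python sketch of the idea (illustrative only, not PCAS's actual implementation): every message carries a provenance trail that records each hand-off, so the full path is still visible when the data reaches the last agent. The Message class and agent names are assumptions for the example.

```
from dataclasses import dataclass, field

@dataclass
class Message:
    """A piece of data plus the chain of agents that have handled it."""
    content: str
    provenance: list[str] = field(default_factory=list)

def hand_off(msg: Message, from_agent: str, to_agent: str) -> Message:
    # Record the hop instead of discarding it, so the full chain
    # of custody survives a multi-agent workflow.
    msg.provenance.append(f"{from_agent} -> {to_agent}")
    return msg

# One agent checks eligibility, then passes the case on for approval.
case = Message(content="Customer 12345: refund request, $620")
case = hand_off(case, "EligibilityAgent", "ApprovalAgent")
print(case.provenance)  # ['EligibilityAgent -> ApprovalAgent']
```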

PCAS solves this by acting as a strict security guard that checks every single action—every data access, every message between agents—against a formal rulebook. If an action breaks a policy, it’s blocked before it happens. You can read their full paper here: Policy Compiler for Secure Agentic Systems.

How to Apply This Today: 4 Steps to Policy-Proof Your AI Workflows

This isn’t just theory. You can start building this level of security into your AI agent systems this week. Here’s how.

Step 1: Map Your “Sensitive” Workflows

First, identify where policy failure would hurt the most. Don’t start with a simple FAQ bot. Start with a workflow where a mistake means real risk.

For example:

  • Customer Support: An agent that handles refunds or accesses personal account data.
  • Compliance: An agent that screens transactions for money laundering red flags.
  • HR: An agent that filters job applications based on sensitive criteria.

Action this week: List your top 3 AI automation ideas. For each, write down the one policy that, if broken, would cause the most damage (e.g., “Leaking a customer’s SSN,” “Approving a payment without manager sign-off”).

Step 2: Formalize Your Rules (Move Beyond Prompts)

Stop writing rules as suggestions in a prompt (“Please don’t share PII”). Start writing them as formal, logical statements a computer can enforce.

Tool to use: Start with a simple policy language like Rego (used by the Open Policy Agent project) or Cedar (from AWS). These let you write rules like code.

For example, here is a rule as you would write it in a prompt: “Only users in the ‘Manager’ group can approve expenses over $1000.”

A formal policy in Cedar looks like this:
```
// Cedar is default-deny: an action is blocked unless a
// policy permits it. Anyone may approve expenses up to $1000.
permit(
  principal,
  action == Action::"approve",
  resource
)
when { resource is Expense && resource.amount <= 1000 };

// Only members of the Manager group may approve larger amounts.
permit(
  principal in UserGroup::"Manager",
  action == Action::"approve",
  resource
)
when { resource is Expense };
```
This is unambiguous. It can be checked automatically.

Action this week: Take your most critical rule from Step 1 and try to write it in a formal “if-then” logic statement.

Step 3: Integrate a Policy Enforcement Point (PEP)

This is the core technical step. You need a piece of software—a Policy Enforcement Point—that intercepts every request your AI agent makes.

How to do it:

  1. Architect your agents so all their requests for data or actions go through a central API gateway or middleware.
  2. At that gateway, before the request is fulfilled, call a Policy Decision Point (PDP). This is where your formal rules from Step 2 live.
  3. The PDP evaluates the request (“Can Agent A send Customer X’s email to Agent B?”) against all policies and returns a simple ALLOW or DENY.
  4. The gateway only proceeds if the answer is ALLOW.

For example: Your customer service AI agent tries to fetch a user’s full billing history. The PEP intercepts the call, checks the policy (“Agent_Role == ‘Tier1_Support’ can only access last 6 months of history”), and denies the request. The agent never sees the disallowed data.
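
Here is a minimal sketch of that flow in Python (the role name, policy logic, and function names are illustrative assumptions, not any specific gateway's API):

```
from datetime import datetime, timedelta

# Policy Decision Point (PDP): evaluates the formal rule, returns ALLOW or DENY.
def pdp_decide(agent_role: str, action: str, record_age_days: int) -> str:
    if agent_role == "Tier1_Support" and action == "read_billing_history":
        return "ALLOW" if record_age_days <= 182 else "DENY"
    return "DENY"  # default-deny: anything not explicitly allowed is blocked

# Policy Enforcement Point (PEP): every fetch goes through this gateway.
def fetch_billing_history(agent_role: str, records: list[dict]) -> list[dict]:
    allowed = []
    for record in records:
        age_days = (datetime.now() - record["date"]).days
        if pdp_decide(agent_role, "read_billing_history", age_days) == "ALLOW":
            allowed.append(record)
    return allowed  # the agent never sees what the policy denied

records = [
    {"date": datetime.now() - timedelta(days=30), "entry": "Recent invoice"},
    {"date": datetime.now() - timedelta(days=400), "entry": "Old invoice"},
]
print(fetch_billing_history("Tier1_Support", records))  # only the recent one
```

In production, pdp_decide would call a real policy engine (an OPA or Cedar evaluation) rather than hand-coded Python, but the shape stays the same: the gateway asks, the PDP answers, and only ALLOW proceeds.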

Action this week: Sketch a diagram of your AI workflow. Draw a box labeled “Policy Check” between each agent and the database or service it calls.

Step 4: Audit Everything (Track the Information Flow)

Enforcement is good. Proof is better. You must log every policy decision and track how data moves.

Tool to use: Use an observability platform like Datadog, New Relic, or Grafana. Log every event: [TIMESTAMP] Agent ‘RefundBot’ request to ‘access_customer_record(12345)’ – POLICY: ‘manager_approval_required’ – RESULT: DENIED.
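
As a minimal sketch, here is how you might emit that event as structured JSON using Python's standard logging module (the field names are assumptions; adapt them to whatever schema your observability platform expects):

```
import json
import logging
from datetime import datetime, timezone

logging.basicConfig(level=logging.INFO)
audit_log = logging.getLogger("AI_Policy_Audit")

def log_policy_decision(agent: str, request: str, policy: str, result: str) -> None:
    # One structured entry per decision, so every ALLOW/DENY is searchable later.
    audit_log.info(json.dumps({
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "agent": agent,
        "request": request,
        "policy": policy,
        "result": result,
    }))

log_policy_decision("RefundBot", "access_customer_record(12345)",
                    "manager_approval_required", "DENIED")
```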

For example: If a customer complaint arises, you can trace the exact journey of their data. You can prove which agents accessed it, when, and that every access was policy-compliant. This is essential for regulatory audits.

Action this week: Set up a dedicated log stream or dashboard channel called “AI_Policy_Audit.” Configure your first middleware gateway to send a test log entry to it.

What to Watch Out For

This approach is powerful, but be realistic about two limits:

  1. Real-Time Policy Changes: The paper’s model assumes policies are stable during a workflow. If your business rules change minute-to-minute (e.g., dynamic fraud detection rules), you’ll need a mechanism to safely update policies for in-flight agent conversations.
  2. Scale and Speed: The research doesn’t test the system with thousands of agents making millions of requests per second. For large-scale deployments, you’ll need to performance-test your Policy Decision Point to ensure it doesn’t become a bottleneck.

Your Next Move

Start by picking one small, sensitive workflow. Don’t boil the ocean. Follow the four steps above:

  1. Map it.
  2. Write one formal rule.
  3. Build or configure a simple PEP (many API gateways like Kong or Apache APISIX have policy plugin architectures).
  4. Turn on auditing.

Prove the concept where the risk is highest and the reward for safety is greatest. Once you have one policy-protected AI agent running, scaling the pattern is straightforward.

What’s the first workflow you’ll secure? Share your plan in the comments below.

Tags: AI agent security, policy enforcement system, reduce AI compliance risk, secure AI workflows, CTO AI governance

