AI Wrote the Code. Who's Responsible When It Breaks?
Your engineering team is using AI to write code faster. But you're probably asking: "How do I manage this?"
How do you review AI-generated code? Who is accountable if it introduces a bug or a security hole? Right now, most teams are winging it. This creates risk.
New research gives us a clear, evidence-based playbook. It shows that AI tools fall into two predictable categories. More importantly, it shows where your human oversight must be bulletproof.
What Researchers Discovered
Researchers from the University of Calgary analyzed thousands of AI-assisted coding tasks. They published their findings in "Collaborator or Assistant? How AI Coding Agents Partition Work Across Pull Request Lifecycles."
They found two distinct types of AI tools.
1. The Proactive Collaborator. Tools like GitHub Copilot, Cursor, and Devin fall here. They act like an eager junior developer. They proactively start coding tasks 96% of the time. They see a TODO comment or a failing test and jump in to fix it.
2. The Reactive Assistant. Chat tools that you prompt directly, such as ChatGPT or Claude, fall here. They wait for your explicit direction 95% of the time. You give them a task, and they help you complete it.
The critical finding? Humans have the final say 99.9% of the time. An AI can write the entire feature, but a person must almost always approve the merge. This creates a "junior teammate" dynamic.
But there's a hidden risk. The study found that code from "Assistant" tools bypasses formal review 76% of the time. It goes straight to approval. Code from "Collaborator" tools goes through a formal human review phase 90% of the time.
Why does this matter? Tools that feel more "helpful" might lead to less oversight. Your team's trust could create a governance gap.
The biggest red flag? Audit trails are broken. When an automated system merges code, logs show who clicked "merge" but not who made the decision to approve it. If that code causes an outage, you might not be able to trace responsibility.
How to Apply This Today: A 3-Step Action Plan
You can't manage what you don't measure. Use this research to build a deliberate process. Do this within the next quarter.
Step 1: Classify Your Tools and Set Workflow Rules
First, label every AI tool your team uses.
- Is it a Collaborator? (Initiates work, suggests completions)
- Is it an Assistant? (Requires explicit prompts)
This classification dictates your mandatory review rules.
For Collaborator Tools (e.g., Copilot, Cursor):
- Mandate a formal code review for every change. No exceptions. Treat the AI's output as you would a junior developer's first draft.
- Require a human-authored summary. The reviewer must write a comment explaining what was changed and why it was approved.
For Assistant Tools (e.g., ChatGPT, Claude):
- Require prompt documentation. Any code block pasted from a chat must include the original prompt as a comment. This provides context for reviewers.
- Institute a "second pair of eyes" rule. Code from these tools cannot be self-reviewed. Another team member must approve it.
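The Step 1 rules can also be written down as a small policy check. The following is a minimal sketch, not a definitive implementation: the `Change` record, its field names, and the `policy_violations` function are hypothetical names invented for illustration, and in practice the fields would come from your own review tooling.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class ToolKind(Enum):
    COLLABORATOR = "collaborator"  # initiates work, suggests completions
    ASSISTANT = "assistant"        # requires explicit prompts


@dataclass(frozen=True)
class Change:
    """Hypothetical record of one AI-assisted change (fields are assumptions)."""
    tool_kind: ToolKind
    author: str
    approver: Optional[str] = None    # human who approved the change, if any
    has_formal_review: bool = False   # a formal code review took place
    reviewer_summary: str = ""        # human-written "what changed and why"
    prompt_documented: bool = False   # original prompt kept as a code comment


def policy_violations(change: Change) -> List[str]:
    """Return the Step 1 rules that this change does not yet satisfy."""
    problems = []
    if change.tool_kind is ToolKind.COLLABORATOR:
        if not change.has_formal_review:
            problems.append("formal code review required")
        if not change.reviewer_summary.strip():
            problems.append("human-authored review summary required")
    else:
        if not change.prompt_documented:
            problems.append("original prompt must be documented")
        # "Second pair of eyes": the approver must exist and differ from the author.
        if change.approver is None or change.approver == change.author:
            problems.append("approval by a second team member required")
    return problems


if __name__ == "__main__":
    self_reviewed = Change(ToolKind.ASSISTANT, author="dana", approver="dana",
                           prompt_documented=True)
    print(policy_violations(self_reviewed))
    # prints ['approval by a second team member required']
```

An empty list means the change meets the rules for its tool category. Keeping the rules in one function makes them easy to adjust when a tool is reclassified.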
Step 2: Fix Your Audit Trail
The "who approved this?" log must be clear. Update your DevOps policy this month.
- Disallow anonymous automated merges. Configure your CI/CD system (like GitHub Actions or GitLab CI) so that every merge requires a human account to trigger it, even if it's just clicking "Run."
- Enrich your commit messages. Add a mandatory tag. For example:
  - [AI-Assisted: Copilot] for completions and suggestions.
  - [AI-Generated: ChatGPT] for code blocks from a chat session.
  This creates a searchable history of AI involvement.
- Log the decision-maker. In your project management tool (Jira, Linear), require that the ticket linked to the pull request names the human who approved the AI's work for merge.
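A mandatory tag is only useful if something checks for it. Here is a minimal sketch of how the tags could be parsed and enforced, for example from a commit-message hook. The function names and the `ai_involved` flag are assumptions made for illustration; how you determine that AI was involved depends on your own workflow.

```python
import re
from typing import List, Tuple

# Matches tags such as "[AI-Assisted: Copilot]" or "[AI-Generated: ChatGPT]".
AI_TAG = re.compile(r"\[AI-(Assisted|Generated):\s*([^\]]+?)\s*\]")


def parse_ai_tags(message: str) -> List[Tuple[str, str]]:
    """Return (kind, tool) pairs for every AI tag found in a commit message."""
    return [(m.group(1), m.group(2)) for m in AI_TAG.finditer(message)]


def check_commit_message(message: str, ai_involved: bool) -> bool:
    """Return True if the message satisfies the tagging rule.

    `ai_involved` is a hypothetical flag supplied by the caller. Commits
    without AI involvement pass without a tag.
    """
    return bool(parse_ai_tags(message)) or not ai_involved


if __name__ == "__main__":
    msg = "Fix null check in parser [AI-Assisted: Copilot]"
    print(parse_ai_tags(msg))
    # prints [('Assisted', 'Copilot')]
    print(check_commit_message("Fix null check in parser", ai_involved=True))
    # prints False
```

The same pattern can be reused to search existing history, since the tag format is fixed and easy to match.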
Step 3: Train Your Team on the "Why"
Processes fail without understanding. Hold a 30-minute team meeting to align on the new rules.
- Frame it as risk management, not mistrust. Say: "AI makes us faster. These rules ensure that speed doesn't create future bugs or security debt."
- Show the data. Share the finding that "Assistant" tools often skip review. Explain that your new rules are a proactive guardrail.
- Assign a "Process Owner" for 3 months. This person answers questions and ensures the new steps are followed, adjusting them if they're too cumbersome.
What to Watch Out For
This approach is practical, but it has limits.
- It Adds Friction. Mandatory reviews and documentation slow things down. This is the trade-off for safety. The goal is managed velocity, not unchecked speed.
- It Doesn't Assess Quality. This process manages how code is approved, not whether the code is good. You still need robust testing, security scanning, and skilled reviewers.
- Tools Will Evolve. The line between "Collaborator" and "Assistant" may blur. Re-evaluate your classifications every six months.
Your Next Move
Start by classifying your tools.
This week, list the AI coding tools your team uses. Put them in two columns: Collaborators and Assistants. Send that list to your team and announce: "Starting next Monday, all code from Column A needs a formal review. Code from Column B needs a documented prompt."
This single action moves you from ad-hoc to intentional. It builds the foundation for scaling AI use without scaling risk.
What's the first tool you'll classify, and what rule will you set for it? Share your plan in the comments.