BLOG · BUILDING VIBEDRIFT

Your AI-written codebase is drifting. Here's how to measure it.

April 14, 2026 · ~8 min read

The bug you can't grep for

I was going through my handlers last month. All written with Claude and Cursor over a few weeks. Something felt wrong but I couldn't name it.

Then I looked closer.

userHandler.ts called requireAuth(req) on line 3, validated input against a schema, and threw a typed NotFoundError if the record was missing. Clean. Intentional. Consistent with every other handler in the project.

orderHandler.ts, written in a different session a week later, had no auth check. No input validation. And instead of throwing a typed error, it returned { status: 404, error: 'not found' } as a plain object.

Both compiled. Both passed tests. Both passed every linter. But one handler followed three behavioral conventions that the other completely ignored. Not because I changed my mind. Because the AI started a fresh session and made different decisions about how the application should behave.

That is drift.
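To make the contrast concrete, here is a hypothetical reconstruction of the two handlers in TypeScript. The names (requireAuth, validate, the db stub) are illustrative stand-ins, not the project's real code:

```typescript
// Illustrative reconstruction of the drift described above. All helpers
// are stand-ins for whatever the real project uses.
class NotFoundError extends Error {
  constructor(msg: string) { super(msg); this.name = "NotFoundError"; }
}

type Req = { userId?: string; body?: unknown };

const requireAuth = (req: Req) => { if (!req.userId) throw new Error("unauthorized"); };
const validate = (body: unknown) => { if (body == null) throw new Error("invalid input"); };
const db = { find: (id: string) => (id === "u1" ? { id } : null) };

// Session 1: follows the project's conventions.
function userHandler(req: Req, id: string) {
  requireAuth(req);                           // auth check, like every other handler
  validate(req.body);                         // input validated against a schema
  const record = db.find(id);
  if (!record) throw new NotFoundError(id);   // typed error
  return record;
}

// Session 2, a week later: no auth, no validation, ad-hoc error shape.
function orderHandler(_req: Req, id: string) {
  const record = db.find(id);
  if (!record) return { status: 404, error: "not found" }; // plain object
  return record;
}
```

Both functions compile and both "work", but they disagree about what a handler is supposed to do.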

Drift is not what you think it is

Most people hear "code drift" and think duplicated functions or inconsistent naming. Those are symptoms. Drift is something deeper.

Drift is the behavioral deviation between what your codebase intends to do and what the AI actually introduced. It is the delta between the established workflow of your application and the assumptions the AI made in a session where it had no memory of that workflow.

When a human developer joins a team, they read the existing code, absorb the patterns, and follow them. When an AI coding tool starts a new session, it has zero memory of what was established before. It doesn't know your project uses typed errors. It doesn't know every handler validates input. It doesn't know auth is mandatory on every route. So it makes reasonable but different choices. And those choices quietly contradict the behavioral contract your codebase has been building.

This shows up as:

  - auth checks present on some routes and missing on others
  - input validation applied in one handler and skipped in the next
  - typed errors in one file, ad-hoc { status, error } objects in another
  - duplicate or hallucinated logic the AI wrote without knowing it already existed

None of these are syntax errors. None are bugs in the traditional sense. The application works. But its behavior is no longer internally coherent. The codebase has stopped agreeing with itself about how things should work.

Why nothing catches this today

Linters check syntax against predefined rules. They don't know what your codebase's behavioral patterns are.

PR review bots analyze diffs. They see what changed in a single commit. They don't compare a new file against the 50 files that came before it.

Complexity analyzers count branches and nesting. They measure how complicated code is, not whether it contradicts its neighbors.

All of these tools evaluate files in isolation. Not one of them asks the question that actually matters: does this file's behavior contradict the behavioral contract established by the rest of the project?

That is the gap.

VibeDrift measures drift

VibeDrift reads your entire project, builds a behavioral profile of your codebase, identifies the dominant patterns and workflows, and measures the deviation of every file from that established intent.

It doesn't enforce external rules. It discovers the rules your code already follows and finds where those rules break.
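As a minimal sketch of that discovery step (not VibeDrift's actual implementation), imagine profiling each file's error-handling style, treating the majority style as the contract, and flagging the minority:

```typescript
// Sketch of "discover the dominant pattern, flag the deviants".
// FileProfile and the errorStyle labels are invented for illustration.
type FileProfile = { path: string; errorStyle: "typed-throw" | "plain-object" };

function findConventionDrift(files: FileProfile[]) {
  // Count how many files use each error-handling style.
  const counts = new Map<string, number>();
  for (const f of files) counts.set(f.errorStyle, (counts.get(f.errorStyle) ?? 0) + 1);

  // The most common style is the behavioral contract; everything else is drift.
  const dominant = [...counts.entries()].sort((a, b) => b[1] - a[1])[0][0];
  return {
    dominant,
    deviants: files.filter((f) => f.errorStyle !== dominant).map((f) => f.path),
  };
}
```

The real detectors track far more dimensions than one label per file, but the principle is the same: no external rulebook, just the codebase voting on itself.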

Five detectors analyze five dimensions of behavioral consistency:

  1. Architectural consistency — are files solving the same category of problem in the same way?
  2. Security posture — are auth, validation, and rate limiting applied uniformly?
  3. Redundancy — are there hallucinated workflows, phantom scaffolding, or duplicate logic the AI generated without knowing it already existed?
  4. Convention adherence — do naming, imports, error shapes, and async patterns stay consistent?
  5. Scaffolding hygiene — is there generated code that exists but serves no purpose?

The output is a composite score from 0 to 100 representing how behaviorally coherent your codebase is. Not how "clean" it is. Not how "complex" it is. How much it agrees with itself.

Every finding shows the dominant behavior, the deviating files, and a targeted fix.
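For illustration, a composite like this could be a weighted average of the five detector scores. The weights below are invented for the sketch, not VibeDrift's real weighting:

```typescript
// Hypothetical composite: each detector reports 0-100, the overall score
// is a weighted average. Weights here are illustrative only.
type DetectorScores = {
  architecture: number; security: number; redundancy: number;
  conventions: number; scaffolding: number; // each 0-100
};

function compositeScore(s: DetectorScores): number {
  const weights: Record<keyof DetectorScores, number> = {
    architecture: 0.25, security: 0.3, redundancy: 0.15,
    conventions: 0.2, scaffolding: 0.1,
  };
  const total = (Object.keys(weights) as (keyof DetectorScores)[])
    .reduce((sum, k) => sum + weights[k] * s[k], 0);
  return Math.round(total);
}
```

With weights like these, a single badly drifting dimension (say, security at 0 while everything else is perfect) would pull a project down to 70, which matches the intuition that one incoherent dimension matters.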

30 seconds to your score

npx @vibedrift/cli .

No install. No signup. No config. Runs locally, nothing leaves your machine.

Scanning 55 files · 6,151 LOC · TypeScript
✓ Static analysis .............. 0.8s
✓ Cross-file drift ............. 0.4s
✓ Code DNA ..................... 0.03s

58/100 · Grade D · 7 findings
Report: ./vibedrift-report.html

The report gives you your score, a breakdown across all five categories, and every finding with the file, the line, the dominant pattern, and what to change.

Deep scan — AI that understands behavior

The free scan uses static analysis and structural fingerprinting to catch drift that's visible in code patterns. It catches roughly 70% of issues.

The deep scan adds AI that understands what your code actually does, not just how it looks:

vibedrift . --deep

Only function snippets are sent for analysis. Never full files. Never git history. Processed in memory, never stored.
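As a rough sketch of that privacy boundary, snippet extraction might look like the following. This is not VibeDrift's actual extractor (which would presumably use a real parser rather than a regex), it only illustrates what "function snippets only" means:

```typescript
// Sketch: pull out top-level function bodies and send nothing else.
// A regex plus brace-matching is fragile (strings containing braces would
// break it); a production tool would walk a proper AST instead.
function extractFunctionSnippets(source: string): string[] {
  const snippets: string[] = [];
  const re = /function\s+\w+\s*\([^)]*\)\s*\{/g;
  let m: RegExpExecArray | null;
  while ((m = re.exec(source)) !== null) {
    // Walk forward from the opening brace to its matching close.
    let depth = 0;
    let i = m.index + m[0].length - 1;
    do {
      if (source[i] === "{") depth++;
      else if (source[i] === "}") depth--;
      i++;
    } while (depth > 0 && i < source.length);
    snippets.push(source.slice(m.index, i));
  }
  return snippets;
}
```

The point is what never leaves the function boundary: file paths, comments between functions, and anything outside a function body stay local.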

Every account gets 1 free deep scan on signup.

What I found in my own code

| Project         | Files | Score | Grade | Top Issue                        |
| --------------- | ----- | ----- | ----- | -------------------------------- |
| acme-api        | 55    | 42    | D     | 3 competing data access patterns |
| mixstream-web   | 34    | 58    | D     | auth missing on 2 admin routes   |
| vibelang-stdlib | 44    | 68    | C     | hallucinated CRUD on 3 endpoints |

Every project worked. Tests passed. Users were fine. But under the surface, the AI had been making contradictory behavioral decisions for weeks without anyone noticing.

CI/CD — catch drift before it merges

name: VibeDrift
on: [pull_request]

jobs:
  drift-check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npx @vibedrift/cli . --json --fail-on-score 70
        env:
          VIBEDRIFT_TOKEN: ${{ secrets.VIBEDRIFT_TOKEN }}

--fail-on-score 70 blocks the merge if behavioral coherence drops below your threshold.

The bigger problem

VibeDrift measures drift that already exists. But the deeper question is: why can't code express intent in the first place?

A CLAUDE.md or .cursorrules file can say "use repository pattern." But those are guidelines, not enforcement. Across teams, tools, and months of development, they erode.

That's why I'm also building VibeLang — a language where behavioral intent is a compiler-enforced construct. The AI can't deviate because the language won't compile code that contradicts the declared architecture.

VibeDrift diagnoses. VibeLang prevents. But that's a story for another post.

Run it

npx @vibedrift/cli .

Website · npm

Drop your score in the comments. If VibeDrift flags something it shouldn't, tell me that too. It makes the tool better.


Free to scan locally. Free tier includes 3 deep scans/month. Pro $15/mo (50 deep scans), Scale $30/mo (100 deep scans), $1/scan overage on any paid tier.