Using LLMs to Refactor Legacy PHP Codebases Safely

March 3, 2026·9 min read·By Abimael Espinoza

LLMs are the best refactoring tool ever invented for PHP — and the easiest way to silently introduce bugs at scale. Here's the workflow I use to get the productivity gain without the regressions.

The setup: guardrails first

Before any LLM-driven refactor, you need three things in place. Without them, you're gambling.

PHPStan at level 5 or higher on the module being refactored, or Psalm at a comparable strictness (Psalm's error levels run the other way: 1 is the strictest).
PHPUnit tests covering the public behavior of that module — at minimum the happy path and 2–3 edge cases.
Version control + small commits — every refactor step should be revertible in one click.

Workflow: the 4-step refactor loop

1. Pin the behavior with characterization tests

Ask the LLM to write tests describing what the legacy code currently does — bugs and all. These tests don't validate correctness; they validate that your refactor doesn't change behavior.
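
A minimal sketch of what pinning behavior can look like, written in plain PHP so it runs without PHPUnit installed. The function legacy_format_price() and the sample inputs are hypothetical stand-ins for your own legacy code; in a real project the same input table would more likely feed a PHPUnit data provider.

```php
<?php
declare(strict_types=1);

// Hypothetical legacy function, quirks included (loose comparison, no types).
function legacy_format_price($amount, $currency = 'USD')
{
    if ($currency == 'USD') {
        return '$' . number_format((float) $amount, 2);
    }
    return number_format((float) $amount, 2) . ' ' . $currency;
}

/**
 * Run $fn against every input and record what it does today:
 * either the return value or the class of the exception thrown.
 *
 * @param array<string, list<mixed>> $inputs label => positional arguments
 */
function characterize(callable $fn, array $inputs): array
{
    $snapshot = [];
    foreach ($inputs as $label => $args) {
        try {
            $snapshot[$label] = ['returns' => $fn(...$args)];
        } catch (\Throwable $e) {
            $snapshot[$label] = ['throws' => get_class($e)];
        }
    }
    return $snapshot;
}

// Inputs chosen to cover the happy path plus a few odd cases.
$inputs = [
    'happy path'     => [1234.5],
    'other currency' => [10, 'EUR'],
    'numeric string' => ['7.1'],
    'garbage input'  => ['abc'], // current behavior: treated as 0, not rejected
];

// Capture the golden snapshot BEFORE refactoring and commit it with the code.
$golden = characterize('legacy_format_price', $inputs);

// After refactoring, run the new implementation through the same inputs
// and require an identical snapshot, for example:
// $after = characterize('refactored_format_price', $inputs);
// assert($after === $golden);
```

Note that the 'garbage input' case records a likely bug on purpose. If the refactor "fixes" it, the snapshot changes, and that should be a deliberate, separately reviewed commit.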

2. Refactor in narrow scopes

One file, one class, one method at a time. Long prompts produce long mistakes. Good first targets: extract a service from a fat controller, replace inline SQL with an Eloquent query (in a Laravel codebase), convert a switch to a strategy pattern.
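
To illustrate how small a good target is, here is a sketch of the switch-to-strategy conversion. The shipping domain, class names, and rates are invented for the example. Keeping the legacy function alongside the new classes for one commit lets you compare both on the same inputs.

```php
<?php
declare(strict_types=1);

// Before: hypothetical legacy function with a switch on a type string.
function legacy_shipping_cost(string $method, float $weightKg): float
{
    switch ($method) {
        case 'standard':
            return 5.0 + 0.5 * $weightKg;
        case 'express':
            return 12.0 + 1.0 * $weightKg;
        default:
            throw new InvalidArgumentException("Unknown method: $method");
    }
}

// After: one small class per branch behind a shared interface.
interface ShippingStrategy
{
    public function cost(float $weightKg): float;
}

final class StandardShipping implements ShippingStrategy
{
    public function cost(float $weightKg): float
    {
        return 5.0 + 0.5 * $weightKg;
    }
}

final class ExpressShipping implements ShippingStrategy
{
    public function cost(float $weightKg): float
    {
        return 12.0 + 1.0 * $weightKg;
    }
}

final class ShippingCalculator
{
    /** @param array<string, ShippingStrategy> $strategies */
    public function __construct(private readonly array $strategies)
    {
    }

    public function cost(string $method, float $weightKg): float
    {
        // Same failure mode as the legacy default branch.
        $strategy = $this->strategies[$method]
            ?? throw new InvalidArgumentException("Unknown method: $method");

        return $strategy->cost($weightKg);
    }
}

$calculator = new ShippingCalculator([
    'standard' => new StandardShipping(),
    'express'  => new ExpressShipping(),
]);
```

The new code deliberately throws the same exception type for unknown methods, so the characterization tests stay green.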

3. Run the full guardrail suite after every change

Static analysis (PHPStan / Psalm).
Tests (PHPUnit).
Linter (PHP-CS-Fixer).
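Composer autoloader regeneration (composer dump-autoload) if you moved files. Do this before the other checks, since classmap or optimized autoloaders will not find the moved classes otherwise.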
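
One way to make the suite a single command is a small fail-fast runner. This is a sketch, not a drop-in script: the commands assume a Composer project with the tools installed under vendor/bin, and run_guardrails() is a hypothetical helper. The executor is injected so the loop can be tested without the real tools.

```php
<?php
declare(strict_types=1);

/**
 * Run each guardrail command in order and stop at the first failure.
 * $run executes one shell command and returns its exit code.
 *
 * @param array<string, string> $commands label => shell command
 * @return array<string, int>   label => exit code for every command that ran
 */
function run_guardrails(array $commands, callable $run): array
{
    $results = [];
    foreach ($commands as $label => $command) {
        $results[$label] = $run($command);
        if ($results[$label] !== 0) {
            break; // fail fast: later checks are noise until this one is fixed
        }
    }
    return $results;
}

// Assumed commands; adjust paths and flags to your project.
// The autoloader goes first so the later checks see any moved classes.
$commands = [
    'autoload' => 'composer dump-autoload',
    'phpstan'  => 'vendor/bin/phpstan analyse --no-progress',
    'phpunit'  => 'vendor/bin/phpunit',
    'cs-fixer' => 'vendor/bin/php-cs-fixer fix --dry-run --diff',
];

// Real executor: stream the tool's output and return its exit code.
$shell = static function (string $command): int {
    $exitCode = 1; // assume failure unless passthru() reports otherwise
    passthru($command, $exitCode);
    return $exitCode;
};

// In a project checkout you would call:
// $results = run_guardrails($commands, $shell);
// exit(array_filter($results) === [] ? 0 : 1);
```

Running the checks in a fixed order and stopping at the first red one keeps the feedback to the LLM short: one failure, one fix, one rerun.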

4. Human review for intent, not syntax

The LLM handles syntax. You handle two questions: 'Does this still match what the business needs?' and 'Will the next engineer understand this?'

Prompt patterns that work

Provide the existing code + the test that must keep passing + your target pattern. 'Refactor X to pattern Y while keeping test Z green.'
Constrain the output: 'no new dependencies, PHP 8.2 syntax, PSR-12 style.'
Ask for the refactor plan first; review; then execute.
After each change, ask: 'what did you change and why?'
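
If you reuse these patterns often, it can help to assemble the prompt in code so that no piece gets forgotten. The sketch below is one possible shape; the function name, parameters, and template wording are all illustrative, and the placeholder strings stand in for real file contents.

```php
<?php
declare(strict_types=1);

/**
 * Assemble a refactor prompt from the pieces listed above.
 * All parameter names and the template wording are illustrative.
 *
 * @param list<string> $constraints
 */
function build_refactor_prompt(
    string $code,
    string $test,
    string $targetPattern,
    array $constraints,
    bool $planOnly = true
): string {
    $lines = [
        "Refactor the code below to {$targetPattern}.",
        'The test below must keep passing unchanged.',
        'Constraints: ' . implode(', ', $constraints) . '.',
        $planOnly
            ? 'Do not write code yet. Reply with a numbered refactor plan only.'
            : 'Apply the approved plan, then list what you changed and why.',
        '',
        '--- CODE ---',
        $code,
        '',
        '--- TEST ---',
        $test,
    ];

    return implode("\n", $lines);
}

// First pass: ask for the plan only (the default).
$prompt = build_refactor_prompt(
    code: '<?php /* contents of the class under refactor */',
    test: '<?php /* contents of the characterization test */',
    targetPattern: 'a strategy pattern',
    constraints: ['no new dependencies', 'PHP 8.2 syntax', 'PSR-12 style'],
);
```

Once the plan is reviewed, a second call with planOnly set to false requests the change together with the 'what and why' summary.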

What to never let the LLM do unsupervised

Database schema migrations.
Authentication / authorization logic.
Money-handling code.
Cryptography.
Anything in a try/catch where the catch is silently swallowing errors.
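
The last item is the easiest to miss, so it can be worth scanning for it before handing a file to the LLM. Below is a rough heuristic, assuming the tokenizer extension is available (it is by default): it flags catch blocks whose body is empty or holds only comments. count_silent_catches() is a hypothetical helper, and it will miss subtler cases such as a catch that only returns null.

```php
<?php
declare(strict_types=1);

function count_silent_catches(string $source): int
{
    // Drop comments so a catch body that holds only "// ignore" counts as empty.
    $code = '';
    foreach (PhpToken::tokenize($source) as $token) {
        if (!$token->is([T_COMMENT, T_DOC_COMMENT])) {
            $code .= $token->text;
        }
    }

    // A catch clause followed by a body containing only whitespace.
    return (int) preg_match_all('/\bcatch\s*\([^)]*\)\s*\{\s*\}/', $code);
}

// Sample input: the first catch swallows the error, the second logs it.
$sample = <<<'PHP'
<?php
try {
    chargeCustomer($order);
} catch (Exception $e) {
    // ignore
}
try {
    sendReceipt($order);
} catch (MailException $e) {
    $logger->warning($e->getMessage());
}
PHP;

$silent = count_silent_catches($sample);
```

Any file with a non-zero count goes on the human-only list until someone has decided what each swallowed error should actually do.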

A real example

Last quarter I migrated a 14k-line legacy PHP 7.2 codebase to PHP 8.3 with this workflow. About 70% of the syntax-level work was automated: Rector handled the rule-based upgrades, and LLM-driven file-by-file passes handled the rest. Humans reviewed every PR and made the architectural decisions. Total time: ~3 weeks vs. an estimated 8 weeks without AI assistance, with fewer regression bugs than a comparable 100% manual project.

When AI refactoring fails

When tests don't exist — the LLM has nothing to anchor against.
When the codebase has unique conventions the model wasn't trained on.
When 'refactor' really means 'redesign' — that still requires a human architect.
