Per-file batch fixing for --fix option #5550
You've opened the pull request against the latest branch 2.2.x. PHPStan 2.2 is not going to be released for months. If your code is relevant on 2.1.x and you want it to be released sooner, please rebase your pull request and change its target to 2.1.x. |
70b6e85 to 3172df5
3172df5 to 5f1db78
|
@phpstan-bot switched to 2.1.x 😘 |
df4076b to 865f049
ondrejmirtes
left a comment
- How do you handle traits? Those files might be "fixed" from different usages at different points during the analysis.
- I'm torn about a problem I didn't realize earlier. Let's say we have two errors, both fixable, one ignored and one not. Running --fix will fix them both, right? We probably can't solve it because we need the one diff per file for performance, and at this point we don't (and should not) know which errors are ignored and which are not.
|
Also, depending on the answer to the previous question, I would probably not attach the diff to an Error anymore. With one diff per file, it doesn't make sense. So we need a new field for this in the result cache, and merging logic in the ResultCacheManager too.
|
At the same time it'd also be useful to track how many errors applying a fix fixes... |
d0ddada to 3fe8754
|
Good points.
Traits
It's the safest solution, I doubt anything more complicated would be even useful.
Ignored errors fixer awareness
It is again an "if in doubt, don't fix it" approach. Fixes are applied only in cases we are certain about. Semantically, the approach expects baseline errors to be fixed in future while the inline ignores are intentional and will stay there.
Fixes attached to a file
Done
|
This is suddenly a lot of lines of code to review and I'm a bit skeptical. So I'm going to ask questions instead. So the aim was to have one diff per file to be able to apply it effectively. But that means we suddenly have fixes for both ignored and unignored errors, meaning running What I worry about most is that How is this information now stored in the result cache? |
Each file's fixable errors are now applied in a single AST traversal that produces one diff per file, replacing the previous one-diff-per-error pipeline merged via PhpMerge.
Why
Previously, every RuleErrorBuilder::fixNode() call produced its own unified diff. Patcher::applyDiffs() then merged N diffs per file using PhpMerge::mergeHunks(), looping with a full-file Differ::diffToArray() per hunk. Cost grows quadratically with hunk count.
How
BatchReplacingNodeVisitor matches all of a file's pending fixes via origNode attribute in a single NodeTraverser pass.
$scope->getType() works inside callables) and the result node is memoized.
Patcher::applyDiffs() is unchanged in shape but only ever sees one diff per file in the new path.
Benchmark
Via new PatcherBenchmarkTest, executed on AMD 9950X3D
Real-world test
1977-file project, using two custom fixable rules with 376 violations across 145 files, some of them with 2-4k LoC. Without cache, analysis ran for 53.1s, with no measurable change from the previous state. Fix ran for 62.3s, adding 9.2s on top of analysis.
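To illustrate the single-pass idea described under How, here is a minimal sketch in plain PHP (8.0+). It is not the PR's code and does not use PHP-Parser: the Node and BatchReplacer classes, the render() helper, and the callableRuns counter are all hypothetical stand-ins. It only shows the shape of the technique: every pending fix for a file is keyed by the identity of its original node, one traversal applies them all, and the result of each fix callable is memoized.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for an AST node; this is NOT PhpParser\Node.
final class Node
{
    /** @param list<Node> $children */
    public function __construct(
        public string $text,
        public array $children = [],
    ) {
    }
}

// Hypothetical batch replacer: collects all pending fixes for one file,
// keyed by the object identity of the node each fix targets.
final class BatchReplacer
{
    /** @var array<int, callable> fix callables keyed by spl_object_id() */
    private array $fixes = [];

    /** @var array<int, Node> memoized results, so each callable runs at most once */
    private array $memo = [];

    public int $callableRuns = 0;

    public function addFix(Node $target, callable $fix): void
    {
        $this->fixes[spl_object_id($target)] = $fix;
    }

    // One depth-first pass over the tree that applies every registered fix.
    // A replaced node is returned as-is; its children are not visited further.
    public function traverse(Node $node): Node
    {
        $id = spl_object_id($node);
        if (isset($this->fixes[$id])) {
            if (!isset($this->memo[$id])) {
                $this->callableRuns++;
                $this->memo[$id] = ($this->fixes[$id])($node);
            }
            return $this->memo[$id];
        }

        $children = array_map(fn (Node $child): Node => $this->traverse($child), $node->children);

        return new Node($node->text, $children);
    }
}

// Hypothetical printer: concatenates node text depth-first.
function render(Node $node): string
{
    return $node->text . implode('', array_map('render', $node->children));
}

$a = new Node("foo();\n");
$b = new Node("bar();\n");
$c = new Node("baz();\n");
$file = new Node('', [$a, $b, $c]);

$replacer = new BatchReplacer();
$replacer->addFix($a, fn (Node $n): Node => new Node("fooFixed();\n"));
$replacer->addFix($c, fn (Node $n): Node => new Node("bazFixed();\n"));

// Both fixes are applied in a single traversal, yielding one new source string.
$fixedSource = render($replacer->traverse($file));
echo $fixedSource;
```

With one fixed source per file, a single diff against the original can be computed at the end, instead of merging one diff per fix. In this sketch a fix nested inside an already replaced node would be skipped; how the real visitor treats that case is not stated in this conversation.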