Drop merge=union for sweep state CSVs (#2754)#3043
Merged
Conversation
.gitattributes registered .claude/sweep-*-state.csv and .codex/sweep-*-state.csv with merge=union. These files are keyed by module (one row per module) with a single header, not an append-only log. The union driver concatenates both sides of a conflicting hunk instead of raising a conflict. When a branch was based on a stale copy, the merge silently produced duplicate header lines and duplicate module rows. It happened on main with the #2712 squash merge: the file went from 9 lines to 43. The earlier single-line-record fix (#2680/#2684) kept embedded newlines from splitting a record, but union still duplicated whole rows and still mixed formats across stale branches. Switch both paths to merge=text (git's built-in 3-way text driver) so a real divergence surfaces as a normal conflict to resolve by hand rather than corrupting the file silently. I considered -merge (always-conflict, take-one-side) but rejected it: in a git sandbox it silently dropped the other branch's rows, trading one silent-corruption mode for a silent-data-loss mode. Also updates the merge-driver references across the sweep command docs to match, and adds a git-sandbox regression test that reproduces the stale-branch merge and asserts it conflicts instead of unioning. The single-line-record writer stays (clean diffs, predictable conflicts); only its rationale comment changed to drop the removed driver name.
brendancol
commented
Jun 8, 2026
brendancol
left a comment
Contributor
Author
There was a problem hiding this comment.
PR Review: Drop merge=union for sweep state CSVs (#2754)
This is a repo-tooling / git-attributes change. There's no xrspatial runtime code in the diff, so the geospatial-correctness, backend-parity, dask, and cupy sections of the checklist don't apply. I focused on whether the fix is correct and whether the regression test actually guards it.
Blockers (must fix before merge)
None.
Suggestions (should fix, not blocking)
xrspatial/tests/test_sweep_state_csv_merge_2754.py:131: the assertionmerged.count("slope,") <= 2does not distinguish a conflict from a union. I checked it: under the old merge=union driver the merged file also contains exactly twoslope,lines (slope,2026-06-01andslope,2026-01-01), so the count is 2 in both the broken and the fixed case. The real guards are thereturncode != 0check and the conflict-marker check, which are correct. Either drop the slope-count assertion or make it catch the union shape (for example, assert exactly one header line, since union duplicates the header). As written it implies protection it doesn't provide.
Nits (optional improvements)
xrspatial/tests/test_sweep_state_csv_merge_2754.py:84:test_gitattributes_sweep_csv_not_union(tmp_path)acceptstmp_pathbut never uses it. Drop the parameter.- The test shells out to the system git and reads the repo's real
.gitattributes. If a contributor's global git config defined a custom driver literally namedtext,merge=textwould resolve to it instead of the built-in. Remote edge case, not worth guarding, but a one-line comment notingtextis git's built-in driver name would help.
What looks good
- merge=text is the right call. The PR body documents that -merge was tested and rejected because it silently dropped a side's rows, which is the correct reason to avoid it.
- The doc updates are thorough: every stale "auto-merge without conflict" claim across the seven sweep files and deep-sweep was corrected, and the single-line
_onelinewriter was kept (it's still useful) with only its comment updated. - Red/green was verified both ways (fails under union, passes under text), which is what makes this test worth having.
Checklist
- Fix matches the issue's Option 1
- Regression test reproduces the reported scenario
- Test verified red under the old behavior
- [n/a] Backend parity / dask / cupy / NaN handling (no runtime code)
- [n/a] README feature matrix (no new function)
- Docs updated to match the new merge behavior
The slope-row count assertion did not distinguish a conflict from a union (both leave two slope, lines), so it implied protection it did not provide. Replace it with a header-count assertion: union duplicates the header line, a conflict keeps one. Drop the unused tmp_path parameter from the gitattributes check, and note in _make_repo that merge=text is git's built-in driver rather than a custom one.
brendancol
commented
Jun 8, 2026
brendancol
left a comment
Contributor
Author
There was a problem hiding this comment.
Follow-up review (after 3617d3a)
All three findings from the previous pass are addressed. No new findings.
- Suggestion (weak slope-count assertion): fixed. The assertion now checks that the merged file has exactly one header line. Union duplicates the header, a conflict keeps one, so this actually distinguishes the two cases. Re-verified red under merge=union and green under merge=text.
- Nit (unused
tmp_path): fixed. Removed fromtest_gitattributes_sweep_csv_not_union. - Nit (custom-driver edge case): addressed with a comment in
_make_reponotingmerge=textis git's built-in driver and the sandbox defines no custom driver of that name.
Nothing else outstanding. Only the dismissed/addressed items remain.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
.gitattributesregistered the sweep state CSVs (.claude/sweep-*-state.csv,.codex/sweep-*-state.csv) withmerge=union. These files are keyed by module (one row per module, single header), not an append-only log. The union driver concatenates both sides of a conflicting hunk instead of conflicting, so a branch based on a stale copy merges into duplicate header lines and duplicate module rows. It happened onmainwith the #2712 squash merge (9 lines to 43). The earlier single-line-record fix (#2680/#2684) stopped embedded newlines from splitting records, but union still duplicated whole rows and still mixed formats across stale branches.This is Option 1 from the issue: drop
merge=union.Changes
.gitattributes: switch both sweep-CSV globs tomerge=text(git's built-in 3-way text driver), so a real divergence surfaces as a normal conflict to resolve by hand instead of silently unioning._onelinewriter stays; only its rationale comment dropped the removed driver name.xrspatial/tests/test_sweep_state_csv_merge_2754.py: a git-sandbox regression test that copies the repo.gitattributes, reproduces the stale-branch merge from the issue, and asserts the merge conflicts (with markers, no duplicated rows) rather than unioning. Verified red undermerge=unionand green undermerge=text.Tradeoff considered
I tested
-merge(always conflict, take one side) as an alternative. In the same sandbox it silently dropped the other branch's rows, trading silent corruption for silent data loss.merge=textkeeps the divergence visible as a real conflict, which is the point.Backend coverage
Not applicable. This is a repo-tooling / git-attributes change with no xrspatial runtime surface.
Test plan
pytest xrspatial/tests/test_sweep_state_csv_merge_2754.pypassesmerge=unionis restored (red/green)Closes #2754