fix(tables): serialize schema mutations to prevent parallel column clobber#4812
Conversation
|
The latest updates on your projects. Learn more about Vercel for GitHub. |
PR SummaryMedium Risk Overview Introduces Also allows Reviewed by Cursor Bugbot for commit dd6b077. Bugbot is set up for automated code reviews on this repo. Configure here. |
Greptile SummaryThis PR fixes a silent column-clobber race in
Confidence Score: 5/5Safe to merge — the advisory-lock fix is architecturally sound, the critical section is kept tight, and all previously raised concerns (workflow load inside lock, idle timeout for updateColumnConstraints) are addressed. All 11 schema mutators are correctly serialized through the new advisory lock. The two pre-lock phase-1 patterns include proper staleness guards under the lock. The updateColumnConstraints idle-timeout gap from the prior review is fixed. The only observations are two pre-existing unguarded schema writers (updateTableMetadata, pruneStaleWorkflowGroupOutputs) that fall outside the PR's stated scope, plus a one-word doc comment correction. apps/sim/lib/table/service.ts — the two pre-existing unguarded schema writers (updateTableMetadata lines 713–766, pruneStaleWorkflowGroupOutputs lines 796–854) are the most natural follow-up candidates for serialization coverage. Important Files Changed
Sequence DiagramsequenceDiagram
participant C1 as Caller 1
participant C2 as Caller 2
participant WLT as withLockedTable
participant PG as PostgreSQL
C1->>WLT: addWorkflowGroupOutput (phase 1)
C2->>WLT: addWorkflowGroupOutput (phase 1)
Note over C1,C2: Both load workflow outside lock (parallel OK)
C1->>PG: loadWorkflowFromNormalizedTables
C2->>PG: loadWorkflowFromNormalizedTables
PG-->>C1: workflow graph
PG-->>C2: workflow graph
C1->>PG: BEGIN + setTableTxTimeouts + pg_advisory_xact_lock
Note over C1,PG: C1 acquires lock
C2->>PG: BEGIN + setTableTxTimeouts + pg_advisory_xact_lock
Note over C2,PG: C2 waits (lock_timeout=3s)
C1->>PG: SELECT table (fresh read under lock)
PG-->>C1: current schema
C1->>PG: UPDATE schema (new column added)
C1->>PG: COMMIT → lock released
Note over C1: Lock released
Note over C2: C2 now acquires lock
C2->>PG: SELECT table (sees C1 committed column)
PG-->>C2: fresh schema with C1 column
C2->>PG: UPDATE schema (adds its column on top)
C2->>PG: COMMIT
Reviews (5): Last reviewed commit: "fix(tables): scale idle timeout in updat..." | Re-trigger Greptile |
|
Addressed in d05348d: P1 — advisory lock held during workflow load ( P2 — Typecheck, lint, @greptile review |
… large type changes
|
Addressed in 9cbcfcc: Idle timeout aborts type change ( Typecheck, lint, @greptile review |
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit 9cbcfcc. Configure here.
|
Addressed in dd6b077: Stale remap types after workflow change ( Typecheck, lint, @greptile review |
|
Addressed in 704944b:
Typecheck, lint, @greptile review |

Summary
lib/table/service.ts(add/rename/delete column, change type/constraints, add/update/delete workflow group + outputs) did an unguarded read-modify-write of theschemaJSONB — read outside any lock, compute, blind-write the whole schema. Parallel calls (e.g. Mothership fanning out threeadd_workflow_group_outputcalls for separate columns) all read the same baseline and last-writer-wins, so columns silently vanished while every call reported success.withLockedTablehelper that takes a transaction-scoped advisory lock keyed ontableId, then re-reads the table inside the lock so each serialized writer computes against the prior writer's committed columns. Wired all 11 mutators through it.SELECT ... FOR UPDATEon the definition row) so it adds no edges to the row-lock graph — aFOR UPDATEwould invert lock order against the row-count trigger and risk deadlocks on the row-write path. Bumpedidle_in_transaction_session_timeoutfor the two paths that load a workflow inside the lock.Type of Change
Testing
bun run lintclean,bun run check:api-validation:strictpasseduser_tabletool tests passChecklist