Skip to content

Fix deadlock in cmdio tea program queueing#5505

Merged
simonfaltum merged 2 commits into
mainfrom
simonfaltum/b10-cmdio-tea-deadlock
Jun 10, 2026
Merged

Fix deadlock in cmdio tea program queueing#5505
simonfaltum merged 2 commits into
mainfrom
simonfaltum/b10-cmdio-tea-deadlock

Conversation

@simonfaltum

Copy link
Copy Markdown
Member

Why

Found during a full-repo review of the CLI. acquireTeaProgram in libs/cmdio is documented to queue a new tea.Program behind an active one, but it waits on the teaDone channel while holding teaMu. releaseTeaProgram needs that same mutex to close the channel, so the moment two programs overlap, both goroutines deadlock and the CLI hangs. It is latent today because callers happen to serialize, but any future concurrent spinner or prompt would trip it.

Changes

Before, starting a second tea.Program while one was active deadlocked; now the second one waits for the active program to finish and then runs. acquireTeaProgram copies teaDone under the lock, releases the lock while waiting on the channel, and re-checks in a loop before registering, since another acquirer may take the slot first. releaseTeaProgram and Wait are unchanged.

Test plan

  • Added TestAcquireTeaProgramWaitsForRelease in libs/cmdio/io_test.go: starts a second acquire while a program is active, asserts it queues, then asserts it completes after release, with a timeout guarding against deadlock
  • Verified the new test fails against the previous implementation (times out at the release step)
  • go test -race ./libs/cmdio/... passes
  • ./task fmt-q, ./task lint-q, and ./task checks pass

This pull request and its description were written by Isaac.

acquireTeaProgram blocked on teaDone while holding teaMu, but releaseTeaProgram needs teaMu to close that channel, so queueing a second tea.Program behind an active one deadlocked both goroutines. Wait with the lock released and re-check in a loop before registering.

Co-authored-by: Isaac
@simonfaltum simonfaltum requested a review from pietern June 9, 2026 20:22
@simonfaltum simonfaltum temporarily deployed to test-trigger-is June 9, 2026 20:22 — with GitHub Actions Inactive
@simonfaltum simonfaltum temporarily deployed to test-trigger-is June 9, 2026 20:22 — with GitHub Actions Inactive
@eng-dev-ecosystem-bot

eng-dev-ecosystem-bot commented Jun 9, 2026

Copy link
Copy Markdown
Collaborator

Commit: 4d5c4c4

Run: 27239645533

Env 🟨​KNOWN 💚​RECOVERED 🙈​SKIP ✅​pass 🙈​skip Time
🟨​ aws linux 7 15 261 928 7:39
💚​ aws windows 7 15 263 926 8:59
💚​ aws-ucws linux 7 15 357 842 8:25
💚​ aws-ucws windows 7 15 359 840 11:00
💚​ azure linux 1 17 264 926 6:22
💚​ azure windows 1 17 266 924 10:05
💚​ azure-ucws linux 1 17 362 838 8:04
💚​ azure-ucws windows 1 17 364 836 11:41
💚​ gcp linux 1 17 260 929 8:30
💚​ gcp windows 1 17 262 927 13:12
22 interesting tests: 15 SKIP, 7 KNOWN
Test Name aws linux aws windows aws-ucws linux aws-ucws windows azure linux azure windows azure-ucws linux azure-ucws windows gcp linux gcp windows
🟨​ TestAccept 🟨​K 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R
🙈​ TestAccept/bundle/invariant/no_drift 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
🙈​ TestAccept/bundle/resources/permissions 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
🟨​ TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/with_permissions 🟨​K 💚​R 💚​R 💚​R 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
🟨​ TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/with_permissions/DATABRICKS_BUNDLE_ENGINE=direct 🟨​K 💚​R 💚​R 💚​R
🟨​ TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/with_permissions/DATABRICKS_BUNDLE_ENGINE=terraform 🟨​K 💚​R 💚​R 💚​R
🟨​ TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/without_permissions 🟨​K 💚​R 💚​R 💚​R 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
🟨​ TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/without_permissions/DATABRICKS_BUNDLE_ENGINE=direct 🟨​K 💚​R 💚​R 💚​R
🟨​ TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/without_permissions/DATABRICKS_BUNDLE_ENGINE=terraform 🟨​K 💚​R 💚​R 💚​R
🙈​ TestAccept/bundle/resources/postgres_branches/basic 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
🙈​ TestAccept/bundle/resources/postgres_branches/recreate 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
🙈​ TestAccept/bundle/resources/postgres_branches/replace_existing 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
🙈​ TestAccept/bundle/resources/postgres_branches/update_protected 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
🙈​ TestAccept/bundle/resources/postgres_branches/without_branch_id 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
🙈​ TestAccept/bundle/resources/postgres_endpoints/basic 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
🙈​ TestAccept/bundle/resources/postgres_endpoints/recreate 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
🙈​ TestAccept/bundle/resources/postgres_projects/update_display_name 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
🙈​ TestAccept/bundle/resources/synced_database_tables/basic 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
🙈​ TestAccept/bundle/resources/vector_search_endpoints/drift/recreated_same_name 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
🙈​ TestAccept/bundle/resources/vector_search_indexes/basic 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
🙈​ TestAccept/bundle/resources/vector_search_indexes/grants/select 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
🙈​ TestAccept/ssh/connection 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
Top 29 slowest tests (at least 2 minutes):
duration env testname
5:59 gcp windows TestAccept
5:38 azure-ucws windows TestAccept
5:05 azure windows TestAccept
4:59 aws-ucws windows TestAccept
4:58 aws windows TestAccept
4:51 gcp windows TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
4:31 gcp linux TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
4:19 gcp windows TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
4:16 aws-ucws windows TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
4:06 gcp linux TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
3:43 aws-ucws windows TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
3:40 azure-ucws windows TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
3:22 azure windows TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
3:14 azure-ucws windows TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
2:59 azure windows TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
2:58 aws-ucws linux TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
2:58 aws linux TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
2:58 aws windows TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
2:55 azure linux TestAccept
2:51 gcp linux TestAccept
2:51 aws-ucws linux TestAccept
2:49 azure-ucws linux TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
2:49 azure-ucws linux TestAccept
2:46 azure linux TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
2:37 azure-ucws linux TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
2:34 aws-ucws linux TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
2:26 aws linux TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
2:26 azure linux TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
2:25 aws windows TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct

Co-authored-by: Isaac

@pietern pietern left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I want to get rid of this when we can and allocate a single tea.Program for the lifetime of the command instead of swapping in and out. Needs a close look.

@simonfaltum simonfaltum added this pull request to the merge queue Jun 10, 2026
Merged via the queue into main with commit 3795843 Jun 10, 2026
25 checks passed
@simonfaltum simonfaltum deleted the simonfaltum/b10-cmdio-tea-deadlock branch June 10, 2026 09:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants