Refactor CK workspace memory management to use a unified toggle-pass helper object #552

Merged: Micky774 merged 2 commits into dev from zain/ck-workspace-refactor on Apr 23, 2026

Conversation

@Micky774
Contributor

Description

Replaced the mirrored two-pass workspace pattern (an if(workspace==nullptr) block that computes the size, plus a separate execution block that repeats the same allocation arithmetic with workspace_next) with a single code path driven by a WorkspacePlanner, which serves both the sizing pass and the execution pass. This eliminates a class of drift bugs in which the sizing pass and the execution pass could disagree.
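
For context, the mirrored pattern described above might have looked roughly like the sketch below. This is a hypothetical reconstruction, not the original code: legacy_impl, LegacyBuffers, and the two buffer sizes are invented for illustration. The point is that the sizing block and the execution block each spell out the same list of buffers, so adding a buffer to one and forgetting the other compiles cleanly but misreports the workspace size.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical names throughout; only the overall shape follows the PR description.
struct LegacyBuffers {
  void* lse = nullptr;
  void* acc = nullptr;
};

// Pass 1 (workspace == nullptr): report the required size and return.
// Pass 2 (workspace != nullptr): carve the same buffers out of the workspace.
inline LegacyBuffers legacy_impl(void* workspace, std::size_t* workspace_size,
                                 std::size_t lse_bytes, std::size_t acc_bytes) {
  assert(workspace_size != nullptr);
  LegacyBuffers out;
  if (workspace == nullptr) {
    // Sizing block: must list every buffer that the execution block carves.
    *workspace_size = lse_bytes + acc_bytes;
    return out;
  }
  // Execution block: repeats the same arithmetic with a moving cursor.
  char* workspace_next = static_cast<char*>(workspace);
  out.lse = workspace_next;
  workspace_next += lse_bytes;
  out.acc = workspace_next;
  workspace_next += acc_bytes;
  return out;
}
```

If a third buffer were carved in the execution block without a matching term in the sizing block, the caller would allocate too little memory. That is the kind of drift the description refers to.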

Fixes # (issue)

Type of change

  • Documentation change (change only to the documentation, either a fix or a new content)
  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Infra/Build change
  • Code refactoring

Changes

Please list the changes introduced in this PR:

  • Added WorkspacePlanner class: constructor takes void* base; allocate(bytes) returns nullptr in sizing mode while still accumulating total()
  • Added byte_offset(ptr, n) and byte_diff(a, b) helpers for nullptr-safe sub-buffer carving (qkv-packed cases)
  • fused_attn_ck_fwd_impl: deleted the sizing-only if block; replaced all workspace_next arithmetic with planner.allocate(); moved init kernels (alibi gen, generate_cu_seqlen_padded, memset of O) after a single early return, if(planner.is_sizing()) return
  • fused_attn_ck_bwd_impl: same treatment plus hoisted nvte_get_qkv_layout_group(layout) to the function top (the original recomputed it inside the SBHD+padding branch); user-output dq/dk/dv memsets and workspace dq_acc/dbias/dqkv-without-padding memsets all moved past the sizing return
  • lse_workspace = workspace (which silently relied on workspace_next == workspace at that point) became an explicit planner.allocate(...) as the first allocation
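
A minimal sketch of how such a planner could work, based only on the behavior listed above. The member layout, the exact signatures, the lack of alignment handling, and the choice that byte_diff returns 0 when either pointer is null are all assumptions; demo_impl is a hypothetical caller and not a function from the PR.

```cpp
#include <cassert>
#include <cstddef>

// Sketch only: the PR's actual WorkspacePlanner may differ in signatures and details.
class WorkspacePlanner {
 public:
  explicit WorkspacePlanner(void* base) : base_(static_cast<char*>(base)) {}

  // Sizing mode means no backing buffer was supplied.
  bool is_sizing() const { return base_ == nullptr; }

  // Accumulates the running total in both modes. Returns nullptr in sizing
  // mode, otherwise the start of the newly reserved region.
  void* allocate(std::size_t bytes) {
    void* ptr = is_sizing() ? nullptr : static_cast<void*>(base_ + total_);
    total_ += bytes;
    return ptr;
  }

  std::size_t total() const { return total_; }

 private:
  char* base_;
  std::size_t total_ = 0;
};

// nullptr-safe offset: a null pointer stays null instead of becoming a bogus address.
inline void* byte_offset(void* ptr, std::size_t n) {
  return ptr == nullptr ? nullptr : static_cast<void*>(static_cast<char*>(ptr) + n);
}

// Byte distance from b to a. Returning 0 for null inputs is an assumption.
inline std::ptrdiff_t byte_diff(const void* a, const void* b) {
  if (a == nullptr || b == nullptr) return 0;
  return static_cast<const char*>(a) - static_cast<const char*>(b);
}

// Hypothetical caller: one code path serves both the sizing and the execution pass.
inline std::size_t demo_impl(void* workspace, std::size_t lse_bytes,
                             std::size_t acc_bytes) {
  WorkspacePlanner planner(workspace);
  void* lse = planner.allocate(lse_bytes);  // explicit first allocation
  void* acc = planner.allocate(acc_bytes);
  if (planner.is_sizing()) return planner.total();  // nothing below runs when sizing
  // Kernels and memsets would go here; this sketch only checks the layout.
  assert(byte_diff(acc, lse) == static_cast<std::ptrdiff_t>(lse_bytes));
  return planner.total();
}
```

Because every buffer is requested through one allocate() call, the reported size and the carved layout come from the same statements and cannot drift apart. Calling demo_impl(nullptr, ...) yields the required size, and calling it again with a buffer of that size performs the real work.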

Checklist:

  • I have read and followed the contributing guidelines
  • The functionality is complete
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

@Micky774 added the ci-level 3 (CI test level 3) label on Apr 20, 2026
@Micky774 changed the title from "Refactored to a unified toggle-pass workspace manager" to "Refactor CK workspace memory management to use a unified toggle-pass helper object" on Apr 20, 2026
Comment on lines -593 to +629
size_t q_storage_bytes = max_tokens_q*h*d_qk*nvte_dtype_size(dtype);
size_t k_storage_bytes = max_tokens_kv*hg*d_qk*nvte_dtype_size(dtype);
size_t v_storage_bytes = max_tokens_kv*hg*d_v*nvte_dtype_size(dtype);
size_t o_storage_bytes = max_tokens_q*h*d_v*nvte_dtype_size(dtype);
size_t q_storage_bytes = max_tokens_q*h*d_qk*nvte_dtype_size(dtype);
size_t k_storage_bytes = max_tokens_kv*hg*d_qk*nvte_dtype_size(dtype);
size_t v_storage_bytes = max_tokens_kv*hg*d_v*nvte_dtype_size(dtype);
size_t o_storage_bytes = max_tokens_q*h*d_v*nvte_dtype_size(dtype);
Collaborator

Just curious, what changed here? I stared at it for half a minute but still don't see any diff :-)

Contributor Author

@Micky774 Micky774 Apr 23, 2026

Trailing whitespace was removed at the end of each line. You can view diffs in the GH PR UI with whitespace changes hidden, which I find quite helpful btw.

@Micky774 Micky774 merged commit b950cbf into dev Apr 23, 2026
6 of 9 checks passed
@Micky774 Micky774 deleted the zain/ck-workspace-refactor branch April 23, 2026 15:04
Labels

ci-level 3 CI test level 3

3 participants