Skip to content

Integrate initial version of QoLA#550

Merged
Micky774 merged 10 commits into
devfrom
zain/qola
Apr 24, 2026
Merged

Integrate initial version of QoLA#550
Micky774 merged 10 commits into
devfrom
zain/qola

Conversation

@Micky774
Copy link
Copy Markdown
Contributor

Description

This PR integrates QoLA as our main mechanism of building AITER, slotting it into existing infrastructure such as the aiter-prebuilt cache system.

Fixes # (issue)

Type of change

  • Documentation change (change only to the documentation, either a fix or a new content)
  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Infra/Build change
  • Code refactoring

Changes

Please list the changes introduced in this PR:

  • Introduces QoLA submodule
  • Updates cmake files to include QoLA and use namespace-prefixed library names
  • Removes AITER_ASM_DIR machinery in TE
  • Exchanges aiter:: namespace for a macro-driven QOLA_NS(*) wrapper

Checklist:

  • I have read and followed the contributing guidelines
  • The functionality is complete
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

@Micky774 Micky774 added the ci-level 1 CI test level 1 label Apr 20, 2026
Comment thread 3rdparty/QoLA
@@ -56,24 +56,39 @@ list(JOIN V3_ASM_ARCHS ";" V3_ASM_ARCHS_STR)

if(DEFINED AITER_MHA_PATH)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have we tested all possible paths (like with the aiter prebuilt env, NVTE_FUSED_ATTN_CK_PATH, and so on)?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No not completely. I tested locally with a source build, but still need to test w/ the pre-built. The NVTE_FUSED_ATTN_CK_PATH path is tested on subsequent re-builds where we can skip the initial qola build, so that's tested too indirectly.

I did just push some changes to QoLA and the TE integration, so I'll re-test via source build and then run an aiter-prebuilt flow for it

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's do a thorough tests for those flows since ci will only test the default flow. Thanks

Comment thread transformer_engine/common/ck_fused_attn/CMakeLists.txt
Comment thread transformer_engine/common/ck_fused_attn/CMakeLists.txt
Comment thread .gitmodules
@Micky774
Copy link
Copy Markdown
Contributor Author

Tracking AITER pre-built upload here and will run local tests on the pre-built path when finished.

Copy link
Copy Markdown
Collaborator

@ipanfilo ipanfilo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With QoLA AITER, 3rdparty/aiter has to be removed unless we plan to pass AITER location instead of AITER commit to QoLA

--install-dir ${__AITER_MHA_PATH}
--gpu-archs "${V3_ASM_ARCHS_STR}"
--ck-tile-bf16 ${CK_FUSED_ATTN_FLOAT_TO_BFLOAT16_DEFAULT}
COMMAND ${CMAKE_COMMAND} -E env "PYTHONPATH=${__QOLA_DIR}:$ENV{PYTHONPATH}"
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should it rather include it as subproject? COMMAND was used for aiter_build because it is shell script

target_link_directories(ck_fused_attn PUBLIC ${__AITER_MHA_PATH})
set(__QOLA_TE_NAMESPACE_PREFIX "te_")
set(__QOLA_MHA_FWD_LIB "${__QOLA_TE_NAMESPACE_PREFIX}libmha_fwd.so")
set(__QOLA_MHA_BWD_LIB "${__QOLA_TE_NAMESPACE_PREFIX}libmha_bwd.so")
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other places use hardcoded te_libmha_* so let;s be consistent and do the same here

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wangye805 I'm inclined to agree with Ilya here unless we expect or intend to support a pattern of consumers utilizing multiple conflicting versions of TE simultaneously.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I'm okay with both

Comment thread transformer_engine/common/ck_fused_attn/qola_manifest.toml Outdated
@Micky774
Copy link
Copy Markdown
Contributor Author

Failure seems unrelated btw

@Micky774 Micky774 requested a review from ipanfilo April 22, 2026 13:13
@Micky774
Copy link
Copy Markdown
Contributor Author

Micky774 commented Apr 23, 2026

Current CI on this PR is source-build. CI run w/ pre-built download: https://github.com/ROCm/TransformerEngine/actions/runs/24856277623

@Micky774 Micky774 merged commit 291d21e into dev Apr 24, 2026
10 of 17 checks passed
@Micky774 Micky774 deleted the zain/qola branch April 24, 2026 14:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ci-level 1 CI test level 1

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants