add base_layer suffix for expert weights #159

Merged
tastelikefeet merged 1 commit into modelscope:main from hjh0119:expert-lora
Apr 15, 2026

@hjh0119 hjh0119 commented Apr 15, 2026

No description provided.

@tastelikefeet tastelikefeet merged commit 3662310 into modelscope:main Apr 15, 2026
1 of 3 checks passed

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request modifies the _add_base_layer_suffix function in megatron.py to handle expert-related parameters. A critical issue was identified where the code could trigger an UnboundLocalError because the variable base_layer_name might be undefined if a parameter name contains 'experts' but does not end with the expected '.weight' or '.bias' suffixes.

Comment on lines +1435 to +1436
    if 'experts' in name:
        return base_layer_name
Priority: high

This logic introduces a potential UnboundLocalError. The variable base_layer_name is only defined within the if name.endswith('.weight') or elif name.endswith('.bias') blocks. If name contains 'experts' but does not end with either suffix (e.g., a buffer or a router parameter), the code will crash when attempting to return base_layer_name.

Furthermore, this change bypasses the model_keys validation for expert weights. If this bypass is intentional (to force the suffix for experts regardless of whether the key was found in model_keys), you should still ensure the variable is defined before returning it.
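
The failure mode can be reproduced in isolation. The sketch below uses a hypothetical stand-in, add_suffix_unguarded, rather than the real _add_base_layer_suffix (whose full body is not shown in this thread); the suffix construction and the model_keys check are assumptions made only to mirror the control flow described above.

```python
def add_suffix_unguarded(name, model_keys):
    """Hypothetical stand-in; not the actual megatron.py implementation."""
    if name.endswith('.weight'):
        base_layer_name = name[:-len('.weight')] + '.base_layer.weight'
    elif name.endswith('.bias'):
        base_layer_name = name[:-len('.bias')] + '.base_layer.bias'
    # If neither branch ran, base_layer_name is still unbound here.
    if 'experts' in name:
        return base_layer_name  # raises UnboundLocalError for other suffixes
    # Assumed validation: rename only when the suffixed key exists in the model.
    if name.endswith(('.weight', '.bias')) and base_layer_name in model_keys:
        return base_layer_name
    return name


# A made-up expert parameter name with no '.weight'/'.bias' suffix.
try:
    add_suffix_unguarded('mlp.experts.some_buffer', set())
except UnboundLocalError as exc:
    print(f'crashed as described: {type(exc).__name__}')
```

Expert names that do end in '.weight' or '.bias' still work in this sketch, which is why the problem may not surface until a differently named parameter appears.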

Suggested change
-    if 'experts' in name:
-        return base_layer_name
+    if 'experts' in name and (name.endswith('.weight') or name.endswith('.bias')):
+        return base_layer_name
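
An alternative to tightening the condition is to bind the variable before any branch can read it. This is a minimal sketch under assumptions, not the merged code: add_suffix_guarded, SUFFIXES, and the parameter names in the usage lines are hypothetical, and the model_keys check is a guess at the validation mentioned in the comment above.

```python
SUFFIXES = ('.weight', '.bias')


def add_suffix_guarded(name, model_keys):
    """Hypothetical guarded variant; not the actual megatron.py implementation."""
    base_layer_name = None
    for suffix in SUFFIXES:
        if name.endswith(suffix):
            # Insert '.base_layer' just before the trailing suffix.
            base_layer_name = name[:-len(suffix)] + '.base_layer' + suffix
            break
    if base_layer_name is None:
        # Buffers, router parameters, etc. pass through unchanged.
        return name
    if 'experts' in name:
        # Expert weights skip the model_keys check, as in the PR.
        return base_layer_name
    # Assumed validation for everything else.
    return base_layer_name if base_layer_name in model_keys else name


print(add_suffix_guarded('mlp.experts.some_buffer', set()))
print(add_suffix_guarded('mlp.experts.fc1.weight', set()))
```

With this shape, the expert bypass stays intentional while every return path reads a variable that is guaranteed to be bound.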
