Skip to content

fix(workflow-compiling-service): LogicalLink OperatorIdentity round-trip#5729

Open
wwong0 wants to merge 4 commits into
apache:mainfrom
wwong0:fix/5042-logicallink-operatoridentity-round-trip
Open

fix(workflow-compiling-service): LogicalLink OperatorIdentity round-trip#5729
wwong0 wants to merge 4 commits into
apache:mainfrom
wwong0:fix/5042-logicallink-operatoridentity-round-trip

Conversation

@wwong0

@wwong0 wwong0 commented Jun 16, 2026

Copy link
Copy Markdown

What changes were proposed in this PR?

Bug fix: LogicalLink @JsonCreator constructor (amber and workflow-compiling-service)

@JsonCreator was previously placed on the String convenience constructor of LogicalLink in both modules. OperatorIdentity is a case class that Jackson serializes as an object ({"id":"op-A"}), not a plain string. When reading back a serialized LogicalLink, Jackson dispatched to the @JsonCreator String constructor but could not coerce the {"id":"op-A"} object node to a String, causing any writeValueAsStringreadValue round-trip to fail with MismatchedInputException.

The fix introduces a private readOperatorIdentity(node: JsonNode, fieldName: String) helper in the companion object and moves @JsonCreator to a new JsonNode constructor that delegates to it. The helper accepts both the plain-string shape (front-end input) and the object shape (serialized form), maps null/absent ids to OperatorIdentity(null), and rejects malformed nodes. The String convenience constructor is retained but no longer carries @JsonCreator.

Any related issues, documentation, discussions?

Closes #5042
This PR continues and adopts work from #5175

How was this PR tested?

The new workflow-compiling-service LogicalLinkSpec adds unit coverage of LogicalLink and readOperatorIdentity model-level logic that existing tests do not cover and pins the leniency contract (no require guards in the compiler-service variant).

The existing amber LogicalLinkSpec was updated to match the renamed constructor section, drop the now-invalid MismatchedInputException expectation, and add a passing round-trip test.

Run with:

sbt "WorkflowExecutionService/testOnly *LogicalLinkSpec"

Result: 18/18 tests pass

sbt "WorkflowCompilingService/testOnly *LogicalLinkSpec"

Result: 15/15 tests pass

Was this PR authored or co-authored using generative AI tooling?

Generated-by: Claude Sonnet 4.6

…ound-trip

Prior to this change, `@JsonCreator` was placed on the `String`
convenience constructor in both `amber` and `workflow-compiling-service`
`LogicalLink`. Jackson serialises `OperatorIdentity` as an object
(`{"id":"op-A"}`), so any round-trip through `writeValueAsString` /
`readValue` failed with `MismatchedInputException` because the String
constructor cannot accept an object node.

Fix: add a private `readOperatorIdentity(JsonNode, String)` helper to
the companion object and move `@JsonCreator` to a new `JsonNode`
constructor that delegates to it. The helper handles the string shape
(front-end input), the object shape (serialised form), null/absent
fields, and rejects non-text / non-object nodes with
`IllegalArgumentException`.

The new `workflow-compiling-service` `LogicalLinkSpec` adds direct unit
coverage of the `readOperatorIdentity` branches that the existing
HTTP-layer-only `WorkflowCompilationResourceSpec` did not reach. It also
pins the leniency contract (no `require` guards) and the Jackson
annotation contract (`@JsonProperty` key pinning, round-trip fidelity).
The `amber` `LogicalLinkSpec` is updated to match the renamed constructor
section, drop the now-invalid MismatchedInputException expectation, and add
the round-trip test.

Closes apache#5042
@github-actions

Copy link
Copy Markdown
Contributor

👋 Thanks for your first contribution to Texera, @wwong0!

If you're looking for a good place to start, browse issues labeled starter-task; they're scoped to be approachable for newcomers.

You can drive common housekeeping yourself by commenting one of these commands on its own line:

  • Issues. Comment /take to assign an open issue to yourself, or /untake to release it. You can find unclaimed work with the search filter is:issue is:open no:assignee.
  • Sub-issues. To link issues into a parent/child hierarchy, comment /sub-issue #5166 #5222 on the parent to attach those children (or /unsub-issue #5166 #5222 to detach them). From a child issue, comment /parent-issue #5166 to set its parent, or /unparent-issue to clear it (the current parent is detected automatically). References may be written as #5166 or as a bare 5166; cross-repository references are not supported.
  • Pull requests (author only). Comment /request-review @user to request a review from someone, or /unrequest-review @user to withdraw that request.

Each command must match exactly: /take this will not work, only /take does. For the full contribution flow, see CONTRIBUTING.md.

@github-actions github-actions Bot added engine fix platform Non-amber Scala service paths labels Jun 16, 2026
@Yicong-Huang Yicong-Huang added the release/v1.2 back porting to release/v1.2 label Jun 17, 2026
@Yicong-Huang Yicong-Huang requested a review from Copilot June 17, 2026 05:25

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes Jackson (de)serialization for LogicalLink so OperatorIdentity fields can round-trip through objectMapper.writeValueAsStringreadValue in both the Amber engine and workflow-compiling-service, while preserving each module’s strict vs. lenient validation behavior.

Changes:

  • Move @JsonCreator off the String convenience constructor and introduce a JsonNode-based creator that can read both string-shaped and object-shaped operator ids.
  • Add a readOperatorIdentity(node, fieldName) helper to parse fromOpId/toOpId from either JSON shape and centralize error handling.
  • Add/adjust unit tests to cover the round-trip behavior and module-specific strictness/leniency contracts.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

File Description
workflow-compiling-service/src/main/scala/org/apache/texera/amber/compiler/model/LogicalLink.scala Adds JsonNode @JsonCreator + helper to support both string/object OperatorIdentity JSON shapes.
workflow-compiling-service/src/test/scala/org/apache/texera/amber/compiler/model/LogicalLinkSpec.scala New spec covering leniency contract and Jackson round-trip behavior for compiler-service LogicalLink.
amber/src/main/scala/org/apache/texera/workflow/LogicalLink.scala Mirrors the JsonNode @JsonCreator + helper in the Amber engine LogicalLink (while keeping require guards).
amber/src/test/scala/org/apache/texera/workflow/LogicalLinkSpec.scala Updates characterization tests to assert successful round-trip after the fix.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread amber/src/main/scala/org/apache/texera/workflow/LogicalLink.scala
@codecov-commenter

codecov-commenter commented Jun 17, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 53.80%. Comparing base (8a90f1f) to head (270205f).
⚠️ Report is 75 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff              @@
##               main    #5729      +/-   ##
============================================
+ Coverage     53.19%   53.80%   +0.61%     
- Complexity     2655     3003     +348     
============================================
  Files          1095     1112      +17     
  Lines         42398    43858    +1460     
  Branches       4560     4693     +133     
============================================
+ Hits          22552    23599    +1047     
- Misses        18515    18913     +398     
- Partials       1331     1346      +15     
Flag Coverage Δ *Carryforward flag
access-control-service 73.22% <ø> (+2.77%) ⬆️
agent-service 34.36% <ø> (ø) Carriedforward from a3d15a6
amber 55.05% <100.00%> (+1.57%) ⬆️
computing-unit-managing-service 1.29% <ø> (-0.36%) ⬇️
config-service 56.86% <ø> (+0.14%) ⬆️
file-service 58.59% <ø> (+1.52%) ⬆️
frontend 47.99% <ø> (ø) Carriedforward from a3d15a6
pyamber 90.13% <ø> (ø) Carriedforward from a3d15a6
python 90.80% <ø> (ø) Carriedforward from a3d15a6
workflow-compiling-service 63.92% <100.00%> (+5.22%) ⬆️

*This pull request uses carry forward flags. Click here to find out more.

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@github-actions

github-actions Bot commented Jun 17, 2026

Copy link
Copy Markdown
Contributor

⚠️ Benchmark changes need a look

🟢 0 better · 🔴 7 worse · ⚪ 8 noise (<±5%) · 0 without baseline

Compared against main 8a90f1f benchmarked on this same runner, so the delta is largely free of cross-runner hardware noise. The "7d avg" column still reflects the gh-pages dashboard. Treat <±5% as noise unless repeated.

Dashboard · Run

config throughput MB/s latency max Δ latest / 7d
🔴 bs=10 sw=10 sl=64 371 0.226 25,624/35,726/35,726 us 🔴 -13.1% / 🔴 +138.8%
🔴 bs=100 sw=10 sl=64 800 0.489 122,601/146,620/146,620 us 🔴 +5.3% / 🔴 +46.3%
bs=1000 sw=10 sl=64 918 0.56 1,085,544/1,147,347/1,147,347 us ⚪ within ±5% / 🔴 +21.4%
Baseline details

Latest main 8a90f1f from same runner

config metric PR latest main 7d avg Δ latest Δ 7d
bs=10 sw=10 sl=64 throughput 371 tuples/sec 426 tuples/sec 799.7 tuples/sec -12.9% -53.6%
bs=10 sw=10 sl=64 MB/s 0.226 MB/s 0.26 MB/s 0.488 MB/s -13.1% -53.7%
bs=10 sw=10 sl=64 p50 25,624 us 24,382 us 12,125 us +5.1% +111.3%
bs=10 sw=10 sl=64 p95 35,726 us 31,920 us 14,959 us +11.9% +138.8%
bs=10 sw=10 sl=64 p99 35,726 us 31,920 us 18,503 us +11.9% +93.1%
bs=100 sw=10 sl=64 throughput 800 tuples/sec 818 tuples/sec 1,069 tuples/sec -2.2% -25.1%
bs=100 sw=10 sl=64 MB/s 0.489 MB/s 0.499 MB/s 0.652 MB/s -2.0% -25.0%
bs=100 sw=10 sl=64 p50 122,601 us 122,101 us 92,709 us +0.4% +32.2%
bs=100 sw=10 sl=64 p95 146,620 us 139,194 us 100,227 us +5.3% +46.3%
bs=100 sw=10 sl=64 p99 146,620 us 139,194 us 109,877 us +5.3% +33.4%
bs=1000 sw=10 sl=64 throughput 918 tuples/sec 922 tuples/sec 1,114 tuples/sec -0.4% -17.6%
bs=1000 sw=10 sl=64 MB/s 0.56 MB/s 0.563 MB/s 0.68 MB/s -0.5% -17.6%
bs=1000 sw=10 sl=64 p50 1,085,544 us 1,079,210 us 899,197 us +0.6% +20.7%
bs=1000 sw=10 sl=64 p95 1,147,347 us 1,182,644 us 945,162 us -3.0% +21.4%
bs=1000 sw=10 sl=64 p99 1,147,347 us 1,182,644 us 986,231 us -3.0% +16.3%
Raw CSV
config_idx,batch_size,schema_width,string_len,num_batches,total_ms,total_tuples,total_bytes,tuples_per_sec,mb_per_sec,lat_p50_us,lat_p95_us,lat_p99_us
0,10,10,64,20,539.38,200,128000,371,0.226,25623.92,35725.97,35725.97
1,100,10,64,20,2498.78,2000,1280000,800,0.489,122600.61,146620.27,146620.27
2,1000,10,64,20,21788.16,20000,12800000,918,0.560,1085544.36,1147347.11,1147347.11

wwong0 added 2 commits June 17, 2026 05:53
…d in LogicalLink

Address review feedback on the LogicalLink `@JsonCreator` fix: the object-shape branch of `readOperatorIdentity` previously coerced a non-textual `id` (e.g. `{"id": 123}`) via `asText()`, silently turning it into `"123"`—inconsistent with the helper's contract of rejecting malformed nodes, and with the existing rejection of a non-text/non-object top-level node. Require `id` to be textual (or null/absent); otherwise throw `IllegalArgumentException`. Applied to both the amber engine and the workflow-compiling-service `LogicalLink` (identical helpers).

Extend both `LogicalLinkSpec` suites with branch-coverage tests for `readOperatorIdentity`: object-shape `id` missing, explicit null, numeric/non-textual `id`, and a top-level explicit-null op id. These pin the new rejection behavior.
@Yicong-Huang

Copy link
Copy Markdown
Contributor

Hi @wwong0, thanks for the PR. overall it looks good.

can you please make coverage higher, before I review details? you can find coverage report here #5729 (comment). thanks.

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

Comment thread amber/src/main/scala/org/apache/texera/workflow/LogicalLink.scala
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Signed-off-by: William Wong <120128836+wwong0@users.noreply.github.com>
@wwong0

wwong0 commented Jun 24, 2026

Copy link
Copy Markdown
Author

Hi @Yicong-Huang the CodeCov report updated after your comment, it now shows higher coverage.

@github-actions

Copy link
Copy Markdown
Contributor

Automated Reviewer Suggestions

Based on the git blame history of the changed files, we recommend the following reviewers:

  • Contributors with relevant context: @aglinxinyuan, @Yicong-Huang
    You can notify them by mentioning @aglinxinyuan, @Yicong-Huang in a comment.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

engine fix platform Non-amber Scala service paths release/v1.2 back porting to release/v1.2

Projects

None yet

Development

Successfully merging this pull request may close these issues.

LogicalLink: writeValueAsString / readValue is not round-trippable for OperatorIdentity fields

4 participants