fix(workflow-compiling-service): LogicalLink OperatorIdentity round-trip#5729
fix(workflow-compiling-service): LogicalLink OperatorIdentity round-trip#5729wwong0 wants to merge 4 commits into
Conversation
…ound-trip
Prior to this change, `@JsonCreator` was placed on the `String`
convenience constructor in both `amber` and `workflow-compiling-service`
`LogicalLink`. Jackson serialises `OperatorIdentity` as an object
(`{"id":"op-A"}`), so any round-trip through `writeValueAsString` /
`readValue` failed with `MismatchedInputException` because the String
constructor cannot accept an object node.
Fix: add a private `readOperatorIdentity(JsonNode, String)` helper to
the companion object and move `@JsonCreator` to a new `JsonNode`
constructor that delegates to it. The helper handles the string shape
(front-end input), the object shape (serialised form), null/absent
fields, and rejects non-text / non-object nodes with
`IllegalArgumentException`.
The new `workflow-compiling-service` `LogicalLinkSpec` adds direct unit
coverage of the `readOperatorIdentity` branches that the existing
HTTP-layer-only `WorkflowCompilationResourceSpec` did not reach. It also
pins the leniency contract (no `require` guards) and the Jackson
annotation contract (`@JsonProperty` key pinning, round-trip fidelity).
The `amber` `LogicalLinkSpec` is updated to match the renamed constructor
section, drop the now-invalid MismatchedInputException expectation, and add
the round-trip test.
Closes apache#5042
|
👋 Thanks for your first contribution to Texera, @wwong0! If you're looking for a good place to start, browse issues labeled You can drive common housekeeping yourself by commenting one of these commands on its own line:
Each command must match exactly: |
There was a problem hiding this comment.
Pull request overview
This PR fixes Jackson (de)serialization for LogicalLink so OperatorIdentity fields can round-trip through objectMapper.writeValueAsString → readValue in both the Amber engine and workflow-compiling-service, while preserving each module’s strict vs. lenient validation behavior.
Changes:
- Move
@JsonCreatoroff the String convenience constructor and introduce aJsonNode-based creator that can read both string-shaped and object-shaped operator ids. - Add a
readOperatorIdentity(node, fieldName)helper to parsefromOpId/toOpIdfrom either JSON shape and centralize error handling. - Add/adjust unit tests to cover the round-trip behavior and module-specific strictness/leniency contracts.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| workflow-compiling-service/src/main/scala/org/apache/texera/amber/compiler/model/LogicalLink.scala | Adds JsonNode @JsonCreator + helper to support both string/object OperatorIdentity JSON shapes. |
| workflow-compiling-service/src/test/scala/org/apache/texera/amber/compiler/model/LogicalLinkSpec.scala | New spec covering leniency contract and Jackson round-trip behavior for compiler-service LogicalLink. |
| amber/src/main/scala/org/apache/texera/workflow/LogicalLink.scala | Mirrors the JsonNode @JsonCreator + helper in the Amber engine LogicalLink (while keeping require guards). |
| amber/src/test/scala/org/apache/texera/workflow/LogicalLinkSpec.scala | Updates characterization tests to assert successful round-trip after the fix. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #5729 +/- ##
============================================
+ Coverage 53.19% 53.80% +0.61%
- Complexity 2655 3003 +348
============================================
Files 1095 1112 +17
Lines 42398 43858 +1460
Branches 4560 4693 +133
============================================
+ Hits 22552 23599 +1047
- Misses 18515 18913 +398
- Partials 1331 1346 +15
*This pull request uses carry forward flags. Click here to find out more. ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
|
| config | throughput | MB/s | latency | max Δ latest / 7d | |
|---|---|---|---|---|---|
| 🔴 | bs=10 sw=10 sl=64 | 371 | 0.226 | 25,624/35,726/35,726 us | 🔴 -13.1% / 🔴 +138.8% |
| 🔴 | bs=100 sw=10 sl=64 | 800 | 0.489 | 122,601/146,620/146,620 us | 🔴 +5.3% / 🔴 +46.3% |
| ⚪ | bs=1000 sw=10 sl=64 | 918 | 0.56 | 1,085,544/1,147,347/1,147,347 us | ⚪ within ±5% / 🔴 +21.4% |
Baseline details
Latest main 8a90f1f from same runner
| config | metric | PR | latest main | 7d avg | Δ latest | Δ 7d |
|---|---|---|---|---|---|---|
| bs=10 sw=10 sl=64 | throughput | 371 tuples/sec | 426 tuples/sec | 799.7 tuples/sec | -12.9% | -53.6% |
| bs=10 sw=10 sl=64 | MB/s | 0.226 MB/s | 0.26 MB/s | 0.488 MB/s | -13.1% | -53.7% |
| bs=10 sw=10 sl=64 | p50 | 25,624 us | 24,382 us | 12,125 us | +5.1% | +111.3% |
| bs=10 sw=10 sl=64 | p95 | 35,726 us | 31,920 us | 14,959 us | +11.9% | +138.8% |
| bs=10 sw=10 sl=64 | p99 | 35,726 us | 31,920 us | 18,503 us | +11.9% | +93.1% |
| bs=100 sw=10 sl=64 | throughput | 800 tuples/sec | 818 tuples/sec | 1,069 tuples/sec | -2.2% | -25.1% |
| bs=100 sw=10 sl=64 | MB/s | 0.489 MB/s | 0.499 MB/s | 0.652 MB/s | -2.0% | -25.0% |
| bs=100 sw=10 sl=64 | p50 | 122,601 us | 122,101 us | 92,709 us | +0.4% | +32.2% |
| bs=100 sw=10 sl=64 | p95 | 146,620 us | 139,194 us | 100,227 us | +5.3% | +46.3% |
| bs=100 sw=10 sl=64 | p99 | 146,620 us | 139,194 us | 109,877 us | +5.3% | +33.4% |
| bs=1000 sw=10 sl=64 | throughput | 918 tuples/sec | 922 tuples/sec | 1,114 tuples/sec | -0.4% | -17.6% |
| bs=1000 sw=10 sl=64 | MB/s | 0.56 MB/s | 0.563 MB/s | 0.68 MB/s | -0.5% | -17.6% |
| bs=1000 sw=10 sl=64 | p50 | 1,085,544 us | 1,079,210 us | 899,197 us | +0.6% | +20.7% |
| bs=1000 sw=10 sl=64 | p95 | 1,147,347 us | 1,182,644 us | 945,162 us | -3.0% | +21.4% |
| bs=1000 sw=10 sl=64 | p99 | 1,147,347 us | 1,182,644 us | 986,231 us | -3.0% | +16.3% |
Raw CSV
config_idx,batch_size,schema_width,string_len,num_batches,total_ms,total_tuples,total_bytes,tuples_per_sec,mb_per_sec,lat_p50_us,lat_p95_us,lat_p99_us
0,10,10,64,20,539.38,200,128000,371,0.226,25623.92,35725.97,35725.97
1,100,10,64,20,2498.78,2000,1280000,800,0.489,122600.61,146620.27,146620.27
2,1000,10,64,20,21788.16,20000,12800000,918,0.560,1085544.36,1147347.11,1147347.11…d in LogicalLink
Address review feedback on the LogicalLink `@JsonCreator` fix: the object-shape branch of `readOperatorIdentity` previously coerced a non-textual `id` (e.g. `{"id": 123}`) via `asText()`, silently turning it into `"123"`—inconsistent with the helper's contract of rejecting malformed nodes, and with the existing rejection of a non-text/non-object top-level node. Require `id` to be textual (or null/absent); otherwise throw `IllegalArgumentException`. Applied to both the amber engine and the workflow-compiling-service `LogicalLink` (identical helpers).
Extend both `LogicalLinkSpec` suites with branch-coverage tests for `readOperatorIdentity`: object-shape `id` missing, explicit null, numeric/non-textual `id`, and a top-level explicit-null op id. These pin the new rejection behavior.
…k-operatoridentity-round-trip
|
Hi @wwong0, thanks for the PR. overall it looks good. can you please make coverage higher, before I review details? you can find coverage report here #5729 (comment). thanks. |
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> Signed-off-by: William Wong <120128836+wwong0@users.noreply.github.com>
|
Hi @Yicong-Huang the CodeCov report updated after your comment, it now shows higher coverage. |
Automated Reviewer SuggestionsBased on the
|
What changes were proposed in this PR?
Bug fix:
LogicalLink@JsonCreatorconstructor (amberandworkflow-compiling-service)@JsonCreatorwas previously placed on theStringconvenience constructor ofLogicalLinkin both modules.OperatorIdentityis a case class that Jackson serializes as an object ({"id":"op-A"}), not a plain string. When reading back a serializedLogicalLink, Jackson dispatched to the@JsonCreatorString constructor but could not coerce the{"id":"op-A"}object node to aString, causing anywriteValueAsString→readValueround-trip to fail withMismatchedInputException.The fix introduces a private
readOperatorIdentity(node: JsonNode, fieldName: String)helper in the companion object and moves@JsonCreatorto a newJsonNodeconstructor that delegates to it. The helper accepts both the plain-string shape (front-end input) and the object shape (serialized form), maps null/absent ids toOperatorIdentity(null), and rejects malformed nodes. TheStringconvenience constructor is retained but no longer carries@JsonCreator.Any related issues, documentation, discussions?
Closes #5042
This PR continues and adopts work from #5175
How was this PR tested?
The new
workflow-compiling-serviceLogicalLinkSpecadds unit coverage ofLogicalLinkandreadOperatorIdentitymodel-level logic that existing tests do not cover and pins the leniency contract (norequireguards in the compiler-service variant).The existing
amberLogicalLinkSpecwas updated to match the renamed constructor section, drop the now-invalidMismatchedInputExceptionexpectation, and add a passing round-trip test.Run with:
Result: 18/18 tests pass
Result: 15/15 tests pass
Was this PR authored or co-authored using generative AI tooling?
Generated-by: Claude Sonnet 4.6