New serverless pattern - lambda-durable-execution-java-cdk #3074
… Java SDK pattern

Deploy a resilient multi-step order processing workflow using the AWS Lambda Durable Execution SDK for Java (v1.0.1) with automatic checkpointing and failure recovery.

Key features:
- DurableHandler<Map, Map> base class with DurableContext
- 5-step workflow: validate, reserve, pay, wait, ship
- ctx.step() for checkpointed operations
- ctx.wait() for zero-compute-cost suspension
- Docker-based Java 17 Lambda with Maven build
- CDK TypeScript infrastructure with DurableConfig escape hatch
Replace inline durable execution policy (wildcard resources) with the AWS managed policy for least-privilege IAM, matching the approach recommended in PR aws-samples#3053 review feedback.
Hi @biswanathmukherjee 👋 Friendly nudge: this pattern is ready for review. Deployed and tested end-to-end on a live AWS account. Would appreciate a look when you have time. Thank you!
Hi @biswanathmukherjee 👋 This is the first Java-based durable execution pattern, with a completely different SDK surface (DurableHandler<I,O>, DurableContext) from the Node.js version. Enterprise Java customers need this dedicated reference. Deployed and tested.
Hi @bfreiberg 👋 Friendly nudge on this pattern. It's been deployed and tested end-to-end on a live AWS account. Happy to address any feedback. Thank you!
@Override
public Map<String, Object> handleRequest(Map<String, Object> input, DurableContext ctx) {
    String orderId = (String) input.getOrDefault("orderId", UUID.randomUUID().toString());
Non-deterministic UUID outside any durable operation.
UUID.randomUUID() produces a different value on every replay. On a replay after a mid-workflow interruption, the rebuilt orderId won't match what completed steps were keyed against, and any new branch that reads orderId from this line will see a different value than the original invocation. AWS docs flag UUID generation specifically as code that must be wrapped in a step.
Fixed — wrapped in ctx.step("generate-order-id", String.class, stepCtx -> UUID.randomUUID().toString()) so the value is checkpointed and deterministic on replay. If orderId is provided in input, it's used directly without a step.
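For readers following the thread, here is a minimal sketch of why the checkpointed version is stable on replay. It does not use the real SDK: `MiniContext` is a hypothetical in-memory stand-in whose `step` takes a `Supplier` rather than the SDK's `stepCtx` lambda, and `safeOrderId` / `unsafeOrderId` are illustrative names, not code from this PR. It only assumes that a step's result is stored under its name and returned as-is on a later replay.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;
import java.util.function.Supplier;

class ReplaySketch {

    /** Hypothetical stand-in for a durable context; NOT the real SDK class. */
    static final class MiniContext {
        private final Map<String, Object> checkpoints;

        MiniContext(Map<String, Object> checkpoints) {
            this.checkpoints = checkpoints;
        }

        /** Runs the body once per step name; later calls return the stored result. */
        <T> T step(String name, Class<T> type, Supplier<T> body) {
            if (checkpoints.containsKey(name)) {
                return type.cast(checkpoints.get(name)); // replay: reuse the checkpoint
            }
            T result = body.get();
            checkpoints.put(name, result);
            return result;
        }
    }

    /** Unsafe: when orderId is missing, a fresh UUID is produced on every invocation. */
    static String unsafeOrderId(Map<String, Object> input) {
        return (String) input.getOrDefault("orderId", UUID.randomUUID().toString());
    }

    /** Safe: a generated orderId is checkpointed; a supplied one is used directly. */
    static String safeOrderId(Map<String, Object> input, MiniContext ctx) {
        Object provided = input.get("orderId");
        if (provided != null) {
            return (String) provided;
        }
        return ctx.step("generate-order-id", String.class,
                () -> UUID.randomUUID().toString());
    }

    public static void main(String[] args) {
        Map<String, Object> input = new HashMap<>();       // no orderId supplied
        Map<String, Object> checkpoints = new HashMap<>(); // state that survives a replay

        String first = safeOrderId(input, new MiniContext(checkpoints));
        String replayed = safeOrderId(input, new MiniContext(checkpoints));
        System.out.println("safe ids match:   " + first.equals(replayed));

        System.out.println("unsafe ids match: "
                + unsafeOrderId(input).equals(unsafeOrderId(input)));
    }
}
```

The second `safeOrderId` call simulates a replay against the same checkpoint state, so both calls return the same ID, whereas the unsafe variant generates a new UUID each time it runs.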
// Step 1: Validate order
String validation = ctx.step("validate-order", String.class, stepCtx -> {
    System.out.println("Validating order " + orderId);
The Java SDK's StepContext exposes getLogger() which returns a DurableLogger enriched with execution-context metadata (SDK Reference → Step → StepContext; Logging). System.out.println works (stdout reaches CloudWatch) but loses retry-attempt counters, replay flags, and the correlation fields the SDK adds.
Fixed — replaced all System.out.println with stepCtx.getLogger().info() to get the DurableLogger metadata (retry-attempt counters, replay flags, correlation fields).
Thank you @parikhudit, both excellent catches! Fixed in commit 1c353ae0:
Redeployed and tested: durable execution completes successfully with both fixes. Pushing shortly.
```bash
npm install
```
5. Deploy the stack:
Add a one-line note for first-time CDK users:
e.g. If this is the first time you deploy a CDK stack in this account/region, run cdk bootstrap before cdk deploy.
Added: > **Note:** If this is the first time you deploy a CDK stack in this account/region, run `cdk bootstrap` before `cdk deploy`.
- Wrap UUID.randomUUID() in ctx.step() to ensure stable orderId across replays after mid-workflow interruption
- Replace System.out.println with stepCtx.getLogger().info() for DurableLogger with retry-attempt counters, replay flags, and correlation metadata

Addresses review feedback from parikhudit.
Also added the
@@ -0,0 +1,5 @@
FROM maven:3.9-amazoncorretto-17 AS build
Unused Dockerfile, purpose unclear.
The CDK stack uses lambda.Code.fromAsset() and the README's Deployment step 3 runs mvn clean package -q directly on the host, so this Dockerfile is never invoked anywhere in the documented deploy flow.
- If it's intended as a build-helper convenience (so contributors who don't have Maven installed locally can produce the JAR with docker build src/), please:
  - Make the JAR extractable by adding a final stage like FROM scratch and COPY --from=build /app/target/*.jar / so the artifact can be pulled out cleanly with docker create + docker cp. Without that, the only way to retrieve the JAR is to dig into a stopped build container, which is awkward.
  - Mention it in the README so readers know the path exists, e.g.:
    Alternative build (no Maven on host required, verify before using):
    docker build -t durable-builder src/
    docker create --name x durable-builder
    docker cp x:/app/target/. src/target/
    docker rm x
- If it's not actually used by anyone, please remove src/Dockerfile so it doesn't confuse readers; leaving an unreferenced Dockerfile suggests Docker is part of the deploy story when it isn't.
Either way, dropping the unused file or documenting it as a build helper is fine; what's important is that the README and the file agree on the intent.
You're right, it's a leftover from an earlier Docker-based approach. Removed it — the deploy flow is just mvn package on host + Code.fromAsset() pointing at the JAR. Cleaner this way.
The CDK stack uses Code.fromAsset() with the pre-built JAR and the README instructs mvn clean package on the host. The Dockerfile was never part of the deploy flow.
iam.ManagedPolicy.fromAwsManagedPolicyName(
  "service-role/AWSLambdaBasicDurableExecutionRolePolicy"
)
);
Native durableConfig property + L2 Alias is cleaner than the L1 escape hatch.
The official CDK example (Deploy Lambda durable functions with IaC → AWS CDK) uses the native durableConfig property on lambda.Function and a regular lambda.Alias against fn.currentVersion:
const fn = new lambda.Function(this, 'DurableOrderProcessorFn', {
  ...,
  durableConfig: {
    executionTimeout: cdk.Duration.hours(1),
    retentionPeriod: cdk.Duration.days(7),
  },
});
const alias = new lambda.Alias(this, 'ProdAlias', {
  aliasName: 'prod',
  version: fn.currentVersion,
});
Two possible issues with the current escape-hatch approach:
- Mixing L2 Function with L1 CfnVersion/CfnAlias means fn.currentVersion won't reflect the published version, which could be an issue for any future code that reads it.
- The comment "to avoid CDK version property validation" suggests the targeted CDK version may have lacked the native property. If so, please add a comment with the minimum CDK version where the native property becomes available so a future contributor can clean this up; if the native property already exists in aws-cdk-lib@2.180.0 or latest, please switch to it.
Great call. Upgraded to aws-cdk-lib@2.257.0 which has the native durableConfig property -- switched to that plus L2 Alias with fn.currentVersion. Much cleaner, no more escape hatches. The original approach was because 2.180.0 didn't have it yet.
Switched from L1 escape hatch (CfnFunction.addOverride + CfnVersion + CfnAlias) to native durableConfig property on lambda.Function with L2 lambda.Alias against fn.currentVersion. Requires aws-cdk-lib@2.257.0. Cleaner, type-safe, and consistent with official CDK docs.
New Serverless Pattern: Lambda Durable Execution with Java SDK
Description
Deploys a Lambda durable function written in Java that orchestrates a multi-step order processing workflow with automatic checkpointing and failure recovery using the Durable Execution SDK for Java (v1.0.1, GA April 2026).
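As a rough illustration of what "automatic checkpointing and failure recovery" means for this workflow, the sketch below simulates it in plain Java. It is not the SDK: `step`, `CHECKPOINTS`, and `EXECUTED` are hypothetical stand-ins, every step name other than `validate-order` is invented for the example, and the wait between payment and shipping is omitted. The only assumption is that a completed step's result is stored and reused when the handler runs again after a failure.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

class WorkflowSketch {

    /** Hypothetical in-memory checkpoint store; a real service would persist this. */
    static final Map<String, String> CHECKPOINTS = new HashMap<>();
    /** Records which step bodies actually ran, purely for illustration. */
    static final List<String> EXECUTED = new ArrayList<>();

    /** Returns the stored result if the step already completed, else runs and stores it. */
    static String step(String name, Supplier<String> body) {
        String saved = CHECKPOINTS.get(name);
        if (saved != null) {
            return saved;             // completed earlier: skip the body
        }
        EXECUTED.add(name);
        String result = body.get();   // if this throws, nothing is checkpointed
        CHECKPOINTS.put(name, result);
        return result;
    }

    static String processOrder(String orderId, boolean paymentFails) {
        step("validate-order", () -> "valid:" + orderId);
        step("reserve-inventory", () -> "reserved:" + orderId);
        step("process-payment", () -> {
            if (paymentFails) {
                throw new IllegalStateException("payment gateway timeout");
            }
            return "paid:" + orderId;
        });
        // The pattern suspends here with ctx.wait(); this sketch skips the wait.
        return step("ship-order", () -> "shipped:" + orderId);
    }

    public static void main(String[] args) {
        try {
            processOrder("order-1", true);   // first attempt fails during payment
        } catch (IllegalStateException e) {
            System.out.println("first attempt failed: " + e.getMessage());
        }
        System.out.println(processOrder("order-1", false)); // retry resumes at payment
        System.out.println("step bodies executed: " + EXECUTED);
    }
}
```

On the retry, the validate and reserve bodies are not executed again because their results are already stored; only the payment step is re-run, followed by shipping.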
Architecture
Key Features
- DurableHandler<Map, Map> base class with DurableContext
- ctx.step() for checkpointed operations, ctx.wait() for zero-cost suspension
- DurableConfig escape hatch
Framework / Language
Deployment & Testing
Files
- lib/lambda-durable-execution-java-stack.ts
- src/main/java/com/example/OrderProcessor.java
- src/pom.xml
- src/Dockerfile
- example-pattern.json