
Add replay-aware logger to Amazon.Lambda.DurableExecution #2371

Open
GarrettBeatty wants to merge 1 commit into GarrettBeatty/stack/3 from GarrettBeatty/stack/4

Contributor

@GarrettBeatty GarrettBeatty commented May 14, 2026

#2216


What

Implements context.Logger, the replay-aware ILogger described in Docs/durable-execution-design.md and shipped by the Python / Java / JavaScript reference SDKs.

Public API surface introduced:

| Type | Purpose |
| --- | --- |
| IDurableContext.Logger | Replay-safe ILogger (was NullLogger.Instance). |
| IDurableContext.ConfigureLogger(LoggerConfig) | Swap the inner logger and/or disable replay-aware filtering. |
| LoggerConfig | CustomLogger + ModeAware configuration record. |

Why

Without replay-aware logging, every Console.WriteLine (or any non-suppressing logger) repeats on every replay invocation. A 30-step workflow re-invoked 30 times can produce up to 30 copies of its earliest log lines, which is noisy at best and misleading at worst. The reference SDKs all solve this by reading replay state on each log call and suppressing emission while the workflow is re-deriving prior operations from checkpoint state. This PR ports that behavior to .NET on top of the per-operation replay tracker introduced in #2360.

How

ReplayAwareLogger. An ILogger decorator that consults ExecutionState.IsReplaying on every call. Short-circuits both Log<TState> and IsEnabled during replay so LoggerExtensions.LogXxx doesn't even format the message string. BeginScope always passes through so the scope stack stays balanced — suppression only applies at log emission.
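As a rough illustration of the decorator described above (not the PR's actual code), the sketch below wraps an inner ILogger and checks a replay flag on every call. The names ReplaySuppressingLogger and ListLogger are hypothetical, the replay state is supplied as a plain Func<bool> rather than ExecutionState.IsReplaying, and it assumes Microsoft.Extensions.Logging.Abstractions 7.0 or later for the ILogger signatures used.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using Microsoft.Extensions.Logging;

// Usage: the same log call is dropped while replaying and emitted afterwards.
var replaying = true;
var sink = new ListLogger();
ILogger logger = new ReplaySuppressingLogger(sink, () => replaying, modeAware: true);

logger.LogInformation("step {Step} done", 1);   // suppressed: still replaying
replaying = false;
logger.LogInformation("step {Step} done", 2);   // emitted: new work
Console.WriteLine(string.Join(" | ", sink.Lines)); // prints: step 2 done

// Hypothetical decorator; the real ReplayAwareLogger reads ExecutionState.IsReplaying.
sealed class ReplaySuppressingLogger : ILogger
{
    private readonly ILogger _inner;
    private readonly Func<bool> _isReplaying;
    private readonly bool _modeAware;

    public ReplaySuppressingLogger(ILogger inner, Func<bool> isReplaying, bool modeAware)
    {
        _inner = inner;
        _isReplaying = isReplaying;
        _modeAware = modeAware;
    }

    // With modeAware == false the decorator never suppresses (debugging aid).
    private bool Suppressed => _modeAware && _isReplaying();

    // Callers that guard on IsEnabled can skip building arguments during replay.
    public bool IsEnabled(LogLevel logLevel) => !Suppressed && _inner.IsEnabled(logLevel);

    public void Log<TState>(LogLevel logLevel, EventId eventId, TState state,
        Exception? exception, Func<TState, Exception?, string> formatter)
    {
        if (Suppressed) return; // returns before the formatter runs, so no string is built
        _inner.Log(logLevel, eventId, state, exception, formatter);
    }

    // Scopes always pass through so push/pop stays balanced across the replay boundary.
    public IDisposable? BeginScope<TState>(TState state) where TState : notnull
        => _inner.BeginScope(state);
}

// Minimal in-memory logger used only to observe what gets through.
sealed class ListLogger : ILogger
{
    public List<string> Lines { get; } = new();
    public bool IsEnabled(LogLevel logLevel) => true;
    public void Log<TState>(LogLevel logLevel, EventId eventId, TState state,
        Exception? exception, Func<TState, Exception?, string> formatter)
        => Lines.Add(formatter(state, exception));
    public IDisposable? BeginScope<TState>(TState state) where TState : notnull => null;
}
```

Evaluating the flag per call, rather than once at construction, is what lets the same logger instance start emitting as soon as the workflow moves from replay to new work.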

Default inner logger. LambdaCoreLogger — a minimal in-package adapter that delegates to Amazon.Lambda.Core.LambdaLogger.Log, so logs flow into the standard Lambda runtime pipeline (JSON when AWS_LAMBDA_LOG_FORMAT=JSON, level-filtered by AWS_LAMBDA_LOG_LEVEL). Two structured-logging behaviors:

  • When state is the FormattedLogValues produced by LoggerExtensions.LogXxx, the original template and named-argument values are forwarded so the runtime's JSON formatter surfaces {OrderId}-style placeholders as top-level structured attributes.
  • BeginScope maintains an AsyncLocal chain of scope state. KVP-shaped scope state is appended to the outgoing template as named placeholders (inner→outer order, inner wins on key collision; explicit message args win over scope keys), so durableExecutionArn / operationId / etc. show up as top-level JSON fields without callers having to swap in a third-party logger.
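The precedence rules in the second bullet can be sketched with a small AsyncLocal-backed chain. This is only an illustration under stated assumptions: ScopeChain, Push, and Flatten are hypothetical names, and the real LambdaCoreLogger appends the merged keys to the message template rather than returning a dictionary.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Threading;

// Outer (execution-level) scope, then inner (step-level) scope.
using (ScopeChain.Push(new Dictionary<string, object?> { ["durableExecutionArn"] = "arn:example", ["attempt"] = 0 }))
using (ScopeChain.Push(new Dictionary<string, object?> { ["operationId"] = "op-1", ["attempt"] = 2 }))
{
    // An explicit message argument that collides with a scope key.
    var messageArgs = new Dictionary<string, object?> { ["operationId"] = "from-message" };
    var merged = ScopeChain.Flatten(messageArgs);

    Console.WriteLine(merged["operationId"]);          // message arg wins over scope
    Console.WriteLine(merged["attempt"]);              // inner scope wins over outer
    Console.WriteLine(merged["durableExecutionArn"]);  // outer-only key is kept
}

static class ScopeChain
{
    private sealed record Node(object State, Node? Parent);

    // AsyncLocal flows with the async call context, so concurrent tasks
    // each see their own chain.
    private static readonly AsyncLocal<Node?> Current = new();

    public static IDisposable Push(object state)
    {
        var previous = Current.Value;
        Current.Value = new Node(state, previous);
        return new Popper(previous);
    }

    public static Dictionary<string, object?> Flatten(IEnumerable<KeyValuePair<string, object?>> messageArgs)
    {
        var merged = new Dictionary<string, object?>(messageArgs);
        // Walk inner -> outer; TryAdd keeps the first value seen for a key.
        for (var node = Current.Value; node is not null; node = node.Parent)
        {
            if (node.State is not IEnumerable<KeyValuePair<string, object?>> pairs)
                continue; // non-KVP scope state is ignored
            foreach (var (key, value) in pairs)
                merged.TryAdd(key, value);
        }
        return merged;
    }

    private sealed class Popper : IDisposable
    {
        private readonly Node? _previous;
        public Popper(Node? previous) => _previous = previous;
        public void Dispose() => Current.Value = _previous; // restore the parent scope
    }
}
```

Seeding the result with the message arguments and then using TryAdd while walking inner to outer gives both precedence rules with a single pass over the chain.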

Mirrors the structured-logging pattern in Amazon.Lambda.Logging.AspNetCore.LambdaILogger. Avoids forcing a dependency on Amazon.Lambda.Logging.AspNetCore. Users who want Serilog/Powertools/etc. swap their own logger via ConfigureLogger.

Metadata scopes. DurableFunction.WrapAsyncCore opens a BeginScope around the workflow body carrying durableExecutionArn + awsRequestId. StepOperation opens a per-step scope (operationId, operationName, attempt) around the user-func invocation only. Combined with the scope-aware default logger above, every log line emitted by user code is automatically tagged with execution / step metadata.

Key files:

  • LoggerConfig.cs — public configuration type
  • Internal/ReplayAwareLogger.cs — the replay-aware decorator
  • Internal/LambdaCoreLogger.cs — default inner logger; preserves structured args + flattens scope chain
  • DurableContext.cs — replaces NullLogger default; implements ConfigureLogger
  • DurableFunction.cs — execution-level scope
  • Internal/StepOperation.cs — step-level scope around user func

Testing

Unit tests (21 new in Amazon.Lambda.DurableExecution.Tests):

  • ReplayAwareLoggerTests (7) — replay suppression, execution passthrough, ModeAware=false, IsEnabled short-circuit during replay, IsEnabled delegation during execution, BeginScope always-delegates, mid-workflow REPLAY → NEW transition (mirrors Python's test_logger_replay_then_new_logging).
  • DurableContextTests (3 new + 1 updated) — ConfigureLogger_NullArg_Throws, ConfigureLogger_WithCustomLogger_ReachesUserLogger, ConfigureLogger_ModeAwareFalse_LogsDuringReplay. The pre-existing Logger_Defaults_ToNullLogger is updated to Logger_Default_IsReplayAwareLogger to assert the new default.
  • LambdaCoreLoggerTests (11) — installs capture delegates into LambdaLogger._loggingWithLevelAction (the same hook RuntimeSupport uses) and asserts: named placeholders + arg values are forwarded intact, the exception variant works, plain messages pass through as literals, non-FormattedLogValues state falls back to formatter(state, exception), IsEnabled(None) returns false, KVP scopes are appended, nested scopes flatten inner→outer with inner winning on collision, explicit message args win over scope keys, scope is popped on dispose, AsyncLocal isolates concurrent tasks, and non-KVP scopes are ignored.

Integration test (ReplayAwareLoggerTest in Amazon.Lambda.DurableExecution.IntegrationTests):

End-to-end proof on real AWS infra. Deploys a step → wait(3s) → step workflow that pairs each context.Logger.LogInformation line with a Console.WriteLine "control" line; the test function runs with AWS_LAMBDA_LOG_FORMAT=JSON. After the durable execution completes (across two invocations driven by the wait), queries CloudWatch Logs and asserts:

  • Each replay-aware line appears exactly once across both invocations.
  • Each control line appears once per invocation that reached it (proving the function genuinely replayed).
  • Parsed JSON log records carry the expected scope-derived top-level fields: durableExecutionArn + awsRequestId on workflow-level lines; additionally operationId + operationName + attempt on lines emitted inside a step delegate.
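The first two assertions amount to counting marker occurrences across the collected log messages. A minimal sketch of that counting logic follows; the marker strings and the in-memory message list are made up for illustration, whereas the real test reads events from CloudWatch Logs.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical messages from two invocations of a step -> wait -> step workflow.
var messages = new List<string>
{
    "control: before step1", "replay-aware: before step1",  // invocation 1
    "control: before step1",                                  // invocation 2 (replay)
    "control: after wait", "replay-aware: after wait",       // invocation 2 (new work)
};

int Count(string marker) =>
    messages.Count(m => m.Contains(marker, StringComparison.Ordinal));

void Check(bool condition, string description)
{
    if (!condition) throw new InvalidOperationException("Assertion failed: " + description);
}

// Replay-aware lines: exactly once across both invocations.
Check(Count("replay-aware: before step1") == 1, "replay-aware line before step1 appears once");
Check(Count("replay-aware: after wait") == 1, "replay-aware line after wait appears once");

// Control lines: once per invocation that reached them.
Check(Count("control: before step1") == 2, "control line before step1 appears in both invocations");
Check(Count("control: after wait") == 1, "control line after wait appears only in invocation 2");

Console.WriteLine("all checks passed");
```

The control count of 2 is what shows the function really replayed; without it, a count of 1 for the replay-aware line would prove nothing.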

This pins both the replay-suppression contract and the structured-scope contract end-to-end against the actual durable-execution service.

Out of scope (follow-up PRs)

  • MapAsync / ParallelAsync / RunInChildContextAsync / WaitForConditionAsync
  • CallbackAsync, InvokeAsync
  • DefaultJsonCheckpointSerializer
  • Annotations source-generator integration / [DurableExecution] attribute
  • DurableTestRunner / Amazon.Lambda.DurableExecution.Testing package
  • dotnet new lambda.DurableFunction blueprint


@GarrettBeatty GarrettBeatty force-pushed the GarrettBeatty/stack/3 branch from 7ca2099 to 5a29b3e Compare May 17, 2026 20:16
@GarrettBeatty GarrettBeatty force-pushed the GarrettBeatty/stack/4 branch 2 times, most recently from 0ad914a to 714d2d6 Compare May 18, 2026 01:23
@GarrettBeatty GarrettBeatty requested a review from Copilot May 18, 2026 01:24

Copilot AI left a comment


Pull request overview

Adds a replay-aware ILogger implementation to Amazon.Lambda.DurableExecution so workflow logs don’t duplicate during replay, and exposes a small configuration surface for swapping the underlying logger and toggling replay filtering.

Changes:

  • Introduces ReplayAwareLogger + LambdaCoreLogger, wires IDurableContext.Logger to be replay-aware by default, and adds IDurableContext.ConfigureLogger(LoggerConfig).
  • Adds execution- and step-level BeginScope metadata for structured loggers.
  • Adds unit tests and a CloudWatch-based integration test to validate replay suppression end-to-end.

Reviewed changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| Libraries/src/Amazon.Lambda.DurableExecution/IDurableContext.cs | Updates public context API with replay-safe Logger docs and new ConfigureLogger(LoggerConfig) method. |
| Libraries/src/Amazon.Lambda.DurableExecution/LoggerConfig.cs | Adds public configuration type for swapping inner logger and toggling replay-aware suppression. |
| Libraries/src/Amazon.Lambda.DurableExecution/Internal/ReplayAwareLogger.cs | Adds replay-suppressing ILogger decorator driven by ExecutionState.IsReplaying. |
| Libraries/src/Amazon.Lambda.DurableExecution/Internal/LambdaCoreLogger.cs | Adds default in-package logger adapter that routes to Amazon.Lambda.Core.LambdaLogger. |
| Libraries/src/Amazon.Lambda.DurableExecution/DurableContext.cs | Defaults Logger to replay-aware logger and implements ConfigureLogger. |
| Libraries/src/Amazon.Lambda.DurableExecution/DurableFunction.cs | Adds execution-level logging scope for structured metadata. |
| Libraries/src/Amazon.Lambda.DurableExecution/Internal/StepOperation.cs | Adds step-level logging scope for operation metadata around user step invocation. |
| Libraries/test/Amazon.Lambda.DurableExecution.Tests/Internal/ReplayAwareLoggerTests.cs | Adds unit tests for replay suppression, scope passthrough, and mode transitions. |
| Libraries/test/Amazon.Lambda.DurableExecution.Tests/DurableContextTests.cs | Adds unit tests for default logger type and ConfigureLogger behavior. |
| Libraries/test/Amazon.Lambda.DurableExecution.IntegrationTests/Amazon.Lambda.DurableExecution.IntegrationTests.csproj | Adds CloudWatch Logs SDK dependency for log verification. |
| Libraries/test/Amazon.Lambda.DurableExecution.IntegrationTests/ReplayAwareLoggerTest.cs | Adds CloudWatch-based integration test validating replay suppression vs Console control lines. |
| Libraries/test/Amazon.Lambda.DurableExecution.IntegrationTests/TestFunctions/ReplayAwareLoggerFunction/ReplayAwareLoggerFunction.csproj | Adds new integration-test Lambda function project. |
| Libraries/test/Amazon.Lambda.DurableExecution.IntegrationTests/TestFunctions/ReplayAwareLoggerFunction/Function.cs | Implements Step→Wait→Step workflow emitting replay-aware and control log markers. |
| Libraries/test/Amazon.Lambda.DurableExecution.IntegrationTests/TestFunctions/ReplayAwareLoggerFunction/Dockerfile | Adds container packaging for the new integration-test function. |


Comment thread Libraries/src/Amazon.Lambda.DurableExecution/Internal/LambdaCoreLogger.cs Outdated
Comment thread Libraries/src/Amazon.Lambda.DurableExecution/DurableFunction.cs
@GarrettBeatty GarrettBeatty force-pushed the GarrettBeatty/stack/3 branch 6 times, most recently from 8e29ff3 to 097e5a1 Compare May 20, 2026 15:37
Implement context.Logger, the replay-aware ILogger described in
Docs/durable-execution-design.md and shipped by the Python / Java / JS
reference SDKs. Messages emitted while the workflow is replaying prior
operations are suppressed, so a 30-step workflow re-invoked 30 times
emits each LogInformation line once instead of 30 times.

Public API:
- IDurableContext.Logger — was NullLogger.Instance, now a replay-safe
  ILogger backed by Amazon.Lambda.Core.LambdaLogger so logs flow into
  the standard runtime pipeline (JSON when AWS_LAMBDA_LOG_FORMAT=JSON,
  level-filtered by AWS_LAMBDA_LOG_LEVEL).
- IDurableContext.ConfigureLogger(LoggerConfig) — swap the inner
  ILogger (Serilog, Powertools, etc.) and/or disable replay-aware
  filtering (ModeAware = false) for debugging. Matches the API shape
  documented in the design doc.

Internals:
- ReplayAwareLogger — ILogger decorator that consults
  ExecutionState.IsReplaying on every Log call. Short-circuits both
  Log<TState> and IsEnabled during replay so LoggerExtensions.LogXxx
  doesn't even format the string. BeginScope always passes through so
  the scope stack stays balanced.
- LambdaCoreLogger — minimal in-package adapter that delegates to
  Amazon.Lambda.Core.LambdaLogger.Log. Avoids forcing a dependency on
  Amazon.Lambda.Logging.AspNetCore.
- DurableFunction.WrapAsyncCore opens a BeginScope around the workflow
  body carrying durableExecutionArn + awsRequestId. StepOperation
  opens a per-step scope (operationId, operationName, attempt) around
  the user-func invocation only. Structured log providers (the
  runtime's JSON formatter, Serilog, etc.) tag every log line emitted
  by user code with that metadata automatically.

Tests:
- ReplayAwareLoggerTests — 7 unit tests: replay suppression, execution
  passthrough, ModeAware=false, IsEnabled short-circuit, scope
  passthrough, mid-workflow REPLAY→NEW transition (mirrors Python's
  test_logger_replay_then_new_logging).
- DurableContextTests — coverage for the default logger, ConfigureLogger
  with a custom logger, and ConfigureLogger { ModeAware = false }
  enabling logs during replay.
- ReplayAwareLoggerTest (integration) — deploys a Step → Wait → Step
  workflow that pairs each context.Logger.LogInformation line with a
  Console.WriteLine "control" line. After the durable execution
  completes, queries CloudWatch Logs and asserts each replay-aware
  line appears exactly once across both invocations while each control
  line appears once per invocation, proving the suppression works
  end-to-end.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Forward template + args from LambdaCoreLogger

When state is FormattedLogValues, extract {OriginalFormat} and pass the
original template + named-argument values through to LambdaLogger.Log
instead of pre-rendering. Mirrors the pattern in
Amazon.Lambda.Logging.AspNetCore.LambdaILogger so the runtime's JSON
formatter can surface {OrderId}-style placeholders as top-level
structured attributes.

Make LambdaCoreLogger scope-aware

BeginScope now maintains an AsyncLocal chain of scope state. On Log,
KVP-shaped scope state is appended to the template as named placeholders
(inner→outer order, inner wins on key collision; explicit message args
win over scope keys). The runtime's JSON formatter promotes the keys
to top-level fields, so durableExecutionArn / operationId / etc. show
up as structured attributes without callers having to swap in a
third-party logger.

Unit tests cover ordering, nested scopes, message-arg precedence,
AsyncLocal isolation, and non-KVP fallback. The integration test now
sets AWS_LAMBDA_LOG_FORMAT=JSON, adds a step-internal log line, and
asserts the scope-derived fields land on the parsed JSON record.
@GarrettBeatty GarrettBeatty force-pushed the GarrettBeatty/stack/4 branch from 1483884 to 61f37ec Compare May 20, 2026 15:50
@GarrettBeatty GarrettBeatty requested a review from Copilot May 20, 2026 15:51
@GarrettBeatty GarrettBeatty added the Release Not Needed Add this label if a PR does not need to be released. label May 20, 2026

Copilot AI left a comment


Pull request overview

Copilot reviewed 15 out of 15 changed files in this pull request and generated 1 comment.

@GarrettBeatty GarrettBeatty marked this pull request as ready for review May 20, 2026 16:02
@GarrettBeatty GarrettBeatty requested review from a team as code owners May 20, 2026 16:02
@GarrettBeatty GarrettBeatty requested review from normj and philasmar and removed request for a team May 20, 2026 16:02
@philasmar
Contributor

Important

1. IsEnabled returns true for every level except None, regardless of runtime log level (LambdaCoreLogger.cs:49)

  public bool IsEnabled(LogLevel logLevel) => logLevel != LogLevel.None;

The Lambda runtime filters based on AWS_LAMBDA_LOG_LEVEL, but IsEnabled doesn't reflect that. This means LoggerExtensions.LogXxx will still format the message string (allocating the FormattedLogValues, running string interpolation) even when the runtime will discard it. For high-frequency logging in step bodies, this could cause unnecessary GC pressure.

Mitigation option: parse AWS_LAMBDA_LOG_LEVEL / DOTNET_LAMBDA_LOG_LEVEL once at construction and cache it. However, the current behavior matches the pattern in Amazon.Lambda.Logging.AspNetCore.LambdaILogger (which also defers filtering to the runtime), so it's defensible as-is for V1.

2. Template mutation via scope appending (LambdaCoreLogger.cs:100-135)

The scope flattening approach appends {key} placeholders to the message template string. This works well with the Lambda runtime's JSON formatter, but it's worth noting:

  • The scope-augmented template is only produced when LambdaCoreLogger is the inner logger. If a user swaps in a custom logger via ConfigureLogger, their logger gets the original template via ReplayAwareLogger delegation. This is fine by construction.
  • The StringBuilder allocation per scoped log call is acceptable given Lambda's per-invocation lifecycle, but worth noting for future perf work.

3. Thread-safety of _logger field in DurableContext (DurableContext.cs:19, 44)

  private ILogger _logger;
  // ...
  public ILogger Logger => _logger;
  public void ConfigureLogger(LoggerConfig config) { ... _logger = new ReplayAwareLogger(...); }

The _logger field is read via Logger and written via ConfigureLogger without synchronization. In practice this is safe because durable workflows are single-threaded (the async state machine runs on one thread at a time), but if ConfigureLogger is called from one task while another reads Logger, there's a theoretical stale-read (visibility) concern; reference writes are atomic in .NET, so the value itself cannot be torn. A volatile annotation would make the intent explicit, though it's not strictly necessary given the execution model.
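For reference, the mitigation mentioned under item 1 (parse the level once and cache it) might look roughly like the sketch below. CachedLevelFilter is a hypothetical name, and both the level-name mapping (TRACE, DEBUG, INFO, WARN, ERROR, FATAL) and the fallback to Information when the variable is unset or unrecognized are assumptions here, not behavior confirmed by this PR.

```csharp
#nullable enable
using System;
using Microsoft.Extensions.Logging;

// Read the environment variable once; every later IsEnabled call is a cheap comparison.
var filter = new CachedLevelFilter(Environment.GetEnvironmentVariable("AWS_LAMBDA_LOG_LEVEL"));
Console.WriteLine($"Debug enabled: {filter.IsEnabled(LogLevel.Debug)}");
Console.WriteLine($"Error enabled: {filter.IsEnabled(LogLevel.Error)}");

// Hypothetical helper; not part of Amazon.Lambda.Core or this PR.
sealed class CachedLevelFilter
{
    private readonly LogLevel _minimum;

    public CachedLevelFilter(string? configuredLevel) => _minimum = Parse(configuredLevel);

    public bool IsEnabled(LogLevel level) => level != LogLevel.None && level >= _minimum;

    // Assumed mapping from Lambda level names to Microsoft.Extensions.Logging levels.
    private static LogLevel Parse(string? value) => value?.Trim().ToUpperInvariant() switch
    {
        "TRACE" => LogLevel.Trace,
        "DEBUG" => LogLevel.Debug,
        "INFO" => LogLevel.Information,
        "WARN" => LogLevel.Warning,
        "ERROR" => LogLevel.Error,
        "FATAL" => LogLevel.Critical,
        _ => LogLevel.Information, // assumed default when unset or unrecognized
    };
}
```

The trade-off is that the logger then duplicates a decision the runtime already makes, which is the reason given above for deferring it past V1.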


Questions

  1. Should ConfigureLogger be callable after the workflow has started executing steps? Currently nothing prevents it, but calling it mid-workflow could change suppression
    behavior for already-in-progress replay transitions. Is this intentional flexibility or should there be a "frozen after first step" guard?
  2. The LambdaCoreLogger tests use reflection to swap LambdaLogger._loggingWithLevelAction. If that field name changes in a future Amazon.Lambda.Core release, these tests
    silently break (null reference). Is there a way to make this coupling more explicit (e.g., an internal-visible-to or a compile-time check)?

@GarrettBeatty
Contributor Author

On the IsEnabled / log-level comment: the runtime will still determine internally whether to log it or not.

Template mutation via scope appending (LambdaCoreLogger.cs:100-135)
I don't think there is any concern here?

Thread-safety of _logger field in DurableContext (DurableContext.cs:19, 44)
Same here, I don't really think this is possible?

Should ConfigureLogger be callable after the workflow has started executing steps? Currently nothing prevents it, but calling it mid-workflow could change suppression behavior for already-in-progress replay transitions. Is this intentional flexibility or should there be a "frozen after first step" guard?

I also don't think this is a big deal?

The LambdaCoreLogger tests use reflection to swap LambdaLogger._loggingWithLevelAction. If that field name changes in a future Amazon.Lambda.Core release, these tests silently break (null reference). Is there a way to make this coupling more explicit (e.g., an internal-visible-to or a compile-time check)?

I don't think this is worth changing.
