fix: reset metrics registry per benchmark cycle #48695
Merged
xinlian12 merged 1 commit into Azure:main on Apr 6, 2026
Pull request overview
Fixes multi-cycle benchmark runs in azure-cosmos-benchmark where closing a cycle’s CosmosClient would clear()/close() the shared Micrometer registry (via ClientTelemetryMetrics.remove(...)), preventing subsequent cycles from emitting CSV metrics.
Changes:
- Moves metrics registry + destination reporter creation into the per-cycle lifecycle loop to ensure each cycle uses a fresh, disposable registry.
- Keeps console logging (LoggingMeterRegistry) shared across cycles, and detaches it before client shutdown so the SDK does not clear/close it.
- Stops (flushes) CSV/CosmosDB reporters before client shutdown so final interval data is written before the registry is destroyed.
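The ordering in the list above (flush reporters, detach the shared logging registry, then let client shutdown destroy the cycle registry) can be sketched with a stdlib-only toy model. Everything here is an assumption made for illustration: ToyRegistry, runCycles, and the counter semantics are hypothetical stand-ins, not Micrometer types and not the code in BenchmarkOrchestrator.java.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model only: hypothetical stand-ins, not Micrometer or the benchmark's real classes. */
public class PerCycleLifecycleSketch {

    /** Hypothetical registry that forwards increments to children and can be cleared/closed. */
    static class ToyRegistry {
        final List<ToyRegistry> children = new ArrayList<>();
        long count = 0;
        boolean closed = false;

        void add(ToyRegistry child) { children.add(child); }
        void remove(ToyRegistry child) { children.remove(child); }

        void increment() {
            if (closed) return;                 // a closed registry ignores new data
            count++;
            for (ToyRegistry c : children) c.increment();
        }

        /** Models clear() + close() propagating to every child still attached. */
        void clearAndClose() {
            count = 0;
            closed = true;
            for (ToyRegistry c : children) c.clearAndClose();
        }
    }

    /** Runs the cycles and returns the value the (modelled) CSV reporter flushed per cycle. */
    static List<Long> runCycles(int cycles, int opsPerCycle, ToyRegistry sharedLogging) {
        List<Long> flushed = new ArrayList<>();
        for (int i = 0; i < cycles; i++) {
            ToyRegistry cycleRegistry = new ToyRegistry();  // fresh, disposable per cycle
            ToyRegistry csvBacking = new ToyRegistry();     // what the CSV reporter would read
            cycleRegistry.add(csvBacking);
            cycleRegistry.add(sharedLogging);

            for (int op = 0; op < opsPerCycle; op++) cycleRegistry.increment();

            flushed.add(csvBacking.count);        // 1. flush the reporter first
            cycleRegistry.remove(sharedLogging);  // 2. detach the shared logging registry
            cycleRegistry.clearAndClose();        // 3. client shutdown destroys the cycle registry
        }
        return flushed;
    }

    public static void main(String[] args) {
        ToyRegistry logging = new ToyRegistry();
        System.out.println(runCycles(2, 3, logging)); // prints [3, 3]
        System.out.println(logging.closed);           // prints false
    }
}
```

In this model, swapping steps 2 and 3 would close the shared logging registry after the first cycle, which is the hazard the detach step is meant to avoid.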
When running with -cycles 2+, the Cosmos SDK calls registry.clear() and registry.close() on client shutdown, destroying all meters. This caused cycle 2 metrics to be completely lost.
Fix: create a fresh CompositeMeterRegistry, SimpleMeterRegistry, and CsvMetricsReporter for each cycle. Reporters are flushed before client shutdown, then the loggingRegistry is disconnected from the disposable cycle registry so the SDK can safely destroy it. Each cycle starts with clean counters (count=0).
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
a8ca8e4 to ee7dfe2
kushagraThapar approved these changes Apr 6, 2026
Member
Author
The failed test, c.a.c.CosmosContainerChangeFeedTest.CosmosContainerChangeFeedTest, is not related to the changes in this PR.
Member
Author
/check-enforcer override
Problem
When running the benchmark with -cycles 2+, the Cosmos SDK calls registry.clear() + registry.close() via ClientTelemetryMetrics.remove() when a CosmosClient is destroyed between cycles. Since the benchmark passed the same CompositeMeterRegistry to every cycle's client, closing cycle 1's client wiped all meters from the shared registry. Cycle 2's CosmosClient created new meters (with a new ClientCorrelationId), but the registry was in a broken state, so no CSV data was written for cycle 2.
Root Cause
ClientTelemetryMetrics.remove(registry) calls:
- registry.clear(): removes all meter instances from the registry
- registry.close(): shuts down the registry so it stops accepting new meters
Both propagate through the CompositeMeterRegistry to all children, destroying the SimpleMeterRegistry that CsvMetricsReporter reads from.
Fix
Create a fresh, disposable metrics pipeline per cycle:
- CompositeMeterRegistry + SimpleMeterRegistry + CsvMetricsReporter are created fresh for each cycle
- LoggingMeterRegistry (console output) is added to each cycle's registry but disconnected before shutdown so the SDK's clear()/close() doesn't destroy it
- Each cycle starts with clean counters (count=0)
Verified
Tested with -cycles 2 -settleTimeMs 30000 on 1t-c16-ReadThroughput-http1:
- Before: only ClientCorrelationId.00001 CSV files existed (cycle 1 data only)
- After: both 00001 and 00002 CSV files exist with complete, independent data
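The failure mode described under Root Cause can be reproduced in a stdlib-only toy model that reuses one registry across cycles. This is a sketch under stated assumptions: ToyRegistry and runWithSharedRegistry are hypothetical names, and the "closed registry ignores new data" rule is a simplification of the behaviour described above, not Micrometer's implementation.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model only: hypothetical stand-ins, not Micrometer or the Cosmos SDK. */
public class SharedRegistryFailureSketch {

    /** Hypothetical registry that forwards increments to children and can be cleared/closed. */
    static class ToyRegistry {
        final List<ToyRegistry> children = new ArrayList<>();
        long count = 0;
        boolean closed = false;

        void add(ToyRegistry child) { children.add(child); }

        void increment() {
            if (closed) return;                 // a closed registry ignores new data
            count++;
            for (ToyRegistry c : children) c.increment();
        }

        /** Models clear() + close() propagating through the composite to all children. */
        void clearAndClose() {
            count = 0;
            closed = true;
            for (ToyRegistry c : children) c.clearAndClose();
        }
    }

    /** Reuses one registry for every cycle, as the benchmark did before the fix. */
    static List<Long> runWithSharedRegistry(int cycles, int opsPerCycle) {
        ToyRegistry shared = new ToyRegistry();
        ToyRegistry csvBacking = new ToyRegistry();   // what the CSV reporter would read
        shared.add(csvBacking);

        List<Long> recorded = new ArrayList<>();
        for (int i = 0; i < cycles; i++) {
            for (int op = 0; op < opsPerCycle; op++) shared.increment();
            recorded.add(csvBacking.count);   // data available to the reporter this cycle
            shared.clearAndClose();           // client shutdown wipes the shared registry
        }
        return recorded;
    }

    public static void main(String[] args) {
        // Cycle 1 records its operations; cycle 2 records nothing.
        System.out.println(runWithSharedRegistry(2, 3)); // prints [3, 0]
    }
}
```

The second cycle records nothing because the child registry was cleared and closed along with its parent at the end of the first cycle, which mirrors the missing 00002 CSV files.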