Conversation
|
/coder-agents-review |
There was a problem hiding this comment.
Good addition. The tumbling-window aggregation design is proportional to the problem, the TelemetryReporter interface extraction keeps coupling narrow, and the NOOP sentinel pattern for throwaway clients is clean. The test harness is well-structured with fake timers, meaningful assertions, and non-vacuous coverage.
Severity summary: 1 P2, 13 P3, 1 P4, 5 Nit, 2 Note.
The P2 is the UUID regex version nibble constraint ([1-5]), which rejects UUIDv6/v7/v8 in the fallback normalizer. Seven reviewers flagged it independently. The fix is one character.
Of the P3s, the most impactful cluster: dispose() drops accumulated data without flushing, the timer callback has no try/catch (a throw permanently kills the chain), and average()/percentile95() produce NaN/undefined on empty input. All are straightforward defensive fixes.
On the design side: the route normalization fallback passes name-bearing tail segments (like usernames after /members/) through unchanged, which partially contradicts the stated goal of stripping user/workspace names. And the interceptor registration order means telemetry records phantom errors for requests that are transparently retried by the cert-refresh or auth interceptors. Neither is urgent, but both are worth acknowledging.
One reviewer questioned whether the windowSeconds config knob earns its ~115 lines of schema, reader, watcher, update, and tests given nobody will change the default. Worth a human decision.
"A public surface with zero in-tree callers is a foot-gun, not dead code." (Kite, on the UUID fallback path)
🤖 This review was automatically generated with Coder Agents.
Drop the recorder interface, noop sentinels, and creation helper — construct HttpRequestsTelemetry unconditionally and let the noop reporter swallow .log() calls. Flatten route normalization rules, switch buckets to a nested map (no string key encoding), and stop pushing zero-duration samples when metadata is absent so p95 isn't skewed. Drop the LOCAL_SINK_SETTING backwards-compat alias.
cd6bd55 to
64f09d1
Compare
- UUID regex matches any version (v6/v7/v8 now normalize). Then dropped
UUID/numeric heuristics entirely; route normalization uses explicit
rules only. Added api/v2/workspaces/{id}/builds/{id} for the main
composite case. Trailing literal segments still pass through.
- Flush the pending bucket on dispose so the last partial window is
not lost.
- Drop the coder.telemetry.local.httpRequests.windowSeconds knob and
hardcode 60s, removing the schema, reader, watcher, update method,
and related tests.
- Rename addLoggingInterceptors -> addRequestInterceptors. Document
that cert-refresh retries re-enter the chain so each attempt is
recorded in telemetry.
- Add explicit rule for /api/v2/organizations/{id}/members/{name}.
- Use toSorted, drop the unused export, comment normalizeByRule.
- Drop "aggregated" from the CHANGELOG entry.
- Tests: dispose-flush, no-rule fallback, undefined / "" inputs.
|
/coder-agents-review |
There was a problem hiding this comment.
Strong response to R1. All 19 addressed findings verified: UUID regex eliminated entirely (class-of-bug fix), config surface removed, dispose flushes, try/finally protects the timer chain, empty-array guards return 0, naming and docs cleaned up. The normalization redesign (position-based rules, no UUID regex) is cleaner than what it replaced.
DEREM-20 (contested, interceptor ordering): panel closed 3/3 accept. The author's defense holds: each interceptor-level attempt is a real network event, cert-refresh frequency is bounded, and the added comment at coderApi.ts:520-522 documents the counting semantics.
Severity summary for new findings: 1 P2, 4 P3.
The P2 is a regression from the R1 refactor: removing the NOOP sentinel check means every CoderApi.create() now constructs a real HttpRequestsTelemetry with a running 60-second timer, including 5-6 throwaway call sites that never dispose. The fix is straightforward: lazy-start the timer on first record() instead of in the constructor.
"What happens when the extension runs for a week in a remote window?" (Hisoka, on the timer leak)
🤖 This review was automatically generated with Coder Agents.
|
|
||
| public constructor(telemetry: TelemetryReporter) { | ||
| this.#telemetry = telemetry; | ||
| this.#scheduleNextWindow(); |
There was a problem hiding this comment.
P2 [DEREM-28] Constructor eagerly starts a 60-second repeating timer via this.#scheduleNextWindow(). Every CoderApi.create() call instantiates a real HttpRequestsTelemetry, including 5-6 throwaway call sites that never call dispose() (loginCoordinator.ts:194, loginCoordinator.ts:225, sessionManager.ts:291, authorizer.ts:67, container.ts:87, deploymentManager.ts:98). Each leaks a setTimeout chain that fires every 60s, iterates an empty bucket map, calls a NOOP log(), and reschedules forever. The timer closure prevents GC.
This is a regression from R1, where createHttpRequestsTelemetry checked telemetry === NOOP_TELEMETRY_REPORTER and returned a NOOP object with no timer. That sentinel was removed in the refactor.
prepareOAuthOperation runs on every OAuth token refresh (typically hourly). A day-long session accumulates ~24 leaked timers.
Fix (cheapest): defer #scheduleNextWindow() until the first #record() call. Clients that never record a request never start a timer:
#record(...): void {
if (this.#disposed) { return; }
if (!this.#timer) { this.#scheduleNextWindow(); }
// ... rest of record
}Remove the constructor call. (Hisoka P3, Kite P3)
🤖
| ); | ||
| } | ||
|
|
||
| public dispose(): void { |
There was a problem hiding this comment.
P3 [DEREM-29] dispose() is not exception-safe. If #flush() throws at line 85:
#disposedstaysfalse(line 86 not reached)- Timer is never cleared (lines 87-89 not reached)
- The throw propagates into
CoderApi.dispose(), skipping WebSocket and socket-set cleanup
The timer callback already has try/finally (DEREM-19 fix). Apply the same pattern here:
public dispose(): void {
if (this.#disposed) { return; }
try { this.#flush(); } finally {
this.#disposed = true;
if (this.#timer) {
clearTimeout(this.#timer);
this.#timer = null;
}
}
}Current TelemetryService.log() uses #safeEmit (catch-all), so this requires a future reporter implementation that doesn't swallow. Defensive fix, same justification as DEREM-19. (Mafuuu P3, Meruem P3, Razor P3)
🤖
| EVENT_NAME, | ||
| { method, route }, | ||
| { | ||
| window_seconds: WINDOW_SECONDS, |
There was a problem hiding this comment.
P3 [DEREM-30] Dispose-time flush emits window_seconds: 60 regardless of actual elapsed time since the last flush. When the extension deactivates mid-window, a consumer computing request rates from count / window_seconds will understate the rate for the final event.
This is the residual of DEREM-9 (which was closed because updateConfig was removed). The disposal case fires once at shutdown, so the practical impact is minor. Noting it because four reviewers flagged it independently and the fix (track window-start time, emit actual elapsed) would also benefit any future partial-window flush. (Chopper P3, Mafuuu Note, Kite Note, Meruem Note)
🤖
|
|
||
| import type { CallerMeasurements, CallerProperties } from "./event"; | ||
|
|
||
| export interface TelemetryReporter { |
There was a problem hiding this comment.
P3 [DEREM-31] TelemetryReporter interface has no doc comment explaining why it exists. This is the new abstraction boundary between CoderApi and TelemetryService. A developer opening this file sees three method signatures and a NOOP constant with no explanation of the decoupling contract.
Similarly, HttpRequestsTelemetry (httpRequestsTelemetry.ts:51) has no doc comment. It starts a repeating timer in its constructor (a side effect not obvious from the class name), recordError silently drops non-Axios errors, and dispose flushes pending data. These are preconditions and postconditions that belong on the class.
Suggested:
// reporter.ts
/** Decouples callers from the concrete TelemetryService so throwaway
* or telemetry-disabled clients can use NOOP_TELEMETRY_REPORTER
* without constructing a full service. */
// httpRequestsTelemetry.ts
/** Aggregates HTTP request counts and latencies into 60-second tumbling
* windows and emits "http.requests" events via the provided reporter.
* Starts a repeating flush timer on construction. Call dispose() to
* flush pending data and stop the timer. */(Leorio)
🤖
| const client = new CoderApi(output); | ||
| const httpRequestsTelemetry = new HttpRequestsTelemetry(telemetry); | ||
| const client = new CoderApi(output, httpRequestsTelemetry); | ||
| client.setCredentials(baseUrl, token); |
There was a problem hiding this comment.
P3 [DEREM-32] CoderApi.create doc comment says "Automatically sets up logging interceptors and certificate handling." This PR added a telemetry parameter and wires HttpRequestsTelemetry into the instance. The comment now lies by omission. Someone writing a second factory path or test helper might skip the telemetry parameter thinking it is cosmetic.
Add "and HTTP request telemetry" to the list. (Leorio P3, Gon Nit)
🤖
Summary
http.requeststelemetry rollups for active(method, route)buckets from the existing axios logging interceptor path.coder.telemetry.local.httpRequests.windowSecondsas an advanced JSON setting with a 60s default.CoderApiclients while keeping throwaway clients opt-in.Closes #907.
Validation
pnpm test:extension ./test/unit/settings/telemetry.test.ts ./test/unit/logging/httpRequestsTelemetry.test.ts ./test/unit/api/coderApi.test.tspnpm test:extensionpnpm format:checkpnpm lintpnpm typecheckImplementation plan and decisions
Decisions
coder.telemetry.localis treated as advanced local telemetry runtime configuration, not just local sink configuration.coder.telemetry.local.httpRequests.windowSecondswith a default of60.CoderApiinstances remain telemetry-free by default; long-lived primary and remote-scoped API clients receive telemetry.Implementation
HttpRequestsTelemetryfor tumbling-window aggregation andhttp.requestsemission.TelemetryReporterinterface/no-op reporter to decoupleCoderApifrom concrete telemetry service construction.coder.telemetry.localschema and runtime readers withhttpRequests.windowSeconds.Generated by Coder Agents.