feat: batch event ingestion on POST /track via { type: 'batch' } envelope#377
feat: batch event ingestion on POST /track via { type: 'batch' } envelope#377mraj602-tohands wants to merge 11 commits into
Conversation
Reimplements the batch ingestion feature from PR Openpanel-dev#374 on top of current upstream/main (post buffer-perf, kafka client, ClickHouse round-robin). Adds __syncedAt property to all worker-processed events. See PR description for full architectural details. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughThis PR adds POST /track batch ingestion with envelope validation, pooled per-request context (geo/salts), per-item validation and dispatch, bounded concurrency (50), and aggregated 202 responses reporting accepted and per-index rejected reasons. ChangesBatch event ingestion with per-event validation and bounded concurrency
sequenceDiagram
participant Client
participant Handler
participant buildSharedRequestContext
participant geo_lookup
participant buildEventContext
participant dispatchEvent
participant queueAdd
participant upsertProfile
Client->>Handler: POST /track (single or batch)
Handler->>buildSharedRequestContext: extract projectId, headers, IP -> salts
buildSharedRequestContext->>geo_lookup: lookup(request IP)
geo_lookup-->>buildSharedRequestContext: geo
loop per event
Handler->>buildEventContext: sharedCtx + raw event
alt __ip override differs
buildEventContext->>geo_lookup: lookup(__ip)
geo_lookup-->>buildEventContext: geo
end
buildEventContext-->>dispatchEvent: TrackContext
dispatchEvent->>queueAdd: enqueue track jobs
dispatchEvent->>upsertProfile: identify -> upsertProfile
end
Handler-->>Client: 200 (single) or 202 {accepted, rejected[]}
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
apps/worker/src/jobs/events.incoming-events.test.ts (1)
553-596: 💤 Low valueConsider adding test coverage for historical events when lock is not acquired.
This test verifies historical event handling when the lock is acquired (session_start emitted). Consider adding a companion test where
getLockreturnsfalseto verify that:
session_startis skipped- The event itself is still created
sessionEndis still not scheduled (historical path)This would complete coverage of the historical event path.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/worker/src/jobs/events.incoming-events.test.ts` around lines 553 - 596, Add a new test alongside the existing historical-event test that simulates the lock NOT being acquired by mocking getLock to return false; call incomingEvent with the same historical payload and assert that createEvent is still called for the event (but there is no 'session_start' createEvent call), sessionsQueue.add is not called (no sessionEnd scheduled), and that the event createEvent call preserves deviceId/sessionId; reference the incomingEvent handler, getLock mock, createEvent mock, and sessionsQueue.add to locate and update the test harness.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@apps/worker/src/jobs/events.incoming-event.ts`:
- Around line 217-219: The referrer fields (enrichment.referrer,
enrichment.referrer_name, enrichment.referrer_type) use "?? undefined" which can
yield undefined instead of falling back to the base event; change them to use
the same fallback pattern as the other fields by replacing "enrichment.referrer
?? undefined" with "enrichment.referrer ?? baseEvent.referrer" and likewise
"enrichment.referrer_name ?? baseEvent.referrer_name" and
"enrichment.referrer_type ?? baseEvent.referrer_type" so string-type
expectations are preserved.
---
Nitpick comments:
In `@apps/worker/src/jobs/events.incoming-events.test.ts`:
- Around line 553-596: Add a new test alongside the existing historical-event
test that simulates the lock NOT being acquired by mocking getLock to return
false; call incomingEvent with the same historical payload and assert that
createEvent is still called for the event (but there is no 'session_start'
createEvent call), sessionsQueue.add is not called (no sessionEnd scheduled),
and that the event createEvent call preserves deviceId/sessionId; reference the
incomingEvent handler, getLock mock, createEvent mock, and sessionsQueue.add to
locate and update the test harness.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 91718804-8af2-4a6e-9663-17127061acf7
📒 Files selected for processing (9)
apps/api/src/controllers/event.controller.tsapps/api/src/controllers/track.controller.tsapps/api/src/routes/track-batch.router.test.tsapps/api/src/routes/track.router.tsapps/api/src/utils/ids.tsapps/worker/src/jobs/events.incoming-event.tsapps/worker/src/jobs/events.incoming-events.test.tspackages/queue/src/queues.tspackages/validation/src/track.validation.ts
💤 Files with no reviewable changes (1)
- packages/queue/src/queues.ts
Tests use toStrictEqual, so the new __syncedAt property in event properties needs to be included in assertions. Uses expect.any(String) since the exact ISO timestamp varies per run. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…hment When session enrichment has null referrer fields, fall back to baseEvent values instead of undefined. Consistent with all other enrichment fields (path, os, browser, geo, etc.). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Hi @lindesvard, just following up on this PR in case it was missed. Happy to make any changes or adjustments if needed. Would appreciate a review when you get a chance 🙂 |
|
Hey, today our session management is very time dependent, so batching events might cause issues here. I have started working on improving this locally and will probably need to merge that before I can merge this (this will avoid your changes in the incoming-events file etc) Secondly I think we should just piggy back on our existing /track endpoint. It takes a generic body This would make proxying etc easier since we dont need to add yet another endpoint to proxy. So what I would like you todo is to:
|
…changes Per review feedback on Openpanel-dev#377: - Remove the separate /track/batch endpoint; POST /track now accepts { "type": "batch", "payload": [event, ...] } alongside single events, so proxies only ever deal with one endpoint. - Revert all worker/session changes (events.incoming-event.ts, queue payload, ids.ts, event.controller.ts) back to upstream behavior — session handling improvements will land separately. - Batch items reuse the existing per-event pipeline unchanged, including the isTimestampFromThePast flag semantics; per-item validation errors are reported per-index in rejected[] without failing the whole batch. - Raise the /track body limit to 10 MB to fit max-size batches. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…tch arms Adds batch coverage for group/assign_group/increment/decrement/replay dispatch, per-event __ip geo override, non-object items, single-event alias 400, and queue-failure internal rejections. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
apps/api/src/controllers/track.controller.ts (2)
594-606:⚠️ Potential issue | 🟠 Major | ⚡ Quick winMap
rejected[]to the public response shape instead of returning internal state.This currently pushes the full
BatchItemResultobjects into the response, so each rejected item includesstatus: 'rejected'.apps/api/src/routes/track.router.tsdocuments the 202 body as{ index, reason, error }, so the controller should emit that exact shape instead of relying on serializer behavior.Suggested fix
let accepted = 0; - const rejected: Extract<BatchItemResult, { status: 'rejected' }>[] = []; + const rejected: Array<{ + index: number; + reason: 'validation' | 'internal'; + error: string; + }> = []; for (const result of results) { if (result.status === 'accepted') { accepted += 1; } else { - rejected.push(result); + rejected.push({ + index: result.index, + reason: result.reason, + error: result.error, + }); } }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/api/src/controllers/track.controller.ts` around lines 594 - 606, The current loop pushes full BatchItemResult objects into the rejected array and returns them in the 202 response; update the controller (the loop that iterates over results and the rejected array sent in reply.status(202).send) to map each rejected item to the public shape { index, reason, error } instead of the internal object (i.e., when result.status !== 'accepted' push or collect { index: result.index, reason: result.reason, error: result.error }), then send that mapped array in the response so it matches the documented contract.
156-158:⚠️ Potential issue | 🟠 Major | ⚡ Quick winPreserve the “no user-agent => no identity” behavior.
apps/api/src/utils/ids.tsexplicitly returns emptydeviceId/sessionIdwhenuais missing. Replacing a missing header with'unknown/1.0'here changes that contract and can collapse unrelated server-side traffic behind the same IP into the same device/session.Suggested fix
interface SharedRequestContext { projectId: string; requestIp: string; - requestUa: string; + requestUa?: string; requestHeaders: Record<string, string | undefined>; requestGeo: GeoLocation; salts: Salts; } @@ - const requestIp = request.clientIp; - const requestUa = request.headers['user-agent'] ?? 'unknown/1.0'; - const requestHeaders = getStringHeaders(request.headers); + const requestIp = request.clientIp; + const requestHeaders = getStringHeaders(request.headers); + const requestUa = requestHeaders['user-agent'];🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/api/src/controllers/track.controller.ts` around lines 156 - 158, The code currently forces a default user-agent string by setting requestUa = request.headers['user-agent'] ?? 'unknown/1.0', which breaks the contract in ids.ts that expects a missing UA to yield empty deviceId/sessionId; change this to preserve "no UA => no identity" by leaving requestUa undefined/null when the header is absent (e.g., requestUa = request.headers['user-agent'] as string | undefined) and ensure downstream uses of requestUa (the identity generation code that calls into ids.ts) continue to treat undefined/empty as "no UA" rather than a placeholder.
🧹 Nitpick comments (2)
apps/api/src/routes/track.router.ts (1)
37-37: 💤 Low valueConsider splitting the long description for readability.
The description string is comprehensive but spans a very long single line. While this works fine for OpenAPI documentation, breaking it into a template literal with multiple lines would improve source readability.
♻️ Optional refactor for readability
- description: `Ingest a tracking event (track, identify, group, increment, decrement, replay) or a batch of events ({ "type": "batch", "payload": [event, ...] }). Batch requests accept up to ${TRACK_BATCH_MAX_EVENTS} events and 10MB uncompressed per request; each event is dispatched through the same pipeline as a single-event request. Per-event validation failures are returned in the rejected[] array — the whole batch does not fail on a single bad row.`, + description: `Ingest a tracking event (track, identify, group, increment, decrement, replay) or a batch of events. + Batch requests use { "type": "batch", "payload": [event, ...] } and accept up to ${TRACK_BATCH_MAX_EVENTS} events and 10MB uncompressed per request. + Each event is dispatched through the same pipeline as a single-event request. + Per-event validation failures are returned in the rejected[] array — the whole batch does not fail on a single bad row.`,🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/api/src/routes/track.router.ts` at line 37, The long single-line OpenAPI description on the route config should be split across multiple template-literal lines for readability: update the description property in track.router.ts (the route's description string that references TRACK_BATCH_MAX_EVENTS) to a multi-line template literal or joined string segments so the content flows across lines while preserving the embedded ${TRACK_BATCH_MAX_EVENTS} interpolation and exact wording; locate the description key in the route definition and break the string into readable lines (or an array .join('')) without changing the text or interpolation semantics.apps/api/src/routes/track-batch.router.test.ts (1)
204-359: ⚡ Quick winConsider adding test coverage for the 5-day historical floor.
The test suite covers historical timestamps (two hours ago), but the PR objectives state: "Enforce a 5-day historical acceptance window" with "older events rejected per-row." There's no test verifying that events with timestamps older than 5 days are rejected with a per-item validation error.
📋 Suggested test case
it('rejects events older than 5 days per-item', async () => { const sixDaysAgo = Date.now() - 6 * 24 * 60 * 60 * 1000; const res = await postBatch([ validTrack('recent'), { type: 'track' as const, payload: { name: 'ancient_event', properties: { __timestamp: new Date(sixDaysAgo).toISOString() }, }, }, ]); expect(res.statusCode).toBe(202); const body = res.json(); expect(body.accepted).toBe(1); expect(body.rejected).toHaveLength(1); expect(body.rejected[0]).toMatchObject({ index: 1, reason: 'validation', }); expect(body.rejected[0].error).toMatch(/timestamp|past|day/i); });🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/api/src/routes/track-batch.router.test.ts` around lines 204 - 359, Add a test that verifies the 5-day historical floor by calling postBatch with one valid track (use validTrack) and one track whose payload.properties.__timestamp is older than 5 days (e.g., Date.now() - 6*24*60*60*1000), then assert the response is 202, accepted is 1, rejected length is 1, and the rejected item matches { index: 1, reason: 'validation' } and its error message matches a timestamp/past/day regex; name the test "rejects events older than 5 days per-item" and place it alongside the other batch tests so it exercises the same pipeline (postBatch, validTrack).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@apps/api/src/controllers/track.controller.ts`:
- Around line 109-128: The change removed the 5-day cutoff and now accepts
arbitrarily old __timestamp values; reintroduce a FIVE_DAYS_MS constant (e.g., 5
* 24 * 60 * 60 * 1000) and add an early check using clientTimestampNumber and
safeTimestamp: if clientTimestampNumber < safeTimestamp - FIVE_DAYS_MS, return
the per-item rejection path required by the batch contract (the same
rejection/validation response your controller uses elsewhere) before the
existing ONE_MINUTE_MS and FIFTEEN_MINUTES_MS logic so old events are rejected
instead of persisted.
---
Outside diff comments:
In `@apps/api/src/controllers/track.controller.ts`:
- Around line 594-606: The current loop pushes full BatchItemResult objects into
the rejected array and returns them in the 202 response; update the controller
(the loop that iterates over results and the rejected array sent in
reply.status(202).send) to map each rejected item to the public shape { index,
reason, error } instead of the internal object (i.e., when result.status !==
'accepted' push or collect { index: result.index, reason: result.reason, error:
result.error }), then send that mapped array in the response so it matches the
documented contract.
- Around line 156-158: The code currently forces a default user-agent string by
setting requestUa = request.headers['user-agent'] ?? 'unknown/1.0', which breaks
the contract in ids.ts that expects a missing UA to yield empty
deviceId/sessionId; change this to preserve "no UA => no identity" by leaving
requestUa undefined/null when the header is absent (e.g., requestUa =
request.headers['user-agent'] as string | undefined) and ensure downstream uses
of requestUa (the identity generation code that calls into ids.ts) continue to
treat undefined/empty as "no UA" rather than a placeholder.
---
Nitpick comments:
In `@apps/api/src/routes/track-batch.router.test.ts`:
- Around line 204-359: Add a test that verifies the 5-day historical floor by
calling postBatch with one valid track (use validTrack) and one track whose
payload.properties.__timestamp is older than 5 days (e.g., Date.now() -
6*24*60*60*1000), then assert the response is 202, accepted is 1, rejected
length is 1, and the rejected item matches { index: 1, reason: 'validation' }
and its error message matches a timestamp/past/day regex; name the test "rejects
events older than 5 days per-item" and place it alongside the other batch tests
so it exercises the same pipeline (postBatch, validTrack).
In `@apps/api/src/routes/track.router.ts`:
- Line 37: The long single-line OpenAPI description on the route config should
be split across multiple template-literal lines for readability: update the
description property in track.router.ts (the route's description string that
references TRACK_BATCH_MAX_EVENTS) to a multi-line template literal or joined
string segments so the content flows across lines while preserving the embedded
${TRACK_BATCH_MAX_EVENTS} interpolation and exact wording; locate the
description key in the route definition and break the string into readable lines
(or an array .join('')) without changing the text or interpolation semantics.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 0fd25152-7171-48d5-a4a0-65a06d2b317f
📒 Files selected for processing (4)
apps/api/src/controllers/track.controller.tsapps/api/src/routes/track-batch.router.test.tsapps/api/src/routes/track.router.tspackages/validation/src/track.validation.ts
💤 Files with no reviewable changes (1)
- packages/validation/src/track.validation.ts
There was a problem hiding this comment.
Caution
Inline review comments failed to post. This is likely due to GitHub's internal server error or limits when posting large numbers of comments. If you are seeing this consistently it is likely a permissions issue. Please check "Moderation" -> "Code review limits" under your organization settings.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
apps/api/src/controllers/track.controller.ts (2)
594-606:⚠️ Potential issue | 🟠 Major | ⚡ Quick winMap
rejected[]to the public response shape instead of returning internal state.This currently pushes the full
BatchItemResultobjects into the response, so each rejected item includesstatus: 'rejected'.apps/api/src/routes/track.router.tsdocuments the 202 body as{ index, reason, error }, so the controller should emit that exact shape instead of relying on serializer behavior.Suggested fix
let accepted = 0; - const rejected: Extract<BatchItemResult, { status: 'rejected' }>[] = []; + const rejected: Array<{ + index: number; + reason: 'validation' | 'internal'; + error: string; + }> = []; for (const result of results) { if (result.status === 'accepted') { accepted += 1; } else { - rejected.push(result); + rejected.push({ + index: result.index, + reason: result.reason, + error: result.error, + }); } }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/api/src/controllers/track.controller.ts` around lines 594 - 606, The current loop pushes full BatchItemResult objects into the rejected array and returns them in the 202 response; update the controller (the loop that iterates over results and the rejected array sent in reply.status(202).send) to map each rejected item to the public shape { index, reason, error } instead of the internal object (i.e., when result.status !== 'accepted' push or collect { index: result.index, reason: result.reason, error: result.error }), then send that mapped array in the response so it matches the documented contract.
156-158:⚠️ Potential issue | 🟠 Major | ⚡ Quick winPreserve the “no user-agent => no identity” behavior.
apps/api/src/utils/ids.tsexplicitly returns emptydeviceId/sessionIdwhenuais missing. Replacing a missing header with'unknown/1.0'here changes that contract and can collapse unrelated server-side traffic behind the same IP into the same device/session.Suggested fix
interface SharedRequestContext { projectId: string; requestIp: string; - requestUa: string; + requestUa?: string; requestHeaders: Record<string, string | undefined>; requestGeo: GeoLocation; salts: Salts; } @@ - const requestIp = request.clientIp; - const requestUa = request.headers['user-agent'] ?? 'unknown/1.0'; - const requestHeaders = getStringHeaders(request.headers); + const requestIp = request.clientIp; + const requestHeaders = getStringHeaders(request.headers); + const requestUa = requestHeaders['user-agent'];🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/api/src/controllers/track.controller.ts` around lines 156 - 158, The code currently forces a default user-agent string by setting requestUa = request.headers['user-agent'] ?? 'unknown/1.0', which breaks the contract in ids.ts that expects a missing UA to yield empty deviceId/sessionId; change this to preserve "no UA => no identity" by leaving requestUa undefined/null when the header is absent (e.g., requestUa = request.headers['user-agent'] as string | undefined) and ensure downstream uses of requestUa (the identity generation code that calls into ids.ts) continue to treat undefined/empty as "no UA" rather than a placeholder.
🧹 Nitpick comments (2)
apps/api/src/routes/track.router.ts (1)
37-37: 💤 Low valueConsider splitting the long description for readability.
The description string is comprehensive but spans a very long single line. While this works fine for OpenAPI documentation, breaking it into a template literal with multiple lines would improve source readability.
♻️ Optional refactor for readability
- description: `Ingest a tracking event (track, identify, group, increment, decrement, replay) or a batch of events ({ "type": "batch", "payload": [event, ...] }). Batch requests accept up to ${TRACK_BATCH_MAX_EVENTS} events and 10MB uncompressed per request; each event is dispatched through the same pipeline as a single-event request. Per-event validation failures are returned in the rejected[] array — the whole batch does not fail on a single bad row.`, + description: `Ingest a tracking event (track, identify, group, increment, decrement, replay) or a batch of events. + Batch requests use { "type": "batch", "payload": [event, ...] } and accept up to ${TRACK_BATCH_MAX_EVENTS} events and 10MB uncompressed per request. + Each event is dispatched through the same pipeline as a single-event request. + Per-event validation failures are returned in the rejected[] array — the whole batch does not fail on a single bad row.`,🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/api/src/routes/track.router.ts` at line 37, The long single-line OpenAPI description on the route config should be split across multiple template-literal lines for readability: update the description property in track.router.ts (the route's description string that references TRACK_BATCH_MAX_EVENTS) to a multi-line template literal or joined string segments so the content flows across lines while preserving the embedded ${TRACK_BATCH_MAX_EVENTS} interpolation and exact wording; locate the description key in the route definition and break the string into readable lines (or an array .join('')) without changing the text or interpolation semantics.apps/api/src/routes/track-batch.router.test.ts (1)
204-359: ⚡ Quick winConsider adding test coverage for the 5-day historical floor.
The test suite covers historical timestamps (two hours ago), but the PR objectives state: "Enforce a 5-day historical acceptance window" with "older events rejected per-row." There's no test verifying that events with timestamps older than 5 days are rejected with a per-item validation error.
📋 Suggested test case
it('rejects events older than 5 days per-item', async () => { const sixDaysAgo = Date.now() - 6 * 24 * 60 * 60 * 1000; const res = await postBatch([ validTrack('recent'), { type: 'track' as const, payload: { name: 'ancient_event', properties: { __timestamp: new Date(sixDaysAgo).toISOString() }, }, }, ]); expect(res.statusCode).toBe(202); const body = res.json(); expect(body.accepted).toBe(1); expect(body.rejected).toHaveLength(1); expect(body.rejected[0]).toMatchObject({ index: 1, reason: 'validation', }); expect(body.rejected[0].error).toMatch(/timestamp|past|day/i); });🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/api/src/routes/track-batch.router.test.ts` around lines 204 - 359, Add a test that verifies the 5-day historical floor by calling postBatch with one valid track (use validTrack) and one track whose payload.properties.__timestamp is older than 5 days (e.g., Date.now() - 6*24*60*60*1000), then assert the response is 202, accepted is 1, rejected length is 1, and the rejected item matches { index: 1, reason: 'validation' } and its error message matches a timestamp/past/day regex; name the test "rejects events older than 5 days per-item" and place it alongside the other batch tests so it exercises the same pipeline (postBatch, validTrack).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@apps/api/src/controllers/track.controller.ts`:
- Around line 109-128: The change removed the 5-day cutoff and now accepts
arbitrarily old __timestamp values; reintroduce a FIVE_DAYS_MS constant (e.g., 5
* 24 * 60 * 60 * 1000) and add an early check using clientTimestampNumber and
safeTimestamp: if clientTimestampNumber < safeTimestamp - FIVE_DAYS_MS, return
the per-item rejection path required by the batch contract (the same
rejection/validation response your controller uses elsewhere) before the
existing ONE_MINUTE_MS and FIFTEEN_MINUTES_MS logic so old events are rejected
instead of persisted.
---
Outside diff comments:
In `@apps/api/src/controllers/track.controller.ts`:
- Around line 594-606: The current loop pushes full BatchItemResult objects into
the rejected array and returns them in the 202 response; update the controller
(the loop that iterates over results and the rejected array sent in
reply.status(202).send) to map each rejected item to the public shape { index,
reason, error } instead of the internal object (i.e., when result.status !==
'accepted' push or collect { index: result.index, reason: result.reason, error:
result.error }), then send that mapped array in the response so it matches the
documented contract.
- Around line 156-158: The code currently forces a default user-agent string by
setting requestUa = request.headers['user-agent'] ?? 'unknown/1.0', which breaks
the contract in ids.ts that expects a missing UA to yield empty
deviceId/sessionId; change this to preserve "no UA => no identity" by leaving
requestUa undefined/null when the header is absent (e.g., requestUa =
request.headers['user-agent'] as string | undefined) and ensure downstream uses
of requestUa (the identity generation code that calls into ids.ts) continue to
treat undefined/empty as "no UA" rather than a placeholder.
---
Nitpick comments:
In `@apps/api/src/routes/track-batch.router.test.ts`:
- Around line 204-359: Add a test that verifies the 5-day historical floor by
calling postBatch with one valid track (use validTrack) and one track whose
payload.properties.__timestamp is older than 5 days (e.g., Date.now() -
6*24*60*60*1000), then assert the response is 202, accepted is 1, rejected
length is 1, and the rejected item matches { index: 1, reason: 'validation' }
and its error message matches a timestamp/past/day regex; name the test "rejects
events older than 5 days per-item" and place it alongside the other batch tests
so it exercises the same pipeline (postBatch, validTrack).
In `@apps/api/src/routes/track.router.ts`:
- Line 37: The long single-line OpenAPI description on the route config should
be split across multiple template-literal lines for readability: update the
description property in track.router.ts (the route's description string that
references TRACK_BATCH_MAX_EVENTS) to a multi-line template literal or joined
string segments so the content flows across lines while preserving the embedded
${TRACK_BATCH_MAX_EVENTS} interpolation and exact wording; locate the
description key in the route definition and break the string into readable lines
(or an array .join('')) without changing the text or interpolation semantics.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 0fd25152-7171-48d5-a4a0-65a06d2b317f
📒 Files selected for processing (4)
apps/api/src/controllers/track.controller.tsapps/api/src/routes/track-batch.router.test.tsapps/api/src/routes/track.router.tspackages/validation/src/track.validation.ts
💤 Files with no reviewable changes (1)
- packages/validation/src/track.validation.ts
🛑 Comments failed to post (1)
apps/api/src/controllers/track.controller.ts (1)
109-128:
⚠️ Potential issue | 🟠 Major | ⚡ Quick winRestore the 5-day rejection path for historical events.
This now accepts any past
__timestampand only marks it as historical after 15 minutes. That breaks the batch contract in this PR and will persist arbitrarily old events instead of rejecting them per item.Suggested fix
const ONE_MINUTE_MS = 60 * 1000; + const FIVE_DAYS_MS = 5 * 24 * 60 * ONE_MINUTE_MS; const FIFTEEN_MINUTES_MS = 15 * ONE_MINUTE_MS; // Use safeTimestamp if invalid or more than 1 minute in the future if ( Number.isNaN(clientTimestampNumber) || clientTimestampNumber > safeTimestamp + ONE_MINUTE_MS ) { return { timestamp: safeTimestamp, isTimestampFromThePast: false }; } + + if (clientTimestampNumber < safeTimestamp - FIVE_DAYS_MS) { + throw new HttpError('Timestamp is older than 5 days', { status: 400 }); + } // isTimestampFromThePast is true only if timestamp is older than 15 minutes const isTimestampFromThePast = clientTimestampNumber < safeTimestamp - FIFTEEN_MINUTES_MS;📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.// Constants for time validation const ONE_MINUTE_MS = 60 * 1000; const FIVE_DAYS_MS = 5 * 24 * 60 * ONE_MINUTE_MS; const FIFTEEN_MINUTES_MS = 15 * ONE_MINUTE_MS; // Use safeTimestamp if invalid or more than 1 minute in the future if ( Number.isNaN(clientTimestampNumber) || clientTimestampNumber > safeTimestamp + ONE_MINUTE_MS ) { return { timestamp: safeTimestamp, isTimestampFromThePast: false }; } if (clientTimestampNumber < safeTimestamp - FIVE_DAYS_MS) { throw new HttpError('Timestamp is older than 5 days', { status: 400 }); } // isTimestampFromThePast is true only if timestamp is older than 15 minutes const isTimestampFromThePast = clientTimestampNumber < safeTimestamp - FIFTEEN_MINUTES_MS; return { timestamp: clientTimestampNumber, isTimestampFromThePast, };🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/api/src/controllers/track.controller.ts` around lines 109 - 128, The change removed the 5-day cutoff and now accepts arbitrarily old __timestamp values; reintroduce a FIVE_DAYS_MS constant (e.g., 5 * 24 * 60 * 60 * 1000) and add an early check using clientTimestampNumber and safeTimestamp: if clientTimestampNumber < safeTimestamp - FIVE_DAYS_MS, return the per-item rejection path required by the batch contract (the same rejection/validation response your controller uses elsewhere) before the existing ONE_MINUTE_MS and FIFTEEN_MINUTES_MS logic so old events are rejected instead of persisted.
|
@lindesvard Reworked as requested:
Happy to adjust the caps (2000 events / 10 MB) or the response shape if you'd prefer different numbers. |
|
I have merged my PR now, I see that you have a new merge conflict (probably after my PR) |
Resolve the single conflict in track.controller.ts caused by upstream's new session management (Openpanel-dev#393): - Adopt the PR's shared `dispatchEvent(validatedBody, context)` call in `handler` (keeps single-event and batch paths from drifting). - Fold main's `replay` session-id backcompat into `dispatchEvent`'s replay case: prefer the SDK-echoed `payload.sessionId`, else the server-derived `context.sessionId` — mirroring the single-event /track behaviour main introduced. The merge also adopts main's getDeviceId(eventTimeMs) + getOverrideDeviceId into buildEventContext (no batch behaviour change). getTimestamp is identical to main; no 5-day floor reintroduced (this PR remains transport-only). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The 202 batch response pushed the internal BatchItemResult (carrying status: 'rejected') into rejected[]. The route's 202 schema already strips the extra field on the wire, but this makes the controller emit the documented shape directly instead of relying on serializer behaviour. Addresses CodeRabbit's review on PR Openpanel-dev#377. No wire change: existing SDK consumers (openpanel_flutter, lvgl-cpp-example) read only index/reason/ error and ignore status. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
After merging main's new session management, POST /track runs the
`duplicateHook` (a 100 ms body-hash dedup) as a preValidation hook. It
only exempted `replay`, so a `{ type: 'batch' }` envelope was subject to
dedup: an offline-first SDK retrying a whole batch within the window would
get the entire batch silently dropped and answered with `200 'Duplicate
event'` instead of the `202 { accepted, rejected }` shape its consumers
parse.
Skip the dedup for batch envelopes too (origin-bearing clients only ever
triggered it; server-side IoT traffic without an origin header was already
safe). Adds a unit test covering batch/replay skips and the single-event
dedup path.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- The 10 MB bodyLimit is route-wide; correct the comment that implied single events were unaffected (their cap is raised above Fastify's default — harmless, but worth stating accurately). - Break the long single-line OpenAPI description into multiple lines for readability (no wording/interpolation change). Addresses CodeRabbit. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
Rebased onto current
On the #373 acceptance criteria around historical sessions / the 5-day floor: this PR is intentionally transport-only — those were deferred to the session-management rework, which has since landed as #393. Because the PR now sits on top of #393, batched historical events already get a deterministic session derived from their own |
|
@lindesvard resolved the conflict and rebased on top of #393 — should be good to merge. |
Closes #373. Supersedes #374.
Summary
POST /trackaccepts a new body shape alongside the existing single-event one:{ "type": "batch", "payload": [ { "type": "track", "payload": { ... } }, ... ] }Up to 2000 events / 10 MB per request. Each item is dispatched through the exact same per-event pipeline as a single-event request, so offline-capable SDKs (mobile, IoT, edge) can drain a local buffer in one round-trip. No new route.
Behavior
Response. Single events keep their
200 { deviceId, sessionId }. Batch envelopes return202 { accepted, rejected[] }.Per-row rejection, not whole-batch failure. The envelope is validated loosely (
payload: unknown[], 1–2000 items); each item issafeParsed throughzTrackHandlerPayloadin the controller. Failures become{ index, reason: 'validation' | 'internal', error }— one malformed row in a 2000-event batch doesn't fail the other 1999, and the caller can re-send just the bad indices.aliasis rejected per-row with the same message as the single-event 400.Salts + geo fetched once per request.
buildSharedRequestContextdoes the salts query and request-IP geo lookup once;buildEventContextreuses them per item. Only events overriding__iptrigger a second geo lookup — for a 2000-event batch that's 1 PG round-trip + 1 geo lookup instead of 2000 of each.Bounded concurrency. Items are processed 50 at a time so an unbounded
Promise.allover 2000 events can't spike Redis/geo budgets on smaller self-hosted instances. Per-itemindexis preserved across chunks.Caps. 2000 events / 10 MB mirrors Mixpanel
/import. The 10 MB cap uses Fastify'sbodyLimit; it's route-wide, so single-event posts share the same ceiling (harmless — a single event is tiny).Batch retries aren't deduped. Batch envelopes are exempt from the request dedup hook, so an offline-first SDK retrying a whole buffered batch isn't silently dropped.
Shared dispatch. The single-event
switchlives in a shareddispatchEvent(with aneverexhaustiveness check) so single and batch paths can't drift.