test: expand real-world JSON corpus with production API samples #158
Add 3 new fixtures to improve coverage of common production API patterns:
- citm_catalog.json (1.7MB): simdjson benchmark, event ticketing with deep nesting and unicode
- k8s_openapi.json (924KB): Kubernetes OpenAPI spec, $ref-heavy schema
- github_prs.json (295KB): GitHub REST API PR responses, many optional fields

All fixtures added to manifest.json with appropriate checks and CI gates. Document fixture sources and licenses in README.md.

Closes #152
📝 Walkthrough
Three new production API test fixtures are added to expand real-world JSON corpus coverage.
Changes: Test Fixture Expansion
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
🚥 Pre-merge checks: ✅ 5 passed
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@tests/fixtures/manifest.json`:
- Around line 155-169: The manifest entry for dataset id "github_prs" contains a
real user identifier at check path "[0].user.login"; update the underlying
fixture data (github_prs.json) to redact any user/account fields (e.g., replace
"membphis" with a neutral placeholder like "redacted_user_0") and then update
the corresponding check in the checks array (the check with "path":
"[0].user.login") to expect the new sanitized value; also scan the same fixture
for other user/account fields and sanitize them and their checks similarly so no
real PII remains.
In `@tests/fixtures/README.md`:
- Line 117: Update the README entry for the fixture named `github_prs.json` to
state the concrete redistribution basis instead of the vague phrase "Public API
response, no PII": either replace that cell with the specific license/terms that
permit storing/redistributing the captured GitHub REST API v3 response (e.g.,
GitHub Terms of Service section X, or an explicit CC/BSD-like license applied to
the fixture) or remove/replace `github_prs.json` with a fixture that has clear
redistribution rights; ensure the README row for `github_prs.json` references
the exact terms or the alternative fixture name so reviewers can verify legal
permissibility.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: f0651692-a29f-4657-9e60-1db845badbde
⛔ Files ignored due to path filters (1)
- Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (4)
- tests/fixtures/README.md
- tests/fixtures/data/github_prs.json
- tests/fixtures/data/k8s_openapi.json
- tests/fixtures/manifest.json
"id": "github_prs",
"path": "tests/fixtures/data/github_prs.json",
"source": "GitHub REST API v3 (public repo, MIT)",
"payload_type": "rest_api",
"format": "json",
"size_bytes": 294616,
"structural_density": "medium",
"workloads": ["parse_access", "decode_access"],
"ci": ["pr", "scheduled"],
"checks": [
  { "path": "", "type": "array", "len": 15 },
  { "path": "[0].number", "type": "number", "value": 157 },
  { "path": "[0].state", "type": "string", "value": "closed" },
  { "path": "[0].user.login", "type": "string", "value": "membphis" },
  { "path": "[0].base.repo.full_name", "type": "string", "value": "api7/lua-qjson" }
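For readers unfamiliar with the manifest format, the checks in this excerpt pair a path into the parsed fixture with an expected type and, optionally, a `value` or `len`. The following is a minimal Python sketch of how such a check might be evaluated. It is not the project's harness (the `manifest_fixtures` test is run through cargo and is not shown here). The path grammar is inferred only from the excerpt above, and the names `resolve` and `run_check` are hypothetical.

```python
import re

# Hypothetical helpers: the path grammar ("[0].user.login", "" for the root)
# is an assumption inferred from the manifest excerpt, not a documented spec.
_TOKEN = re.compile(r"\[(\d+)\]|([^.\[\]]+)")

# Assumed mapping from manifest type names to Python types.
_TYPES = {
    "array": list,
    "object": dict,
    "string": str,
    "number": (int, float),
    "boolean": bool,
}


def resolve(doc, path):
    """Walk a path such as "[0].user.login"; an empty path returns the root."""
    node = doc
    for index, key in _TOKEN.findall(path):
        node = node[int(index)] if index else node[key]
    return node


def run_check(doc, check):
    """Return True if the value at check["path"] matches type, len and value."""
    node = resolve(doc, check["path"])
    if not isinstance(node, _TYPES[check["type"]]):
        return False
    # bool is a subclass of int in Python, so reject it for "number".
    if check["type"] == "number" and isinstance(node, bool):
        return False
    if "len" in check and len(node) != check["len"]:
        return False
    if "value" in check and node != check["value"]:
        return False
    return True
```

Under these assumptions, a fixture loaded with `json.load` would pass a check when `run_check(doc, check)` returns `True`. A missing key or index raises `KeyError` or `IndexError` instead of returning `False`, which a real harness would probably want to report as a failed check.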
Sanitize the GitHub user identifiers before checking this fixture in.
Line 168 hard-codes a real user.login value ("membphis"), so this corpus now contains a public user identifier. That conflicts with issue #152's requirement to sanitize PII and makes the fixture unsuitable as a “no PII” sample. Please redact user/account fields in tests/fixtures/data/github_prs.json and update the checks to match the sanitized values.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tests/fixtures/manifest.json` around lines 155 - 169, The manifest entry for
dataset id "github_prs" contains a real user identifier at check path
"[0].user.login"; update the underlying fixture data (github_prs.json) to redact
any user/account fields (e.g., replace "membphis" with a neutral placeholder
like "redacted_user_0") and then update the corresponding check in the checks
array (the check with "path": "[0].user.login") to expect the new sanitized
value; also scan the same fixture for other user/account fields and sanitize
them and their checks similarly so no real PII remains.
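The redaction requested above could be sketched as follows. This is an illustrative Python helper, not code from the PR. The function name `redact_logins` is hypothetical, and treating `login` as the only field to rewrite is an assumption. The `redacted_user_N` placeholder pattern follows the `redacted_user_0` example in the review comment.

```python
def redact_logins(doc, field="login", prefix="redacted_user_"):
    """Replace every string stored under `field` with a stable placeholder.

    The document is modified in place. The returned dict maps each original
    value to its placeholder so that manifest checks can be updated to match.
    """
    mapping = {}  # original login -> placeholder; repeated users stay consistent

    def walk(node):
        if isinstance(node, dict):
            for key, value in node.items():
                if key == field and isinstance(value, str):
                    # Reuse the placeholder if this login was already seen.
                    node[key] = mapping.setdefault(value, f"{prefix}{len(mapping)}")
                else:
                    walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)

    walk(doc)
    return mapping
```

Logins are numbered in the order they are first encountered, so the first user in the file becomes `redacted_user_0`. GitHub user objects typically carry the login in other fields too, such as `html_url` and `avatar_url`, so a full sanitization pass would need to cover those as well. This sketch does not.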
Address review feedback: replace vague "Public API response, no PII" with explicit reference to GitHub Terms of Service.
Summary
Expand fixture coverage to include common production API response patterns, improving confidence in real-world compatibility.
Changes
Add 3 new fixtures representing distinct API patterns:
- citm_catalog.json
- k8s_openapi.json: $ref-heavy, recursive structures
- github_prs.json
Acceptance Criteria
- tests/fixtures/manifest.json with appropriate checks
- tests/fixtures/README.md
Testing
cargo test --release --test manifest_fixtures
Closes #152