improvement(enrichments): limit company-info to fields both providers return #4817
PR Summary
Medium Risk
Overview
Hunter runs first (free), with People Data Labs as fallback.
Reviewed by Cursor Bugbot for commit a1ff849.
Greptile Summary
This PR narrows the Company Info enrichment to only the two fields both Hunter and PDL reliably return (
Confidence Score: 3/5
The cascade runner stops at the first provider with any non-empty field, so a Hunter hit that has description but no size will win and leave employeeCount permanently blank, which is the same gap the PR aims to close. Additionally, changing employeeCount from number to string breaks any saved workflow that passes the output to numeric inputs or arithmetic. Both issues are in open review threads and remain unaddressed.
Two distinct defects on the changed path remain unresolved: the partial-hit cascade silently drops employeeCount for companies Hunter partially knows, and the number-to-string type change breaks existing workflow connections wired to numeric operations.
apps/sim/enrichments/company-info/company-info.ts: the cascade ordering, mapOutput logic, and output type declaration all warrant another look before merging.
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[Input: company domain] --> B[normalizeDomain]
B -->|empty string| Z[skip provider — return null params]
B -->|valid domain| C[Hunter: hunter_companies_find]
C -->|404| D[PDL: pdl_company_enrich]
C -->|error| E[errorCount++, try PDL]
C -->|success| F{mapOutput has any non-empty field?}
F -->|yes — Hunter wins| G[Return result: employeeCount, description]
F -->|no — both fields empty| D
E --> D
D -->|404 or skipped| H[Return empty result]
D -->|error| I[Return error if all providers errored]
D -->|success| J{mapOutput has any non-empty field?}
J -->|yes — PDL wins| G
J -->|no| H
Reviews (2): Last reviewed commit: "improvement(enrichments): limit company-..."
providers: [
  toolProvider({
    id: 'hunter',
    label: 'Hunter',
    toolId: 'hunter_companies_find',
    buildParams: (inputs) => {
      const domain = normalizeDomain(inputs.domain)
      if (!domain) return null
      return { domain }
    },
    mapOutput: (output) => {
      return filterUndefined({
        industry: str(output.industry) || undefined,
        employeeCount: str(output.size) || undefined,
        foundedYear: num(output.founded_year),
        description: str(output.description) || undefined,
      })
    },
  }),
Hunter partial-hit silently blocks PDL employee count
The cascade runner (run.ts:80) stops at the first provider whose mapOutput returns any non-empty field. If Hunter finds a company record but its response has no size field, Hunter still wins (because industry, foundedYear, or description satisfies hasResult), PDL is never attempted, and employeeCount stays blank — the same symptom the PR set out to fix, just narrowed to Hunter-known companies where size is absent. In that scenario the reorder actively regresses coverage relative to the previous PDL-first order.
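The partial-hit behaviour described above can be reproduced with a minimal sketch. Everything here is an assumption for illustration: the `Provider` type, `hasResult`, `runCascade`, and the stubbed provider data are hypothetical stand-ins, not the actual code in run.ts.

```typescript
// Hypothetical types; the real runner in run.ts may be shaped differently.
type Fields = Record<string, string | number | undefined>

interface Provider {
  id: string
  // Returns mapped fields, or null when the provider has no record (e.g. a 404).
  fetch: (domain: string) => Fields | null
}

// True when at least one mapped field is non-empty.
function hasResult(fields: Fields): boolean {
  return Object.values(fields).some((v) => v !== undefined && v !== '')
}

// First-non-empty-wins: stop at the first provider with any non-empty field.
function runCascade(
  providers: Provider[],
  domain: string
): { winner: string | null; fields: Fields } {
  for (const provider of providers) {
    const fields = provider.fetch(domain)
    if (fields && hasResult(fields)) {
      return { winner: provider.id, fields }
    }
  }
  return { winner: null, fields: {} }
}

// Stubbed, made-up data: Hunter knows the description but not the size.
const hunter: Provider = {
  id: 'hunter',
  fetch: () => ({ description: 'Example company', employeeCount: undefined }),
}
const pdl: Provider = {
  id: 'pdl',
  fetch: () => ({ description: 'Example company', employeeCount: '1200' }),
}

const result = runCascade([hunter, pdl], 'example.com')
console.log(result.winner, result.fields.employeeCount) // prints "hunter undefined"
```

In this sketch PDL is never called even though it holds the missing value. One possible remedy, not confirmed by the PR, would be to require specific fields before a provider counts as a hit, or to merge fields across providers.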
… return
Hunter's company dataset returns null industry/foundedYear for many large companies (verified against the live API for Microsoft, Amazon, Google), so under the first-non-empty-wins cascade those columns appeared inconsistently across rows.
Limit company-info outputs to employee count and description, the fields Hunter and PDL both reliably return, so every row is consistent. employeeCount is a string so Hunter's range bucket and PDL's exact count share the column.
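As a rough illustration of why employeeCount is declared as a string, the sketch below coerces both value shapes into one column. The helper name `toEmployeeCount` and the sample count `1200` are assumptions made for this example; only the `"11-50"` range bucket appears in the PR description.

```typescript
// Hypothetical helper: coerce a provider value to a trimmed string,
// or undefined when the value is missing or blank.
function toEmployeeCount(value: unknown): string | undefined {
  if (value === null || value === undefined) return undefined
  const text = String(value).trim()
  return text.length > 0 ? text : undefined
}

const fromHunter = toEmployeeCount('11-50') // range bucket, already a string
const fromPdl = toEmployeeCount(1200) // exact count, made-up value

console.log(fromHunter, fromPdl) // prints "11-50 1200"

// A range bucket has no single numeric value, so numeric consumers cannot use it directly.
console.log(Number.isNaN(Number(fromHunter))) // prints "true"
```

The last line shows the trade-off raised in review: a workflow that feeds this output into arithmetic would need its own parsing step once the column holds strings.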
af4a677 to a1ff849
Cursor Bugbot has reviewed your changes and found 1 potential issue.
@greptile review

Summary
industry and founded_year inconsistently across rows: Hunter's company dataset returns null for those fields on many large companies (verified against the live API for Microsoft, Amazon, Google), and the first-non-empty-wins cascade meant Hunter usually won before PDL could fill them.
employeeCount is a string so Hunter's range bucket (e.g. "11-50") and PDL's exact count share the same column.
Hunter (free) runs first; PDL is the paid fallback.
Type of Change
Testing
Tested manually against the live Hunter API to confirm the field gaps are real (not a mapping bug).
bun run lint clean, bun run check:api-validation:strict passed.
Checklist