fix(langchain): prefer structured tool inputs #1719

Merged
hassiebp merged 1 commit into langfuse:main from vismaytiwari:fix-langchain-tool-structured-inputs
Jun 23, 2026

@vismaytiwari commented Jun 22, 2026

Contributor

What does this PR do?

Fixes LangChain tool observations storing the stringified tool input when LangChain also provides structured tool inputs.

For structured tools, LangChain passes both input_str and the original structured payload in kwargs["inputs"] to on_tool_start. The callback handler previously always used input_str, which can be a Python str(dict) representation. For complex tool inputs, such as multiline content with quotes or JSON-like text, that representation is not reliable JSON and can show up poorly in Langfuse.

This updates on_tool_start to prefer kwargs["inputs"] when it is available, while keeping the existing input_str fallback for callbacks that do not provide structured inputs.
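
A minimal sketch of the selection rule described above, kept separate from the real handler. The helper name `select_tool_input` is hypothetical and is not part of the Langfuse SDK; it only mirrors the "prefer `kwargs["inputs"]`, otherwise fall back to `input_str`" behavior.

```python
from typing import Any


def select_tool_input(input_str: str, **kwargs: Any) -> Any:
    """Hypothetical helper: pick the value to record as the tool input.

    Prefers the structured payload in kwargs["inputs"] and falls back to
    the stringified input when no structured payload was provided.
    """
    tool_input = kwargs.get("inputs")
    if tool_input is None:
        # Callbacks that do not pass structured inputs keep the old behavior.
        tool_input = input_str
    return tool_input


structured = {"path": "/tmp/example.md", "content": "hello"}

# Structured inputs win when present.
print(select_tool_input(str(structured), inputs=structured))

# Without structured inputs, the string is used as before.
print(select_tool_input("plain string input"))
```

Checking `is None` rather than truthiness means an empty dict passed as `inputs` is still treated as a structured payload in this sketch.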

Fixes langfuse/langfuse#14026

Type of change

  • Bug fix
  • New feature
  • Breaking change
  • Refactor
  • Documentation update
  • Tooling, CI, or repo maintenance

Verification

List the main commands you ran:

uv run --frozen pytest tests/unit/test_langchain.py::test_tool_start_prefers_structured_inputs_when_available
uv run --frozen pytest tests/unit/test_langchain.py
uv run --frozen ruff check langfuse/langchain/CallbackHandler.py tests/unit/test_langchain.py
uv run --frozen ruff format --check langfuse/langchain/CallbackHandler.py tests/unit/test_langchain.py
uv run --frozen mypy langfuse --no-error-summary
uv run --frozen ruff check .
uv run --frozen pytest -n auto --dist worksteal tests/unit

I also reproduced the issue locally before the fix with an in-memory exporter. Before this change, the exported tool observation input was a Python dict string and failed JSON parsing:

RAW_INPUT: {'path': '/tmp/example.md', 'content': '# Title\n\nThis has \'quotes\' and JSON-like text: {"a": 1}'}
PARSE_ERROR: JSONDecodeError Expecting property name enclosed in double quotes

After the change, the same repro exports structured JSON and parses back to the original input payload:

MATCHES_STRUCTURED_INPUTS: True
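
The failure mode can be illustrated with the standard library alone. This is a sketch of why `str(dict)` is not reliable JSON, not the in-memory exporter repro itself; the helper `parses_as_json` is a hypothetical name used only for this example.

```python
import json

payload = {
    "path": "/tmp/example.md",
    "content": "# Title\n\nThis has 'quotes' and JSON-like text: {\"a\": 1}",
}


def parses_as_json(text: str) -> bool:
    """Return True if the text can be decoded as JSON."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


# str(dict) produces a Python repr with single-quoted keys, which is not JSON.
stringified = str(payload)
print("str(dict) parses as JSON:", parses_as_json(stringified))

# Serializing the structured payload round-trips to the original dict.
serialized = json.dumps(payload)
print("round-trip matches:", json.loads(serialized) == payload)
```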

Full unit result:

549 passed, 2 skipped

Checklist

  • I self-reviewed the diff using code_review.md.
  • I added or updated tests for behavior changes.
  • I updated docs, examples, or .env.template if needed — N/A, this is covered by the LangChain callback unit test and does not change setup or public examples.
  • I did not hand-edit generated files; if generated files changed, I used the upstream regeneration path.
  • I did not commit secrets or credentials.

Greptile Summary

This PR fixes on_tool_start in the LangChain callback handler to prefer the structured kwargs["inputs"] dict over the Python-stringified input_str when recording tool observation inputs, with a fallback to input_str when no structured input is provided.

  • The three-line change in CallbackHandler.py picks up kwargs["inputs"] first and only falls back to input_str, preventing unreliable str(dict) representations (with single quotes, newlines, or embedded JSON) from reaching Langfuse.
  • A new unit test in test_langchain.py covers the prefer-structured-inputs path end-to-end using a payload that would have failed JSON parsing under the old behavior.

Confidence Score: 4/5

The change is a safe, minimal addition with a correct fallback; the only rough edge is that structured inputs end up stored redundantly in metadata, which doesn't break anything but creates noise in observations.

The fix correctly resolves a real serialisation problem and the new test covers the targeted scenario well. The one gap is that kwargs["inputs"] continues to be swept into metadata by the unconditional meta.update(kwargs) loop that already existed, so structured inputs will appear twice in every affected observation — once as input and once as metadata["inputs"]. This is a minor quality issue rather than a correctness bug, and it does not affect the main goal of the PR.

The meta.update block in langfuse/langchain/CallbackHandler.py is worth revisiting to exclude the inputs key when it is promoted to the primary input field.

Sequence Diagram

%%{init: {'theme': 'neutral'}}%%
sequenceDiagram
    participant LC as LangChain Runtime
    participant CB as CallbackHandler
    participant LF as Langfuse Observation

    LC->>CB: "on_tool_start(serialized, input_str, inputs=structured_dict)"
    CB->>CB: "tool_input = kwargs["inputs"] (structured dict)"
    note over CB: fallback to input_str if inputs absent
    CB->>CB: "meta = build_metadata(kwargs excl. inputs)"
    CB->>LF: "start_observation(input=tool_input, metadata=meta)"
    LC->>CB: on_tool_end(output)
    CB->>LF: "update_observation(output=output)"
Prompt To Fix All With AI
Fix the following 1 code review issue. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 1
langfuse/langchain/CallbackHandler.py:974-983
When `kwargs["inputs"]` is present it gets stored twice in the observation: once as the primary `input` field (via `tool_input`) and again as `metadata["inputs"]` via the unconditional `meta.update(kwargs)` at line 974. This means every tool run with structured inputs will have a redundant copy of the payload in metadata, wasting storage and making the observation noisier. The `inputs` key should be excluded from the metadata when it is used as the primary input.

```suggestion
            tool_input = kwargs.get("inputs")
            if tool_input is None:
                tool_input = input_str

            meta.update(
                {
                    key: value
                    for key, value in kwargs.items()
                    if value is not None and key != "inputs"
                }
            )

            observation_type = self._get_observation_type_from_serialized(
                serialized, "tool", **kwargs
            )
```

Reviews (1): Last reviewed commit: "fix(langchain): prefer structured tool i..."

@vismaytiwari vismaytiwari marked this pull request as ready for review June 22, 2026 13:18

@claude claude Bot left a comment

Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.

@hassiebp hassiebp left a comment

Collaborator

Thanks for your contribution!

@hassiebp hassiebp merged commit 2791fda into langfuse:main Jun 23, 2026
1 check passed

Development

Successfully merging this pull request may close these issues.

bug(sdk-python): LangChain tool observations should use structured inputs when available

2 participants