Code Review
This pull request introduces new DPO (Direct Preference Optimization) training examples for both Tinker and Twinkle clients and adds multi-modal training references to the documentation. Existing self-cognition scripts and documentation have been updated to use batch_encode instead of encode for consistency. The documentation for Tinker and Twinkle clients has been significantly revised to provide more streamlined training loop examples. Review feedback focuses on improving performance by removing redundant manual tensor-to-list conversions, utilizing built-in .to_numpy() methods for efficiency, and ensuring consistent API usage regarding result retrieval and metric logging across all examples.
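For context on what the new DPO examples optimize, the per-pair objective can be sketched as below. This is the standard DPO formulation written in plain Python, not code taken from this PR; the function name, argument names, and the default `beta` are illustrative assumptions.

```python
import math


def dpo_loss(policy_chosen_logp: float,
             policy_rejected_logp: float,
             ref_chosen_logp: float,
             ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """Per-pair DPO loss: -log(sigmoid(beta * margin)).

    Hypothetical helper for illustration only; the cookbook scripts in this
    PR may compute the loss on the server side or through a different API.
    """
    # How much more the policy favors each response than the reference does.
    chosen_gain = policy_chosen_logp - ref_chosen_logp
    rejected_gain = policy_rejected_logp - ref_rejected_logp
    x = beta * (chosen_gain - rejected_gain)
    # -log(sigmoid(x)) == log(1 + exp(-x)), evaluated in a numerically stable form.
    return max(-x, 0.0) + math.log1p(math.exp(-abs(x)))


if __name__ == "__main__":
    # The policy prefers the chosen response more than the reference does,
    # so the loss falls below log(2), its value at zero margin.
    print(dpo_loss(-1.0, -3.0, -2.0, -2.0))
```

When the policy and reference agree exactly, the margin is zero and the loss equals log(2); training pushes the margin positive.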
Pull request overview
This PR updates Twinkle’s English documentation and cookbook examples to reflect recent API/example changes for client usage, including encoding helpers and new training recipes.
Changes:
- Update examples to use `Template.batch_encode(...)[0]` instead of `Template.encode(...)` in several docs/cookbook snippets.
- Refine Twinkle Client documentation and training example (imports, server capability listing, training loop, optimizer notes).
- Add new DPO training cookbook scripts for both Twinkle Client and Tinker-compatible client (ModelScope).
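The first change above amounts to wrapping a single sample in a list and indexing the first result. A minimal sketch of that call pattern follows; `StubTemplate` is a hypothetical stand-in written for this illustration, and the real Twinkle `Template.batch_encode` signature and return type may differ.

```python
from typing import Dict, List


class StubTemplate:
    """Hypothetical stand-in for illustration; NOT the real Twinkle Template."""

    def batch_encode(self, samples: List[Dict[str, str]]) -> List[Dict[str, List[int]]]:
        # Toy "tokenizer": one code point per character, one output per input sample.
        return [{"input_ids": [ord(ch) for ch in s["text"]]} for s in samples]


template = StubTemplate()
sample = {"text": "hi"}

# Before (per the PR summary): encoded = template.encode(sample)
# After: pass a one-element batch and take the first encoded item.
encoded = template.batch_encode([sample])[0]
print(encoded["input_ids"])
```

The trailing `[0]` keeps downstream code that expects a single encoded sample unchanged.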
Reviewed changes
Copilot reviewed 12 out of 12 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| docs/source_en/Usage Guide/Train-as-a-Service.md | Updates encoding example to use batch_encode and index the first item. |
| docs/source_en/Usage Guide/Server and Client/Twinkle-Client.md | Updates migration guidance and expands the end-to-end remote training example. |
| docs/source_en/Usage Guide/Server and Client/Tinker-Compatible-Client.md | Simplifies training example and updates inference sampling to use Template + batch_encode. |
| docs/source_en/Usage Guide/Server and Client/Overview.md | Refreshes the cookbook directory tree to match current scripts (including new DPO entries). |
| cookbook/client/twinkle/modelscope/dpo.py | Adds a Twinkle-client DPO training example (ModelScope). |
| cookbook/client/tinker/self_host/self_cognition.py | Updates eval encoding call to batch_encode. |
| cookbook/client/tinker/modelscope/self_cognition.py | Updates eval encoding call to batch_encode. |
| cookbook/client/tinker/modelscope/dpo.py | Adds a Tinker-compatible DPO training example (ModelScope), including optional SwanLab logging. |
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
PR type
PR information
Write the detailed information that belongs to this PR.
Experiment results
Paste your experiment results here (if needed).