Sort nodeset on demand #330
Open
tompng wants to merge 1 commit into
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Adjusts REXML XPath evaluation to return consistently ordered node-sets (via centralized sorting) and updates tests to handle XPath primitive results returned as single-element arrays.
Changes:
- Centralizes ordering by using `XPathParser.sort(...)` in `match` and some call sites, and makes `sort` a class method.
- Simplifies `step(...)` by always de-duplicating via identity `Set` and removing the `axis_order` parameter.
- Updates the Jaxen test helper to unwrap primitive XPath results before calling `REXML::Functions.string`.
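The unwrapping described in the last bullet can be sketched as follows. This is a minimal illustration, not the code from `test/test_jaxen.rb`: the helper name `unwrap_primitive` and the list of classes treated as primitives are assumptions made for the example.

```ruby
# Hypothetical helper: classes treated as XPath primitives in this sketch.
PRIMITIVE_CLASSES = [String, Numeric, TrueClass, FalseClass].freeze

# Returns the bare value when `result` is a one-element array holding a
# primitive; any other input is returned unchanged.
def unwrap_primitive(result)
  return result unless result.is_a?(Array) && result.size == 1

  value = result.first
  PRIMITIVE_CLASSES.any? { |klass| value.is_a?(klass) } ? value : result
end

number = unwrap_primitive([3.0])          # primitive wrapped in an array
text = unwrap_primitive(["x"])            # string result
node_like = unwrap_primitive([:node])     # non-primitive element is left wrapped
several = unwrap_primitive([1, 2])        # more than one element is left as-is
```

A helper of this shape lets a caller pass either form to a function that expects a bare primitive.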
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| test/test_jaxen.rb | Unwraps primitive XPath results (single-element arrays) to keep Functions.string calls compatible. |
| lib/rexml/xpath_parser.rb | Reworks node-set ordering and de-duplication; changes sort to a class method; removes reverse-axis handling in step. |
| lib/rexml/functions.rb | Sorts node-sets before iterating / stringifying to make results deterministic. |
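As an illustration of what sorting a node-set into document order involves, here is a small sketch on a toy tree. `TreeNode`, `order_key`, and `sort_in_document_order` are hypothetical names for this example and are not REXML classes or the implementation of `XPathParser.sort`.

```ruby
# Hypothetical tree node used only for this sketch.
class TreeNode
  attr_reader :name, :parent, :children

  def initialize(name, parent = nil)
    @name = name
    @parent = parent
    @children = []
    parent.children << self if parent
  end

  # Path of child indexes from the root down to this node.
  # The root's key is []; an ancestor's key is a prefix of its descendants' keys.
  def order_key
    key = []
    node = self
    while node.parent
      key.unshift(node.parent.children.index { |child| child.equal?(node) })
      node = node.parent
    end
    key
  end
end

# Arrays compare lexicographically, so sorting by the key gives document order.
def sort_in_document_order(nodes)
  nodes.sort_by(&:order_key)
end

root = TreeNode.new("a")
b1 = TreeNode.new("b1", root)
b2 = TreeNode.new("b2", root)
c = TreeNode.new("c", b1)

sorted_names = sort_in_document_order([b2, c, root, b1]).map(&:name)
```

Because each key is computed once per node, the comparison work is bounded by the sort itself rather than by repeated tree walks per comparison.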
Comments suppressed due to low confidence (1)
lib/rexml/xpath_parser.rb:1
`step` no longer sorts the merged node-set (it used to call `sort(...)` for the multi-nodeset case). Returning `nodes.to_a` makes ordering dependent on insertion order through `Set`, which can change predicate behavior where ordering is significant (e.g., later `[1]` filters or `position()`), and can introduce non-determinism across Ruby versions/Set behavior. Consider sorting the merged result before returning (and keep de-duplication by identity).
# frozen_string_literal: false
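The concern in this comment can be reproduced with a small sketch that uses only Ruby's `Set`. The `Node` struct and its `index` field are stand-ins invented for the example (`index` plays the role of document order); they are not REXML types.

```ruby
require "set"

# Hypothetical stand-in for a document node.
Node = Struct.new(:name, :index)

a = Node.new("a", 0)
b = Node.new("b", 1)
c = Node.new("c", 2)

# Results produced for two context nodes; merged naively they are out of order.
nodesets = [[c, a], [a, b]]

# De-duplicate by object identity, as the review describes for `step`.
merged = Set.new.compare_by_identity
nodesets.each { |nodeset| nodeset.each { |node| merged << node } }

unsorted = merged.to_a                  # insertion order
sorted = unsorted.sort_by(&:index)      # document order

first_unsorted = unsorted.first.name    # what a later [1] would pick without sorting
first_sorted = sorted.first.name        # what a later [1] would pick after sorting
```

The two `first_*` values differ, which is why a positional predicate applied after an unsorted merge can select a different node.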
Comment on lines 155 to 161

    result = expr(path_stack, nodeset)
    case result
    when Array # nodeset
      result.uniq
      XPathParser.sort(result)
    else
      [result]
    end
Comment on lines 236 to 240

    when :ancestor
      nodeset = step(path_stack, axis_order: :reverse) do
      nodeset = step(path_stack) do
        nodesets = []
        # new_nodes = {}
        nodeset.each do |node|
Comment on lines +87 to 89

    XPathParser.sort(node_set.to_a).each do |node|
      result << yield(node) if node.respond_to?(:namespace)
    end
Comment on lines +291 to 294

    nodeset = step(path_stack) do
      nodeset.map do |node|
        next unless node.respond_to?(:parent)
        next if node.parent.nil?
In most cases, sorting the nodeset is not needed. Sort is only required in:

Delay sorting and sort only when it is needed. This normally reduces the number of `sort` calls.

- `/a/b/c/d/e`
- `(a/b/c/d)[position()>1]/e/f/g`
- `number(/a/b/c/d/e)`
- `count(/a/b/c/d/e)`
- `//a//b//c//d//e`
- `/a[1]/b[1]/c[1]/d[1]/e`

In the last example, the number of `sort` calls increases because this PR drops the optimization that skips sorting when `nodesets.size == 1` and always sorts the final result.

To reduce sort calls further, we would need to mark nodeset ordering by introducing `Nodeset = Struct.new(:nodes, :order)`, but IMO it shouldn't be done now. If `sort` is optimized, one extra sort won't be a problem. Optimizing `step` will be harder, and the code may become complicated.

Note

#315 will remove one `nodesets.size == 1` optimization path. This pull request will reduce that performance regression. However, it slightly adds complexity and a risk of forgetting to sort the nodeset in some path.

The effect may seem drastic in some cases for now, but that is only because `sort` is currently O(n^2) in the worst case. We can improve `sort` performance, so there's an option to leave the sort strategy simple.
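For reference, the `Nodeset = Struct.new(:nodes, :order)` idea mentioned in the description might look roughly like the sketch below. Only the struct definition comes from the description; the method `in_document_order`, the order symbols `:document`, `:reverse`, and `:unknown`, and the use of integers as stand-in nodes are assumptions for illustration.

```ruby
# Sketch only: the description proposes the Struct, not this method or these symbols.
Nodeset = Struct.new(:nodes, :order) do
  # Returns the nodes in document order, sorting only when the order is unknown.
  # The block supplies a sort key and is called only on the slow path.
  def in_document_order(&key)
    case order
    when :document then nodes
    when :reverse  then nodes.reverse
    else                nodes.sort_by(&key)
    end
  end
end

# Integers stand in for nodes; the key lambda counts how often sorting work happens.
key_calls = 0
key = lambda do |node|
  key_calls += 1
  node
end

forward = Nodeset.new([1, 2, 3], :document).in_document_order(&key)
backward = Nodeset.new([3, 2, 1], :reverse).in_document_order(&key)
calls_before_unknown = key_calls   # no key calls yet: both paths skipped the sort

mixed = Nodeset.new([2, 3, 1], :unknown).in_document_order(&key)
```

The trade-off noted in the description still applies: every producer of a nodeset would have to set `order` correctly, which is the added complexity the author wants to avoid for now.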