Skip to content

Sort nodeset on demand#330

Open
tompng wants to merge 1 commit into
ruby:masterfrom
tompng:sort_on_demand
Open

Sort nodeset on demand#330
tompng wants to merge 1 commit into
ruby:masterfrom
tompng:sort_on_demand

Conversation

@tompng

@tompng tompng commented Jun 9, 2026

Copy link
Copy Markdown
Member

In most case, sorting nodeset is not needed. Sort is only required in:

  • Final result
  • Creating nodesets(each nodeset should be axis-ordered) from a single nodeset
    • Ideally, this can be skipped if the following predicate is not position-dependent
  • Nodeset passed to a function (first node in document order is used)

Delay sorting, only sort when it is needed. It will normally reduce the number of sort call.

XPath Number of sort (master) Number of sort (this PR)
/a/b/c/d/e 3 1
(a/b/c/d)[position()>1]/e/f/g 5 2
number(/a/b/c/d/e) 3 1
count(/a/b/c/d/e) 3 0
//a//b//c//d//e 8 1
/a[1]/b[1]/c[1]/d[1]/e 0 1

In the last example, sort call increases because this PR drops optimization of skipping sort when nodesets.size==1 and always sort the final result.

To reduce more sort calls, we need to mark nodeset ordering: introducing Nodeset = Struct.new(:nodes, :order)
but IMO, it shouldn't be done now. If sort is optimized, one extra sort won't be a problem. Optimizing step will be harder and the code may be complicated.

Note

#315 will remove one nodesets.size == 1 optimization path. This pull request will reduce the performance regression. But, this pull request will slightly add complexity and a risk to forgot sorting the nodeset in some path.
The effect may seem drastic in some case for now, but it's just because sort is currently worst O(n^2). We can improve sort performance, so there's an option to leave the sort strategy simple.

Copilot AI review requested due to automatic review settings June 9, 2026 17:06

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adjusts REXML XPath evaluation to return consistently ordered node-sets (via centralized sorting) and updates tests to handle XPath primitive results returned as single-element arrays.

Changes:

  • Centralizes ordering by using XPathParser.sort(...) in match and some call sites, and makes sort a class method.
  • Simplifies step(...) by always de-duplicating via identity Set and removing the axis_order parameter.
  • Updates the Jaxen test helper to unwrap primitive XPath results before calling REXML::Functions.string.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

File Description
test/test_jaxen.rb Unwraps primitive XPath results (single-element arrays) to keep Functions.string calls compatible.
lib/rexml/xpath_parser.rb Reworks node-set ordering and de-duplication; changes sort to a class method; removes reverse-axis handling in step.
lib/rexml/functions.rb Sorts node-sets before iterating / stringifying to make results deterministic.
Comments suppressed due to low confidence (1)

lib/rexml/xpath_parser.rb:1

  • step no longer sorts the merged node-set (it used to call sort(...) for the multi-nodeset case). Returning nodes.to_a makes ordering dependent on insertion order through Set, which can change predicate behavior where ordering is significant (e.g., later [1] filters or position()), and can introduce non-determinism across Ruby versions/Set behavior. Consider sorting the merged result before returning (and keep de-duplication by identity).
# frozen_string_literal: false

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread lib/rexml/xpath_parser.rb
Comment on lines 155 to 161
result = expr(path_stack, nodeset)
case result
when Array # nodeset
result.uniq
XPathParser.sort(result)
else
[result]
end
Comment thread lib/rexml/xpath_parser.rb
Comment on lines 236 to 240
when :ancestor
nodeset = step(path_stack, axis_order: :reverse) do
nodeset = step(path_stack) do
nodesets = []
# new_nodes = {}
nodeset.each do |node|
Comment thread lib/rexml/functions.rb
Comment on lines +87 to 89
XPathParser.sort(node_set.to_a).each do |node|
result << yield(node) if node.respond_to?(:namespace)
end
Comment thread lib/rexml/xpath_parser.rb
Comment on lines +291 to 294
nodeset = step(path_stack) do
nodeset.map do |node|
next unless node.respond_to?(:parent)
next if node.parent.nil?
@tompng tompng mentioned this pull request Jun 9, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants