[SPARK-35357][GRAPHX] Allow to turn off the normalization applied by static PageRank utilities#32485
ebonnal wants to merge 3 commits into apache:master
…nk with a 'normalized' parameter to trigger or not the normalization
ok to test
Kubernetes integration test starting
Kubernetes integration test status failure
Test build #138334 has finished for PR 32485 at commit
I think it's fine. cc @srowen FYI
srowen left a comment
Looks OK, only one tiny comment about 'since'
graphx/src/test/scala/org/apache/spark/graphx/lib/PageRankSuite.scala
Test build #138375 has finished for PR 32485 at commit
Thank you @Ayushsunny @HyukjinKwon @srowen for the review 🙏.
Kubernetes integration test starting
Kubernetes integration test status failure
Merged to master
What changes were proposed in this pull request?
Overload the methods `PageRank.runWithOptions` and `PageRank.runWithOptionsWithPreviousPageRank` (so as not to break any user-facing signature) with a `normalized` parameter that describes "whether or not to normalize the rank sum".
Why are the changes needed?
https://issues.apache.org/jira/browse/SPARK-35357
When a graph contains a non-negligible proportion of sinks, algorithms based on incremental updates of ranks can get a precision gain for free if they are allowed to manipulate non-normalized ranks.
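To illustrate the point, here is a minimal plain-Scala sketch of a static PageRank update on a toy graph with one sink. It is not the GraphX implementation: the object name `SinkRankSketch`, the three-vertex graph, the iteration counts, and the update rule `resetProb + (1 - resetProb) * incoming` are assumptions chosen for illustration. Under those assumptions, resuming from non-normalized ranks reproduces an uninterrupted run, while resuming from normalized ranks does not.

```scala
object SinkRankSketch {
  // Toy directed graph (hypothetical): vertex 2 has no outgoing edge, so it is a sink.
  val edges: Seq[(Int, Int)] = Seq(0 -> 1, 1 -> 0, 1 -> 2)
  val numVertices = 3
  val resetProb = 0.15

  private val outDegree: Map[Int, Int] =
    edges.groupBy(_._1).map { case (src, es) => src -> es.size }

  // One update: each vertex gets resetProb plus the damped sum of the
  // rank shares sent along its incoming edges.
  def step(ranks: Vector[Double]): Vector[Double] = {
    val incoming = Array.fill(numVertices)(0.0)
    for ((src, dst) <- edges) incoming(dst) += ranks(src) / outDegree(src)
    Vector.tabulate(numVertices)(v => resetProb + (1 - resetProb) * incoming(v))
  }

  def iterate(ranks: Vector[Double], n: Int): Vector[Double] =
    (1 to n).foldLeft(ranks)((r, _) => step(r))

  // Rescale the ranks so that they sum to the number of vertices.
  def normalize(ranks: Vector[Double]): Vector[Double] = {
    val factor = numVertices / ranks.sum
    ranks.map(_ * factor)
  }

  val start: Vector[Double] = Vector.fill(numVertices)(1.0)

  // Uninterrupted run: 4 iterations, normalized once at the end.
  val reference: Vector[Double] = normalize(iterate(start, 4))

  // Incremental run that keeps the raw ranks between the two batches.
  val resumedRaw: Vector[Double] = normalize(iterate(iterate(start, 2), 2))

  // Incremental run that normalizes between the two batches.
  val resumedNormalized: Vector[Double] =
    normalize(iterate(normalize(iterate(start, 2)), 2))

  // Largest per-vertex gap between two rank vectors.
  def maxDiff(a: Vector[Double], b: Vector[Double]): Double =
    a.zip(b).map { case (x, y) => math.abs(x - y) }.max
}
```

Because the sink absorbs rank without passing it on, the raw ranks sum to less than the number of vertices, so the intermediate rescaling changes the balance between the constant reset term and the propagated term. In this sketch `maxDiff(reference, resumedRaw)` is zero, whereas `maxDiff(reference, resumedNormalized)` is not.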
Does this PR introduce any user-facing change?
No
How was this patch tested?
By adding a unit test that verifies that, even for a graph containing a sink, we end up with the same result in both of these scenarios:
a)
- `PageRank.runWithOptions` with normalization enabled
b)
- `PageRank.runWithOptions` with normalization disabled
- resume from `preRankGraph1` and run 2 more iterations using `PageRank.runWithOptionsWithPreviousPageRank` with normalization disabled
- resume from `preRankGraph2` and run 2 more iterations using `PageRank.runWithOptionsWithPreviousPageRank` with normalization enabled
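The two scenarios could be compared roughly as in the sketch below. This is not the test added by the PR. The object name `NormalizedFlagSketch`, the helper `maxRankDiff`, the three-vertex graph, and the iteration counts (6 in one call versus 2 + 2 + 2) are assumptions for illustration. The positional argument order of the overloads (`graph, numIter, resetProb, srcId, normalized`, then `preRankGraph`) is also assumed and should be checked against the merged signatures.

```scala
import org.apache.spark.SparkContext
import org.apache.spark.graphx.{Edge, Graph}
import org.apache.spark.graphx.lib.PageRank

object NormalizedFlagSketch {
  // Hypothetical helper: largest absolute per-vertex rank difference.
  def maxRankDiff(a: Graph[Double, Double], b: Graph[Double, Double]): Double =
    a.vertices.join(b.vertices)
      .map { case (_, (ra, rb)) => math.abs(ra - rb) }
      .max()

  def compare(sc: SparkContext): Double = {
    // Vertex 3 has no outgoing edge, so it is a sink.
    val edges = sc.parallelize(Seq(Edge(1L, 2L, 1), Edge(2L, 1L, 1), Edge(2L, 3L, 1)))
    val graph = Graph.fromEdges(edges, 0)
    val resetProb = 0.15

    // a) all iterations in a single call, normalization enabled
    val oneShot = PageRank.runWithOptions(graph, 6, resetProb, None, true)

    // b) the same total number of iterations in three batches,
    //    normalizing only in the last batch
    val preRankGraph1 = PageRank.runWithOptions(graph, 2, resetProb, None, false)
    val preRankGraph2 = PageRank.runWithOptionsWithPreviousPageRank(
      graph, 2, resetProb, None, false, preRankGraph1)
    val resumed = PageRank.runWithOptionsWithPreviousPageRank(
      graph, 2, resetProb, None, true, preRankGraph2)

    maxRankDiff(oneShot, resumed)
  }
}
```

If the overloads behave as described in this PR, the value returned by `compare` should be close to zero, up to floating-point noise.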