Reduce hashing during v2 transitive graph walks#4109
Merged
stuhood merged 2 commits intoDec 1, 2016
Merged
Conversation
…d a faster hashcode impl for HydratedTarget.
JieGhost
approved these changes
Dec 1, 2016
JieGhost
left a comment
Contributor
There was a problem hiding this comment.
Looks good! Remind me of a similar change I made in fs.py. What I found there is same to your finding, ie, using set is much faster than OrderedSet.
lenucksi
pushed a commit
to lenucksi/pants
that referenced
this pull request
Apr 25, 2017
### Problem Transitive graph walks in the v2 engine involve doing lots of deduping merges of collections of `HydratedTarget` objects. Pre-change, this was taking up about 70% of the total runtime of `./pants --enable-v2-engine list $target` for a highly connected `$target`. ### Solution OrderedSet is significantly slower for individual dedupe operations (particularly when it is converted back into a tuple afterward), so we switch to deduping a generator using a throwaway set and collecting it into a tuple. Additionally, because `HydratedTarget` objects are guaranteed to have an Address, we implement equality/hash checks using an Address lifted from the inner structs. ### Result The runtime of `./pants --enable-v2-engine list $target` is improved by approximately 2x.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Problem
Transitive graph walks in the v2 engine involve doing lots of deduping merges of collections of
HydratedTargetobjects. Pre-change, this was taking up about 70% of the total runtime of./pants --enable-v2-engine list $targetfor a highly connected$target.Solution
OrderedSet is significantly slower for individual dedupe operations (particularly when it is converted back into a tuple afterward), so we switch to deduping a generator using a throwaway set and collecting it into a tuple.
Additionally, because
HydratedTargetobjects are guaranteed to have an Address, we implement equality/hash checks using an Address lifted from the inner structs.Result
The runtime of
./pants --enable-v2-engine list $targetis improved by approximately 2x.