Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance enhancements #372

Closed
wants to merge 7 commits into from
Closed

Performance enhancements #372

wants to merge 7 commits into from

Conversation

InnovativeInventor
Copy link
Member

@InnovativeInventor InnovativeInventor commented Jan 2, 2022

This is a work-in-progress, draft PR with some performance enhancements. In particular,

  • __slots__ is enabled for a few frequently-created classes (Assignment and SubgraphView), resulting in minor performance improvements
  • caching for iterating over the neighbors of all flips is implemented to improve flow and cut edge calculation runtimes
  • caching for iterating over the neighbors of a node in a graph is implemented (since networkx's neighbors function is quite expensive and a partition's graph will always remain static)
  • Graph.nodes and Graph.edges is cached

Combined, these speedups improve both standard GerryChain run performance and replay performance. In particular, the time it takes to run 100 steps on PA goes from 29.8 seconds to 26 seconds (~13% decrease) and the time spent on each loop when replaying goes from 610 ms (based on #363) to 469 ms (~23% decrease).

Some caveats:

  • I used functools.cache, which is a Python 3.9 feature. We should switch to functools.lru_cache or something with better compatibility with older versions of Python before we merge this in.
  • For caching reasons, GerryChain Partitions will no longer accept a networkx graph (to convert a networkx graph to a gerrychain.Graph object, a new method Graph.from_networkx is introduced in this PR). I had to rewrite the tests accordingly.

@InnovativeInventor
Copy link
Member Author

Note that the behavior of Graph.nodes and Graph.edges will not be backwards-compatible. It may be better to rename the cached versions to Graph._cached_nodes and Graph._cached_edges.

@InnovativeInventor
Copy link
Member Author

Some more benchmark stats with 11906d0 (replay perf on 1000-step chains, PA congressional):

The FrozenGraph class also opens the door to using faster read-only data structures (generated when the class is instantiated) behind the scenes without breaking backwards-compatibility.

@InnovativeInventor
Copy link
Member Author

Updated stats (wrote some better benchmarks):

- 247 it/s (my perf tuning + Parker's slots)
- 188 it/s (Parker's slots)
- 82 it/s (main branch)

@InnovativeInventor InnovativeInventor force-pushed the perf-tuning branch 4 times, most recently from 02e5d3c to d884a4d Compare January 5, 2022 03:33
@InnovativeInventor InnovativeInventor changed the title Perf tuning Performance enhancements Jan 5, 2022
@InnovativeInventor
Copy link
Member Author

Note: I will break this up into multiple PRs, so it'll be easier to review.

@InnovativeInventor
Copy link
Member Author

I realized that I was profiling #363 incorrectly -- the performance is now as follows:

  • 353 it/s (my perf tuning + Parker's slots)
  • 194 it/s (Parker's slots)
  • 189 it/s (main branch)

My bad -- looks like __slots__ doesn't help that much after all.

@InnovativeInventor InnovativeInventor force-pushed the perf-tuning branch 3 times, most recently from e063a39 to 05233ce Compare January 6, 2022 19:12
@codecov-commenter
Copy link

Codecov Report

Merging #372 (a318d62) into main (f9f73a5) will decrease coverage by 0.05%.
The diff coverage is 70.00%.

❗ Current head a318d62 differs from pull request most recent head 05233ce. Consider uploading reports for the commit 05233ce to get more accurate results
Impacted file tree graph

@@            Coverage Diff             @@
##             main     #372      +/-   ##
==========================================
- Coverage   87.89%   87.83%   -0.06%     
==========================================
  Files          37       37              
  Lines        1660     1669       +9     
==========================================
+ Hits         1459     1466       +7     
- Misses        201      203       +2     
Impacted Files Coverage Δ
gerrychain/graph/graph.py 85.95% <66.66%> (-0.78%) ⬇️
gerrychain/partition/assignment.py 96.00% <100.00%> (+0.05%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f9f73a5...05233ce. Read the comment docs.

@InnovativeInventor
Copy link
Member Author

Memory leak!:

@InnovativeInventor InnovativeInventor force-pushed the perf-tuning branch 2 times, most recently from 1d4a69e to e387ce4 Compare January 19, 2022 15:47
@InnovativeInventor
Copy link
Member Author

Reminder to self: Partition.crosses_parts caching increases performance (but the previous impl introduced a memory leak).

@InnovativeInventor InnovativeInventor force-pushed the perf-tuning branch 5 times, most recently from cc7be93 to db18c15 Compare January 19, 2022 16:19
InnovativeInventor and others added 3 commits January 19, 2022 12:52
Note that the "seed_and_freeze" test changed due to the way flows are
being cached. Hopefully there are no correctness issues introduced here.
@InnovativeInventor
Copy link
Member Author

Closing due to the successful merges of all the broken out PRs. The retworkx PR is still a work-in-progress, but will be ready soon.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants