Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Polygon in Rect performance improvements #1192

Merged
merged 4 commits into from
Jun 25, 2024
Merged

Conversation

michel-kraemer
Copy link
Contributor

@michel-kraemer michel-kraemer commented Jun 14, 2024

  • I agree to follow the project's code of conduct.
  • I added an entry to CHANGES.md if knowledge of this change could be valuable to users.

This PR improves the performance of the Polygon in Rect check:

Before:

Rect contains polygon   time:   [2.3345 µs 2.3514 µs 2.3662 µs]
                        change: [-1.3280% -0.6558% -0.0257%] (p = 0.05 < 0.05)
                        Change within noise threshold.

After:

Rect contains polygon   time:   [3.7123 ns 3.7174 ns 3.7237 ns]
                        change: [-99.841% -99.840% -99.839%] (p = 0.00 < 0.05)
                        Performance has improved.

I have an application where I need to perform this check millions of times and this small improvement shaves off several seconds for me.

I think the same can be done for the other geometry types (LineString in Rect, MultiPolygon in Rect, etc.), but I wanted to get your feedback on this one first.

@michaelkirk
Copy link
Member

michaelkirk commented Jun 14, 2024

I suspect you're right that some of our predicates could be optimized for Rects, but Contains is a notoriously tricky predicate.

A couple of cases that seem like they might be wrong with your implementation:

  1. What about an empty polygon? By definition at least one point of P needs to be in the interior of R. That seems easy enough to guard against.

  2. More tricky: What about something like P = R.to_polygon(). I'd expect your implementation to return false, because you're only checking the polygons vertices, and all the Polygon's vertices are in the boundary of R, with none in the interior of R.

  3. Edit: another degenerate case to watch out for would be a flat P, living in the boundary of R. I think currently you're handling this, but be sure not to break it if you find a way to address point 2.

@michel-kraemer
Copy link
Contributor Author

Thank you so much! This is exactly the kind of feedback I was hoping for! 👍

You're right that my first implementation was inconsistent to what relate does. I did not consider the empty case, and due to a typo in one of my unit tests, I was under the impression that geometries have to lie completely inside the interior of the rectangle.

I had a look at the DE-9IM again and AFAICT I've now implemented it correctly.

  1. None of the polygon's points may lie outside the rectangle (i.e. there must be no intersection between the rectangle's exterior and the polygon's boundary)
  2. The polygon must not completely lie inside the rectangle's boundary (i.e. the interiors of the rectangle and the polygon must intersect). Since we know that the rectangle is convex, it is enough to make sure that either at least one of the polygon's points lie inside the rectangle's interior or the polygon has an area != 0.

The new implementation is slightly slower than my first try, but still much faster than the original one:

Before:

Rect contains polygon   time:   [2.3159 µs 2.3249 µs 2.3345 µs]
                        change: [-1.3476% -0.8085% -0.3250%] (p = 0.00 < 0.05)
                        Change within noise threshold.

After:

Rect contains polygon   time:   [7.4262 ns 7.4409 ns 7.4631 ns]
                        change: [-99.679% -99.677% -99.676%] (p = 0.00 < 0.05)
                        Performance has improved.

Copy link
Member

@michaelkirk michaelkirk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you've done it, by jove.

I'm going to leave this open for a couple days in case anyone else wants to weigh in. This stuff is really tricky. But this is a big win!

@michaelkirk
Copy link
Member

michaelkirk commented Jun 15, 2024

I wonder if using p.bounding_rect().coords_iter() instead of p.exterior_coords would eke out a little more performance? Probably not much more, but maybe a little. It also might be a good way to generalize a solution for all the other geometries too.

edit: For the record, I'd be more than happy to merge this as is, but just food for thought if you're thinking about pursuing more work like this.

@michel-kraemer
Copy link
Contributor Author

michel-kraemer commented Jun 15, 2024

Thanks for the review. I've integrated your comments.

I wonder if using p.bounding_rect().coords_iter() instead of p.exterior_coords would eke out a little more performance? Probably not much more, but maybe a little. It also might be a good way to generalize a solution for all the other geometries too.

Maybe ... but then we'd have to go through the points again to check if they lie inside the rectangle's interior. The way it is now, we can just use one for loop.

edit: For the record, I'd be more than happy to merge this as is, but just food for thought if you're thinking about pursuing more work like this.

Yes, it would be fun to optimize this predicate for the other geometry types as well. If it's not too much trouble for you, I'd prefer if you could merge this PR first. I'll create a new PR for the other geometries in the next few weeks.

@urschrei
Copy link
Member

This is a delightful improvement!

@michaelkirk michaelkirk added this pull request to the merge queue Jun 25, 2024
Merged via the queue into georust:main with commit 0ee6b34 Jun 25, 2024
15 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants