-
Notifications
You must be signed in to change notification settings - Fork 205
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add "interior point" (a.k.a. "point on surface") implementations #870
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you very much for the high quality contribution - especially the documentation and tests!
I have some nits and questions, but am otherwise very on board with getting this into geo
.
fn interior_point(&self) -> Self::Output { | ||
match self.0.len() { | ||
0 => None, | ||
1 | 2 => Some(self.0[0].into()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it were a linestring of length 2, I'd think a point halfway between the endpoints would be a better interior point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would have been my first thought too, but neither JTS nor Geos do this; they only ever return already-present vertices from lines and linestrings. I think this is because the baseline contract we're trying to uphold here is:
if let Some(point) = geom.interior_point() {
assert!(geom.intersects(&point)); // and also .contains(&point) if any such point exists
}
and for calculated midpoints, because of floating point rounding errors, this often isn't true in practice.
Consider this test script:
use geo::{Coordinate, Line, Intersects, Centroid};
use rand::Rng;
fn main() {
let mut rng = rand::thread_rng();
for _i in 0..100 {
let start: Coordinate<f64> = (rng.gen(), rng.gen()).into();
let end: Coordinate<f64> = (rng.gen(), rng.gen()).into();
let line: Line<f64> = Line::new(start, end);
let centroid = line.centroid();
println!("{:?}", line.intersects(¢roid));
}
}
in my testing, this only prints true
about half the time.
All of which is to say: I think using actual vertices is indeed safer. FWIW, what JTS says they do is to use an interior point if any, or use whichever of the endpoints is closer to the centroid if not; this seemed sort of nonsensical to me because (modulo floating point weirdness again), they should be equidistant, so I'm just always using the start point. But if there's some situation where that doesn't do the right thing, I can calculate the distances I guess?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for explaining! What you have makes sense. Could you add a comment to that effect?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that I think about it, I suppose it's possible for this same issue to come up with degenerate flat/zero-area polygons that aren't horizontal or vertical, and it may be that perfectly correct behavior would need to detect that state of affairs and return vertices from the polygon instead of midpoints/intersection points. I don't think either Geos or JTS explicitly handle that case, though, so I'm probably not inclined to try either unless you feel strongly (I'm not sure it's something that could be cheaply detected).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added comments about lines
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually I lied; the polygon case isn't that hard to deal with either 😅 . Added a fallback to capture that case, and a new test.
.chain(std::iter::once(scan_line.clone())); | ||
|
||
let mut intersections: Vec<SweepPoint<T>> = Vec::new(); | ||
for (l1, l2, inter) in Intersections::from_iter(lines) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Intersections
was only added very recently. Nice to see it already being re-used!
geo/src/algorithm/interior_point.rs
Outdated
lines_iter::LinesIter, relate::Relate, | ||
}; | ||
use crate::geometry::*; | ||
use crate::sweep::*; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For clarity - can you be explicit about what's being used from sweep
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed 👍
geo/src/algorithm/interior_point.rs
Outdated
let mut intersections_iter = intersections.iter().peekable(); | ||
while let (Some(start), Some(end)) = (intersections_iter.next(), intersections_iter.peek()) { | ||
let length = end.x - start.x; | ||
let midpoint = Point::new((start.x + end.x) / two, start.y); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't yet grok the overall algorithm, but it doesn't seem like this "midpoint
" is not on the line between start
and end
because of the y-coord, which reads a little strange. Can you confirm this is correct?
edited: for egregious typo
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a better explanation of the algorithmic approach, with a rough sketch, to the PR body above ^. But the gist of it: we're intersecting a horizontal line with the edge lines of the polygon, and then iterating over all the points where that horizontal line and the edge lines intersect (we're calculating the full intersection set, but then throwing out all the intersection points where neither of the source geometries is our horizontal split line in the preceding loop). As such, all of these points should fall along the same horizontal line, so they should all have the same Y coordinate. To maybe make this clearer, I changed the spelling in this bit of the function to use the previously-calculated Y value instead of start.y
; this should be functionally identical, but maybe makes the intent clearer?
geo/src/algorithm/interior_point.rs
Outdated
type Output = Point<T>; | ||
|
||
fn interior_point(&self) -> Self::Output { | ||
self.to_polygon() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a big deal, but it seems like self.centroid()
would give an optimal result, marginally more efficiently. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, for some reason I was thinking there were some possible oddly shaped triangles that didn't contain their own centroids, but that isn't in fact the case 😅 . Made this change; note that there's a test I inherited from the centroid
test set (degenerate_triangle_like_ring
) that depended on triangle.interior_point()
and triangle.to_polygon().interior_point()
having the same value, which they now no longer do, so I tweaked that test a bit as well so it now asserts a weaker thing.
geo/src/algorithm/interior_point.rs
Outdated
vec![], | ||
); | ||
let multipoly = MultiPolygon::new(vec![p1, p2]); | ||
assert!(multipoly.intersects(&multipoly.interior_point().unwrap())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you intentionally omit an assert_eq in this example?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unintentional; fixed 👍
geo/src/algorithm/interior_point.rs
Outdated
pub trait InteriorPoint { | ||
type Output; | ||
|
||
/// See: <https://en.wikipedia.org/wiki/InteriorPoint> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This wiki page doesn't seem to exist. Is this aspirational? Or a broken link?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope, my mistake. I started with the centroid codebase as a skeleton and did some up-front search-and-replace (and there is a centroid page). Just changed this to a prose description.
|
||
fn interior_point(&self) -> Self::Output { | ||
if let Some(centroid) = self.centroid() { | ||
self.iter() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to JTS's documention, for mixed geometry collections JTS will return a point within the geometry with the highest dimensionality.
e.g. given GEOMETRYCOLLECTION(POINT(...), POLYGON(...))
The given point will always be within the Polygon (degenerate geometries perhaps notwithstanding).
I can imagine why that's useful. Is there a reason you diverged there?
We do have a HasDimensions
which might be useful in such an implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call, I just didn't catch this intent on the JTS side. Updated to adopt this behavior, and added a test to capture it (mixed_collection_test
).
Co-authored-by: Michael Kirk <michael.code@endoftheworl.de>
@michaelkirk thanks for the review! I think I've addressed this round of feedback above. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
I'll leave it open for a couple days before merging in case someone else wants to take a look.
Thanks for the contribution!
bors r+ |
870: Add "interior point" (a.k.a. "point on surface") implementations r=michaelkirk a=apendleton - [X] I agree to follow the project's [code of conduct](https://github.com/georust/geo/blob/main/CODE_OF_CONDUCT.md). - [x] I added an entry to `CHANGES.md` if knowledge of this change could be valuable to users. --- `geo` currently has centroid implementations, but as far as I could see, doesn't have an equivalent currently to what the Geos C API and PostGIS call "point on surface," or JTS and Geos C++ call an "interior point" -- a central-ish point that's guaranteed to intersect with the geometry (and will be strictly inside it if possible). This PR adds a trait for this concept, and implementations for the various geometry types. Implementations are inspired by the prose descriptions of how these mechanisms work in JTS (e.g. [here](https://locationtech.github.io/jts/javadoc/org/locationtech/jts/algorithm/InteriorPointArea.html) and [here](https://locationtech.github.io/jts/javadoc/org/locationtech/jts/algorithm/InteriorPointLine.html)) but not really by the actual code, so probably won't produce exactly the same results all the time. I mostly copied the `centroid` tests and updated their expected results where they differed (though I did add a couple of specific tests to exercise some code paths those didn't, and deleted some that didn't really make sense, e.g., related to weighting, which doesn't really happen in this implementation). I haven't contributed to this repo before, so let me know if this is something you'd actually want (I can publish separately if not), if there are code-style concerns, if this set of tests seems reasonable, etc. Also: I went with the JTS name of this concept because "point on surface" always felt a little funky when applied to, e.g., points, but I'm not strongly attached if there's a preference otherwise (Turf has yet another name for this thing: "point on feature"). ## Algorithmic approach for polygons For polygons, we calculate the bounding box of the polygon, calculate the y-direction midpoint, and use that to draw a line across the middle of the bounding area. Conceptually, we intersect that line with the polygon, which will produce one or more horizontal lines where the original line and the polygon overlap; we take the longest of these, and use its midpoint as the interior point. <img width="509" alt="image" src="https://user-images.githubusercontent.com/78930/176797513-3c30617c-3e8e-4376-87c1-f9abae6f5219.png"> In practice, we do this by calculating the intersection of this horizontal line with all of the polygon's edges, sorting these points left to right, then iterating over them and looking at the point pair formed by each point and its neighbor to the right, which will bound a line that is either inside (solid horizontal lines, above) or outside (dotted, above) the polygon; we can sort these descending by length, and the longest whose midpoint is inside the polygon is the winner. One additional tweak: there's a particular problematic corner case where one of the edges of the polygon is collinear with this horizontal middle line. If that happens, it's possible that the longest-intersection line, and consequently its midpoint, sits on the edge of the polygon instead of in its interior. In order for such an edge to exist, at least one vertex would need to exist with Y coordinate equal to our middle Y value, so to minimize the chance of this happening, we explicitly check up front if the Y value we've selected matches any of the vertices, and if it does, we find the Y value of the vertically-next-closest vertex to the split line, and average the two, and use that Y value for our split line; this should never be a Y value shared with any points unless the polygon is flat, in which case we're going to end up with an edge point no matter what. ## Algorithmic approach for lines and linestrings For lines and linestrings, we always return a vertex from the original line. If there are non-endpoint points, we return the one closest (in terms of Euclidean distance) from the linestring's centroid. If not, we return the first point. Co-authored-by: Andrew Pendleton <andrew.pendleton@sofarocean.com> Co-authored-by: Andrew Pendleton <abpend@gmail.com>
Build failed: |
@michaelkirk sorry for the noise, didn't realize clippy was part of the CI suite. Fixed if you want to run bors again. |
Bors retry |
870: Add "interior point" (a.k.a. "point on surface") implementations r=michaelkirk a=apendleton - [X] I agree to follow the project's [code of conduct](https://github.com/georust/geo/blob/main/CODE_OF_CONDUCT.md). - [x] I added an entry to `CHANGES.md` if knowledge of this change could be valuable to users. --- `geo` currently has centroid implementations, but as far as I could see, doesn't have an equivalent currently to what the Geos C API and PostGIS call "point on surface," or JTS and Geos C++ call an "interior point" -- a central-ish point that's guaranteed to intersect with the geometry (and will be strictly inside it if possible). This PR adds a trait for this concept, and implementations for the various geometry types. Implementations are inspired by the prose descriptions of how these mechanisms work in JTS (e.g. [here](https://locationtech.github.io/jts/javadoc/org/locationtech/jts/algorithm/InteriorPointArea.html) and [here](https://locationtech.github.io/jts/javadoc/org/locationtech/jts/algorithm/InteriorPointLine.html)) but not really by the actual code, so probably won't produce exactly the same results all the time. I mostly copied the `centroid` tests and updated their expected results where they differed (though I did add a couple of specific tests to exercise some code paths those didn't, and deleted some that didn't really make sense, e.g., related to weighting, which doesn't really happen in this implementation). I haven't contributed to this repo before, so let me know if this is something you'd actually want (I can publish separately if not), if there are code-style concerns, if this set of tests seems reasonable, etc. Also: I went with the JTS name of this concept because "point on surface" always felt a little funky when applied to, e.g., points, but I'm not strongly attached if there's a preference otherwise (Turf has yet another name for this thing: "point on feature"). ## Algorithmic approach for polygons For polygons, we calculate the bounding box of the polygon, calculate the y-direction midpoint, and use that to draw a line across the middle of the bounding area. Conceptually, we intersect that line with the polygon, which will produce one or more horizontal lines where the original line and the polygon overlap; we take the longest of these, and use its midpoint as the interior point. <img width="509" alt="image" src="https://user-images.githubusercontent.com/78930/176797513-3c30617c-3e8e-4376-87c1-f9abae6f5219.png"> In practice, we do this by calculating the intersection of this horizontal line with all of the polygon's edges, sorting these points left to right, then iterating over them and looking at the point pair formed by each point and its neighbor to the right, which will bound a line that is either inside (solid horizontal lines, above) or outside (dotted, above) the polygon; we can sort these descending by length, and the longest whose midpoint is inside the polygon is the winner. One additional tweak: there's a particular problematic corner case where one of the edges of the polygon is collinear with this horizontal middle line. If that happens, it's possible that the longest-intersection line, and consequently its midpoint, sits on the edge of the polygon instead of in its interior. In order for such an edge to exist, at least one vertex would need to exist with Y coordinate equal to our middle Y value, so to minimize the chance of this happening, we explicitly check up front if the Y value we've selected matches any of the vertices, and if it does, we find the Y value of the vertically-next-closest vertex to the split line, and average the two, and use that Y value for our split line; this should never be a Y value shared with any points unless the polygon is flat, in which case we're going to end up with an edge point no matter what. ## Algorithmic approach for lines and linestrings For lines and linestrings, we always return a vertex from the original line. If there are non-endpoint points, we return the one closest (in terms of Euclidean distance) from the linestring's centroid. If not, we return the first point. Co-authored-by: Andrew Pendleton <andrew.pendleton@sofarocean.com> Co-authored-by: Andrew Pendleton <abpend@gmail.com>
Build failed: |
... apparently clippy introduced a |
Bors retry |
Build succeeded: |
CHANGES.md
if knowledge of this change could be valuable to users.geo
currently has centroid implementations, but as far as I could see, doesn't have an equivalent currently to what the Geos C API and PostGIS call "point on surface," or JTS and Geos C++ call an "interior point" -- a central-ish point that's guaranteed to intersect with the geometry (and will be strictly inside it if possible).This PR adds a trait for this concept, and implementations for the various geometry types. Implementations are inspired by the prose descriptions of how these mechanisms work in JTS (e.g. here and here) but not really by the actual code, so probably won't produce exactly the same results all the time.
I mostly copied the
centroid
tests and updated their expected results where they differed (though I did add a couple of specific tests to exercise some code paths those didn't, and deleted some that didn't really make sense, e.g., related to weighting, which doesn't really happen in this implementation).I haven't contributed to this repo before, so let me know if this is something you'd actually want (I can publish separately if not), if there are code-style concerns, if this set of tests seems reasonable, etc. Also: I went with the JTS name of this concept because "point on surface" always felt a little funky when applied to, e.g., points, but I'm not strongly attached if there's a preference otherwise (Turf has yet another name for this thing: "point on feature").
Algorithmic approach for polygons
For polygons, we calculate the bounding box of the polygon, calculate the y-direction midpoint, and use that to draw a line across the middle of the bounding area. Conceptually, we intersect that line with the polygon, which will produce one or more horizontal lines where the original line and the polygon overlap; we take the longest of these, and use its midpoint as the interior point.
In practice, we do this by calculating the intersection of this horizontal line with all of the polygon's edges, sorting these points left to right, then iterating over them and looking at the point pair formed by each point and its neighbor to the right, which will bound a line that is either inside (solid horizontal lines, above) or outside (dotted, above) the polygon; we can sort these descending by length, and the longest whose midpoint is inside the polygon is the winner.
One additional tweak: there's a particular problematic corner case where one of the edges of the polygon is collinear with this horizontal middle line. If that happens, it's possible that the longest-intersection line, and consequently its midpoint, sits on the edge of the polygon instead of in its interior. In order for such an edge to exist, at least one vertex would need to exist with Y coordinate equal to our middle Y value, so to minimize the chance of this happening, we explicitly check up front if the Y value we've selected matches any of the vertices, and if it does, we find the Y value of the vertically-next-closest vertex to the split line, and average the two, and use that Y value for our split line; this should never be a Y value shared with any points unless the polygon is flat, in which case we're going to end up with an edge point no matter what.
Algorithmic approach for lines and linestrings
For lines and linestrings, we always return a vertex from the original line. If there are non-endpoint points, we return the one closest (in terms of Euclidean distance) from the linestring's centroid. If not, we return the first point.