-
Notifications
You must be signed in to change notification settings - Fork 122
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Arbitrary saving location of child catalogs #1116
Comments
I'd agree that it should be possible to allow that multiple entities can refer to a catalog. So something like |
Related: radiantearth/stac-spec#1230 |
PR is at #1117 |
Sorry, I'm reopening here because #1117 actually doesn't fully solve the issue. It helps with the parent links, but the storage location is still showing unexpected results (i.e. it depends on the order of the root.add_child calls). More specifically, I'd assume that this code: import pystac
root = pystac.Catalog("root", "root")
left = pystac.Catalog("left", "left")
right = pystac.Catalog("right", "right")
root.add_child(left)
root.add_child(right)
child = pystac.Catalog("child", "child")
left.add_child(child)
right.add_child(child, keep_parent=True)
root.normalize_and_save("out_test", pystac.CatalogType.ABSOLUTE_PUBLISHED) Stores the child in left (as the parent is left), but it stores it in right. I assume the serialization goes from "root" to the items and whichever catalog lists the child catalog first, in that location it is stored. |
I've looked into the code a bit and it looks difficult ;-) Some findings:
I guess both issues need to be solved for this, but I'm not sure yet how to do this. Edit: I started an attempt in PR #1118. It fixes the issue discussed here, but as a couple of tests fail on my machine also on main it's hard to further work on it as I need to rely on the CI. I'm not sure how to proceed. If someone has an idea, I'd be very happy. :-) |
left.add_child(child)
right.add_child(child, keep_parent=True) I'm starting to wonder if we're outrunning the spec here. My preferred solution would be to disallow multiple parents (radiantearth/stac-spec#1230), and then throw an error on the second |
I don't think we do. Isn't that exactly what browsable encourages (in the API world)? And isn't it also a common usecase do combine multiple data sources into one catalog? |
It seems strange to me to have a child link in one direction, but no parent link back. That feels very un-browseable. But maybe I'm misunderstanding the use case. ---
title: This seems bad
---
flowchart TD
A <-- child+parent --> C
B -- child --> C
|
Yes, but then we are looking at multiple parents again, see radiantearth/stac-spec#1230. I think, the example mentioned in radiantearth/stac-spec#1230 (comment) is a very valid use case for a static catalog as it makes the data easier to find by users (imaging hundreds of collections) by providing different access paths. And I thought people are actually encouraged to do this (see Browsable Extension). At least @cholmes always tried to push people towards browsable catalogs and multiple access paths. In an API it's simpler of course, as you can update the parent depending on where you are coming from, but in static it's impossible. And in an API you may also argue that it's strange to have an item with an id have different (parent) links depending on the access path. You won't have one source of truth then. So if you are saying that this is invalid, we'd basically say Browsable should be removed as an extension as this won't work then (assuming we don't want to duplicate data). By the way, I've seen this approach with links to one entity in more implementations (e.g. FedEO), but the issue never came up until now because they are all APIs and we are obviously the first to try this for static catalogs with PySTAC. The structure for us is set, but maybe PySTAC is not the right tool for us? |
Not saying this is invalid, just saying that it scans strange to me. But I see the use case and it seems reasonable enough.
Nah, it should be simple enough to add a child w/o setting the |
…arent override of href (#1155) * Don't override parent link when determining hrefs #1116 * Add an attribute to the class to mark if it is an orphan * Use @m-mohr's suggestion for when to set `self_href` * Add test based on example * Change `keep_parent` to `set_parent` * Add note about self_link to docstrings --------- Co-authored-by: Matthias Mohr <webmaster@mamo-net.de>
It is hard to control the actual storage location of child catalogs when they are child of more than one catalog. Consider the following example:
Whether the "child" catalog is saved under
left/child
orright/child
depends on the order "left" and "right" are added to "root" (right/child
in this case, as "right" was added after "left").Additionally, when I change the order of the catalog addition:
Then there is no parent link from "child" to "right" at all, even when this should be the case as the
add_child
on "right" was called last:The text was updated successfully, but these errors were encountered: