Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use root StacIO in Link.resolve_stac_object when possible #742

Closed
duckontheweb opened this issue Feb 2, 2022 · 11 comments · Fixed by #762
Closed

Use root StacIO in Link.resolve_stac_object when possible #742

duckontheweb opened this issue Feb 2, 2022 · 11 comments · Fixed by #762

Comments

@duckontheweb
Copy link
Contributor

duckontheweb commented Feb 2, 2022

This came up in the context of working with links in pystac-client, but I am opening this here since the potential fix would need to happen in PySTAC.

Use-case

The following code creates a pystac_client.Client instance that will pass a key query parameter containing an API key with every request:

from pystac_client import Client

client = Client.open("https://api.radiant.earth/mlhub/v1", parameters={"key": "..."})

labels_collection = client.get_collection("ref_landcovernet_v1_labels")
first_item = next(labels_collection.get_items())
first_source_link = first_item.get_single_link(rel="source")
first_source_link
# <Link rel=source target=http://api.radiant.earth/mlhub/v1/collections/ref_landcovernet_v1_source/items/ref_landcovernet_v1_source_38PKT_29_20180101>
first_source_link.resolve_stac_object()
# This results in a 401 because the "key" parameter is not sent with the request

The key parameter is not sent because client._stac_io is not being used in the requests made by Link.resolve_stac_object.

Proposed Solution

The current method for determining which StacIO instance to use in Link.resolve_stac_object is:

  1. If root argument is not None, use root._stac_io
  2. If the Link has an owner and that owner is a pystac.Catalog, use that Catalog's _stac_io (if it is not None)
  3. Fall back to using StacIO.default() if none of the above resolve a StacIO instance

I propose adding a step between 2 & 3 that uses self.owner.get_root()._stac_io if it is not None. It seems reasonable to assume that if a StacIO instance is attached to a root catalog then all children of that catalog will want to use the same StacIO instance.

@duckontheweb
Copy link
Contributor Author

cc: @matthewhanson

@duckontheweb
Copy link
Contributor Author

cc: @KennSmithDS

@duckontheweb duckontheweb modified the milestones: 1.4.0, 1.5.0 Feb 16, 2022
@KennSmithDS
Copy link
Contributor

@duckontheweb asking a few clarifying questions on the proposed change request.

  1. For the first condition you stated if stac_io is not None, but I believe that should be if root is not None. Reason being on line 289, stac_io is set to None regardless, and then assings for a root._stac_io.
  2. I want to verify the condition for a new step between 2 & 3 is all three below:
    a. stac_io is None
    b. self.owner is not None
    c. not isinstance(self.owner, pystac.Catalog)

@duckontheweb
Copy link
Contributor Author

duckontheweb commented Mar 8, 2022

  1. For the first condition you stated if stac_io is not None, but I believe that should be if root is not None. Reason being on line 289, stac_io is set to None regardless, and then assings for a root._stac_io.

Yes, sorry about that, I just corrected the original description.

2. I want to verify the condition for a new step between 2 & 3 is all three below:
a. stac_io is None
b. self.owner is not None
c. not isinstance(self.owner, pystac.Catalog)

Looking at this again, I think the simplest solution might be to add an else clause to this statement, something like:

if self.owner is not None:
    if isinstance(self.owner, pystac.Catalog):
        stac_io = self.owner._stac_io
    else:
        root = self.owner.get_root()
        if root is not None:
            stac_io = root._stac_io

I'm a little torn on whether we should overwrite root here or use a new variable (e.g. _root). If we overwrite root, then it will set that as the root of the resolved STACObject here. I think this is what we want, but it would be good to get other opinions. @gadomski @lossyrob what are your thoughts on this?

@KennSmithDS
Copy link
Contributor

KennSmithDS commented Mar 8, 2022

@duckontheweb that's similar to the implementation I have on my local branch, but is there a possibility that even though self.owner is not None that stac_io would still be returned as None by one of the if/else conditions nested?

if stac_io is None:
      if self.owner is not None:
          if isinstance(self.owner, pystac.Catalog):
              stac_io = self.owner._stac_io
          else:
              stac_io = self.owner.get_root()._stac_io
      else:
          stac_io = pystac.StacIO.default()

@duckontheweb
Copy link
Contributor Author

@duckontheweb that's similar to the implementation I have on my local branch, but is there a possibility that even though self.owner is not None that stac_io would still be returned as None by one of the if/else conditions nested?

if stac_io is None:
      if self.owner is not None:
          if isinstance(self.owner, pystac.Catalog):
              stac_io = self.owner._stac_io
          else:
              stac_io = self.owner.get_root()._stac_io
      else:
          stac_io = pystac.StacIO.default()

Yes, good point. I think changing that else statement to another if statement would do the trick:

if stac_io is None:
    if self.owner is not None:
        if isinstance(self.owner, pystac.Catalog):
            stac_io = self.owner._stac_io
        else:
            stac_io = self.owner.get_root()._stac_io
    if stac_io is None:
        stac_io = pystac.StacIO.default()

Seems a little ugly, maybe there's a more concise way that I'm not seeing 🤷🏻

@gadomski
Copy link
Member

gadomski commented Mar 9, 2022

I'm a little torn on whether we should overwrite root here or use a new variable (e.g. _root). If we overwrite root, then it will set that as the root of the resolved STACObject here.

My instinct is to not overwrite, on the off chance that someone/somecode is relying on the passed in root parameter being set as object's root -- it seems surprising (to me) to silently change which root is set. This is just thinking from a read-through perspective, I don't have great context ATM on the broader implications.

@lossyrob
Copy link
Member

lossyrob commented Mar 9, 2022

Seems like it needs one more guard, in the case that there is owner with no root set:

if stac_io is None:
    if self.owner is not None:
        if isinstance(self.owner, pystac.Catalog):
            stac_io = self.owner._stac_io
        else:
            owner_root = self.owner.get_root()
            if owner_root is not None:
                stac_io = owner_root._stac_io
    if stac_io is None:
        stac_io = pystac.StacIO.default()

But otherwise that logic seems sound to me.

@duckontheweb
Copy link
Contributor Author

I'm a little torn on whether we should overwrite root here or use a new variable (e.g. _root). If we overwrite root, then it will set that as the root of the resolved STACObject here.

My instinct is to not overwrite, on the off chance that someone/somecode is relying on the passed in root parameter being set as object's root -- it seems surprising (to me) to silently change which root is set. This is just thinking from a read-through perspective, I don't have great context ATM on the broader implications.

Yeah, I agree, let's not overwrite it. It looks like @lossyrob 's suggestion covers everything.

@KennSmithDS
Copy link
Contributor

KennSmithDS commented Mar 10, 2022

if stac_io is None:
    if self.owner is not None:
        if isinstance(self.owner, pystac.Catalog):
            stac_io = self.owner._stac_io
        else:
            owner_root = self.owner.get_root()
            if owner_root is not None:
                stac_io = owner_root._stac_io
    if stac_io is None:
        stac_io = pystac.StacIO.default()

@lossyrob @duckontheweb it appears with this code change, the PR is failing CI unit tests because of StackOverflow, i.e. maximum recursion depth is being reached.

RecursionError: maximum recursion depth exceeded while calling a Python object

I think this might be occurring when the Link.owner root is itself, as it gets stuck in an infinite loop between:

owner_root = self.owner.get_root() [from pystac.Link.resolve_stac_object()]

and

root_link.resolve_stac_object() [from pystac.STACObject.get_root()]

Is there a way to easily check if the root of self.owner is itself before calling self.owner.get_root()?

@lossyrob
Copy link
Member

@KennSmithDS just took a look - seems like the issue is that when resolving a root link, it shouldn't go looking for the owner's root - because that's what it's currently resolving!

Avoiding the owner root check with:

if stac_io is None:
    if self.owner is not None:
        if isinstance(self.owner, pystac.Catalog):
            stac_io = self.owner._stac_io
        elif self.rel != pystac.RelType.ROOT:
            owner_root = self.owner.get_root()
            if owner_root is not None:
                stac_io = owner_root._stac_io
    if stac_io is None:
        stac_io = pystac.StacIO.default()

passes tests. Made a suggestion on the PR with that change.

@duckontheweb duckontheweb linked a pull request Mar 10, 2022 that will close this issue
4 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants