Question: how to access credentials inside a node? #575
Comments
Hi @mnowotnik, thank you for your question. Could you explain a bit more why your ETL operations, which interface with a DB, can't use a dataset?
I'm interested in the answer too. What if I need credentials to use, let's say, a Google API inside a node? How can I pass them as a param?
Thanks for taking an interest in my concern, @limdauto. Moreover, I want to execute this operation specifically in the scope of a Node, as opposed to in e.g.
I have a similar use case to the one mentioned by @bensdm. I also could not find a built-in way of getting this data from the

# credentials.yml
# (...)
app:
  client_id: abc
  client_secret: xyz

# nodes.py
import yaml
# (...)

def get_app_credentials():
    # Read the "app" entry straight from the local credentials file.
    with open("./conf/local/credentials.yml") as cred:
        cred_dict = yaml.safe_load(cred).get("app")
    return cred_dict

def authenticate_user(credentials):
    # (...)
    return something

# pipeline.py
from kedro.pipeline import Pipeline, node
# (...)

def create_pipeline(**kwargs):
    return Pipeline(
        [
            node(
                func=get_app_credentials,
                inputs=None,
                outputs="credentials",
            ),
            node(
                func=authenticate_user,
                inputs="credentials",
                outputs="something",
            ),
            # (...)
        ]
    )

Any thoughts or comments on whether this is an appropriate workaround are very much welcome!
I believe the proper way to do this is to implement or extend a custom DataSet, such as APIDataSet, but I agree with the OP that there are other use cases, like the one @bensdm mentioned above, or managing a session. Not to mention, it's just more convenient. You could also load the credentials inside the ProjectHooks and then set them as environment variables.
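The environment-variable idea can be sketched without any Kedro-specific API. Everything named here is hypothetical: `export_credentials`, the `APP_` prefix, and the variable naming scheme are assumptions for illustration, and where the helper gets called (for example from a project hook) depends on your Kedro version.

```python
import os
from typing import Dict, Mapping


def export_credentials(credentials: Mapping[str, str], prefix: str = "APP_") -> Dict[str, str]:
    """Copy credential entries into os.environ as PREFIX + UPPERCASE_KEY.

    Hypothetical helper: the prefix and naming scheme are assumptions.
    Returns the mapping of environment variables that were set.
    """
    exported = {f"{prefix}{key.upper()}": str(value) for key, value in credentials.items()}
    os.environ.update(exported)
    return exported


def authenticate_user() -> Dict[str, str]:
    """Node body that reads its credentials from the environment instead of an input."""
    try:
        return {
            "client_id": os.environ["APP_CLIENT_ID"],
            "client_secret": os.environ["APP_CLIENT_SECRET"],
        }
    except KeyError as missing:
        raise RuntimeError(f"Credential variable {missing} is not set") from missing


# Example values mirror the `app` entry from credentials.yml earlier in the thread.
export_credentials({"client_id": "abc", "client_secret": "xyz"})
```

Environment variables are process-wide, so this trades explicit node inputs for convenience: the node no longer declares what it depends on.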
I think the idiomatic way is to have a node called

Having said that, if you still want to access credentials from

from kedro.framework.session import get_current_session

session = get_current_session()
context = session.load_context()
credentials = context._get_config_credentials()
# or credentials = context.config_loader.get("credentials*", "credentials*/**", "**/credentials*")

But with great power comes great responsibility here. Coupling your node to the global session is intended to be used only sparingly.

The last workaround is to use parameters instead of credentials. For example:

$ kedro run --params api_token=<my-api-token>

And you get access to that in the node through the

Hope this helps!
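For the parameters workaround, Kedro exposes individual run parameters to nodes as `params:<name>` inputs. Here is a minimal sketch, assuming a parameter named `api_token` as in the command above; `build_auth_header`, the output name `auth_header`, and the Bearer scheme are made up for illustration.

```python
from typing import Dict


def build_auth_header(api_token: str) -> Dict[str, str]:
    """Node function: turn the token passed as a run parameter into an HTTP header."""
    if not api_token:
        raise ValueError("api_token parameter is empty")
    return {"Authorization": f"Bearer {api_token}"}


def create_pipeline(**kwargs):
    # Imported here so build_auth_header can be unit-tested without Kedro installed.
    from kedro.pipeline import Pipeline, node

    return Pipeline(
        [
            node(
                func=build_auth_header,
                inputs="params:api_token",  # value supplied with `kedro run --params ...`
                outputs="auth_header",  # hypothetical dataset name
            ),
        ]
    )
```

Values passed on the command line can end up in shell history, so this probably suits short-lived tokens better than long-lived secrets.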
Since there are a number of alternatives to accomplish what you are after, I will close this issue, but please feel free to re-open it if you need more support.
The generic solution, which adapts to all types of config, is as follows. Create a dataset of type CredentialsDataset, under

from typing import Any, Dict, Optional

from kedro.io import AbstractDataset


class CredentialsDataset(AbstractDataset):
    def __init__(self, credentials: Optional[Dict[str, Any]] = None):
        self._credentials = credentials

    def _load(self) -> Optional[Dict[str, Any]]:
        return self._credentials

    def _save(self, data: Any) -> None:
        # AbstractDataset.save() calls _save(data), so the argument is required.
        print("save")

    def _describe(self) -> Dict[str, Any]:
        # Note: this includes the secret values in the dataset's description.
        return dict(credentials=self._credentials)

Create the empty file

In the catalog, declare an entry with the credential you would like to load:

my_cred_input:
  type: <project>.datasets.credentials.CredentialsDataset
  credentials: test_creds # name of the credential entry

Then your node can use
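A node that consumes this catalog entry receives the resolved credentials as a plain dictionary. Here is a minimal sketch, assuming the `test_creds` entry holds `client_id` and `client_secret` keys like the `app` example earlier in the thread; `basic_auth_header` and the output name `auth_header` are hypothetical.

```python
import base64
from typing import Dict, Mapping


def basic_auth_header(credentials: Mapping[str, str]) -> Dict[str, str]:
    """Node function: build an HTTP Basic auth header from the loaded credentials."""
    raw = f"{credentials['client_id']}:{credentials['client_secret']}".encode("utf-8")
    return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}


def create_pipeline(**kwargs):
    # Imported here so basic_auth_header can be unit-tested without Kedro installed.
    from kedro.pipeline import Pipeline, node

    return Pipeline(
        [
            node(
                func=basic_auth_header,
                inputs="my_cred_input",  # the catalog entry declared above
                outputs="auth_header",  # hypothetical dataset name
            ),
        ]
    )
```

Because the credentials arrive as an ordinary node input, the function stays a pure function of its arguments and can be tested with a hand-written dictionary.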
What are you trying to do?
I am trying to perform ETL operations inside a node. To do this, I need access to database credentials.
Workaround
As a workaround, I can add a sqlalchemy engine to DataCatalog in the register_data_catalog hook dynamically, but I don't think DataCatalog should be used in this way.