Template filepaths with command line arguments #602
Comments
I can get this working (on 0.16.6 at least) by doing the following:

    # conf/base/catalog.yml
    example_iris_data:
      type: pandas.CSVDataSet
      filepath: data/${folder_name}/iris.csv

    # src/<project_name>/hooks.py
    @hook_impl
    def register_config_loader(self, conf_paths: Iterable[str]) -> ConfigLoader:
        click_ctx = click.get_current_context(silent=True)
        return TemplatedConfigLoader(conf_paths, globals_dict={
            "folder_name": click_ctx.params.get("params").get("folder_name")
        })

and:

    $ kedro run --params folder_name:non-existent-folder-name-cause-error
    <snip>
    [Errno 2] No such file or directory: '[redacted]/data/non-existent-folder-name-cause-error/iris.csv'

you can also add your own CLI option to your

Is this roughly what you were looking for? |
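For readers who want to see what the `${folder_name}` substitution amounts to, here is a minimal, library-free sketch. The `render` helper is hypothetical and is not Kedro's implementation; it only illustrates the idea of resolving placeholders in a catalog-like dict against a globals dict, and its error behaviour (a `KeyError` on unknown names) is a choice made for this sketch.

```python
import re
from typing import Any, Dict

# Matches placeholders of the form ${name}
_PLACEHOLDER = re.compile(r"\$\{(\w+)\}")


def render(value: Any, globals_dict: Dict[str, str]) -> Any:
    """Recursively replace ${name} placeholders in strings, dicts and lists."""
    if isinstance(value, dict):
        return {k: render(v, globals_dict) for k, v in value.items()}
    if isinstance(value, list):
        return [render(v, globals_dict) for v in value]
    if isinstance(value, str):
        # In this sketch an unknown name raises KeyError instead of being left in place
        return _PLACEHOLDER.sub(lambda m: str(globals_dict[m.group(1)]), value)
    return value


# Same shape as the catalog entry above, written as a plain dict
catalog = {
    "example_iris_data": {
        "type": "pandas.CSVDataSet",
        "filepath": "data/${folder_name}/iris.csv",
    }
}

resolved = render(catalog, {"folder_name": "store_1"})
print(resolved["example_iris_data"]["filepath"])  # data/store_1/iris.csv
```

Only the string values are touched; keys and non-string values pass through unchanged.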
Also, another option - with the flexibility of

    # src/<project_name>/hooks.py
    @hook_impl
    def register_config_loader(self, conf_paths: Iterable[str]) -> ConfigLoader:
        return TemplatedConfigLoader(conf_paths, globals_dict={
            "folder_name": os.getenv("FOLDER_NAME")
        })

and having |
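One thing to watch with the environment-variable approach: `os.getenv` returns `None` when the variable is unset, so the globals dict would carry `None` rather than a folder name. A small sketch of a fallback follows; the helper name `globals_from_env` and the `"default"` folder are assumptions made for illustration, not anything Kedro provides.

```python
import os
from typing import Dict, Mapping, Optional


def globals_from_env(
    environ: Optional[Mapping[str, str]] = None,
    default_folder: str = "default",  # hypothetical fallback folder name
) -> Dict[str, str]:
    """Build a globals dict from FOLDER_NAME, falling back when it is unset or empty."""
    env = os.environ if environ is None else environ
    folder = env.get("FOLDER_NAME") or default_folder
    return {"folder_name": folder}


# Passing a mapping explicitly keeps the example independent of the real environment
print(globals_from_env({"FOLDER_NAME": "store_2"}))  # {'folder_name': 'store_2'}
print(globals_from_env({}))                          # {'folder_name': 'default'}
```

The returned dict could then be passed as `globals_dict` in the hook above.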
Alternatively, I often use steel_toes for things like this. It can append a branch name to the filepath, for instance. I like @mzjp2's solution better for your use case, but this is an alternative. |
Thanks a lot, @mzjp2. |
In my opinion, it depends on where you are running. For instance, if you are deploying with one of the various Docker options, changing the YAML requires a new image, but most of the time there is a way to set environment variables for the deployment. |
Ah I see. Thank you. |
Just want to point out that the solution in the thread using

    CONFIG_LOADER_ARGS = {
        "globals_pattern": "*globals.yml",
        "globals_dict": {
            # programmatically create your globals here
        },
    }

in the |
Description
Is your feature request related to a problem? A clear and concise description of what the problem is: "I'm always frustrated when ..."
Is there any way to template configurations based on command-line arguments?
I need a data pipeline that has a dynamic folder path inside raw_01. E.g., if I am running the pipeline for store_1, then the data will be in raw_01/{store_01}/file.csv, and similarly for store_2.
I am able to do this with the DataCatalog API, but is there a way to do it with the catalog YAML?
Context
Why is this change important to you? How would you use it? How can it benefit other users?
This is useful when you have multiple models for different stores, etc., and the data is organized in folders.
Possible Implementation
(Optional) Suggest an idea for implementing the addition or change.
If there were a way to set the globals_dict in TemplatedConfigLoader from the command-line params, I think that could solve the problem.
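As a rough sketch of that idea, the mapping from already-parsed run parameters to a globals dict could look like the following. The function name `globals_from_params` and the choice of `folder_name` as the required key are assumptions for illustration; the function does not touch Kedro or the CLI, it only shows the validation step that a hook would need before handing values to `globals_dict`.

```python
from typing import Any, Dict, Iterable, Mapping, Optional


def globals_from_params(
    params: Optional[Mapping[str, Any]],
    required: Iterable[str] = ("folder_name",),  # hypothetical required keys
) -> Dict[str, str]:
    """Pick the templating keys out of parsed run parameters, failing loudly if absent."""
    params = params or {}          # tolerate a run started without any parameters
    required = tuple(required)     # allow any iterable, and iterate it twice safely
    missing = [key for key in required if key not in params]
    if missing:
        raise ValueError(f"Missing run parameter(s): {', '.join(missing)}")
    # Cast to str because the values end up inside file paths
    return {key: str(params[key]) for key in required}


print(globals_from_params({"folder_name": "store_1", "other": 3}))
# {'folder_name': 'store_1'}
```

Failing early with a named parameter tends to be easier to diagnose than a later "No such file or directory" error on a path that contains `None`.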
Possible Alternatives
(Optional) Describe any alternative solutions or features you've considered.