Jasper Heeffer [7:22 AM]
there is definitely work to do though, to automate sous-chefs and recipes and recipe dependencies so we can have a pipeline from source to final dataset, like we saw in datapackage-pipelines at the conference
Jasper Heeffer [7:39 AM]
on a higher level, traceability is something that is important to us but not yet done
[7:41]
so that we can see the road of a datapoint from source to e.g. systema globalis
Because the ingredients section of a recipe is effectively a dependency list, we don't need to add a new section for dependencies. I suggest we add options to the ingredient definitions instead, for example an update_source_dataset flag.
When update_source_dataset is true, the source dataset will be updated before it is loaded into chef. This should work for source datasets built by either a recipe or an etl script. If several ingredients come from the same dataset, setting the option on each of them is redundant, but if we eventually adopt the dataset-as-ingredient approach, this problem goes away.
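To make the idea concrete, here is a rough Python sketch of how the option could behave. Nothing in it is existing ddf_utils code: the helper names (`datasets_to_update`, `load_ingredients`), the `update` and `load` callbacks, and the example dataset names are all made up for illustration. The point is only that each source dataset gets updated once, even when several ingredients ask for it.

```python
from typing import Any, Callable, Dict, List


def datasets_to_update(ingredients: List[dict]) -> List[str]:
    """Return every source dataset flagged for update, once, in first-seen order."""
    flagged: List[str] = []
    for ing in ingredients:
        wants_update = ing.get("update_source_dataset", False)
        if wants_update and ing["dataset"] not in flagged:
            flagged.append(ing["dataset"])
    return flagged


def load_ingredients(
    ingredients: List[dict],
    update: Callable[[str], None],  # hypothetical: runs the dataset's recipe or etl script
    load: Callable[[dict], Any],    # hypothetical: loads one ingredient
) -> Dict[str, Any]:
    """Update the flagged source datasets first, then load all ingredients."""
    for dataset in datasets_to_update(ingredients):
        update(dataset)
    return {ing["id"]: load(ing) for ing in ingredients}


# Illustrative ingredient definitions; the dataset names are invented.
ingredients = [
    {"id": "gdp-datapoints", "dataset": "ddf--example--source_a",
     "key": "geo,time", "update_source_dataset": True},
    {"id": "gdp-entities", "dataset": "ddf--example--source_a",
     "key": "geo", "update_source_dataset": True},
    {"id": "pop-datapoints", "dataset": "ddf--example--source_b",
     "key": "geo,time"},
]
```

With these definitions, only the first dataset is updated, and only once, although two ingredients request it.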
On traceability, we can partly achieve it with the to_graph() function demonstrated here: https://github.com/semio/ddf_utils/blob/dev/notebook/Chef%20API.ipynb. But if we want to see the road of a datapoint from source to output, I think we need to build an inspector that can stop the chef at each step and show the data inside.
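A minimal sketch of what such an inspector could look like, assuming a recipe can be reduced to an ordered list of named steps. This is not the chef API: `run_with_inspection`, the step names, and the sample rows are all hypothetical. It uses a generator, so execution pauses after every step until the caller asks for the next one.

```python
from typing import Any, Callable, Iterator, List, Tuple

Step = Tuple[str, Callable[[Any], Any]]


def run_with_inspection(data: Any, steps: List[Step]) -> Iterator[Tuple[str, Any]]:
    """Apply steps in order, yielding (step name, intermediate data) after each one."""
    for name, func in steps:
        data = func(data)
        # Execution pauses here until the caller requests the next step.
        yield name, data


# Invented sample data and steps, standing in for real procedures.
rows = [
    {"geo": "swe", "time": 2000, "value": 10.0},
    {"geo": "nor", "time": 2000, "value": 4.0},
]
steps: List[Step] = [
    ("filter_geo", lambda rs: [r for r in rs if r["geo"] == "swe"]),
    ("scale", lambda rs: [{**r, "value": r["value"] * 2} for r in rs]),
]

# Collect the whole trace; an interactive inspector would call next() instead.
trace = list(run_with_inspection(rows, steps))
```

Each entry in `trace` holds the data as it looked right after one step, which is what we would need to follow a single datapoint through the recipe.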
Another thing we might do is add a web page for each recipe, showing the recipe and the recipes it depends on, with a button next to each one to run it.