add recipe dependencies #74

semio · 2017-08-16T01:55:54Z

Jasper Heeffer [7:22 AM]
there is definitely work to do though, to automate sous-chefs, recipes and recipe dependencies so we can have a pipeline from source to final dataset, like we saw in datapackage-pipelines at the conference

Jasper Heeffer [7:39 AM]
on a higher level, traceability is something that is important to us but not yet done

[7:41]
so that we can see the road of a datapoint from source to e.g. systema globalis

semio · 2017-08-21T03:10:25Z

Because the ingredients section of a recipe is already a dependency list, we don't need to add a new section for dependencies. I suggest we add options to the ingredient definitions:

- id: datapoint-ingredient
  dataset: source_dataset
  key: geo, time
  options:
      update_source_dataset: true

So when update_source_dataset is true, the source dataset will be updated before it is loaded into chef. This should work for source datasets built by either a recipe or an ETL script. If multiple ingredients come from the same dataset, repeating this option on each of them is redundant, but if we eventually adopt the dataset-as-ingredient approach, this problem will go away.
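
To illustrate how the redundancy could be handled, here is a minimal sketch that collects the datasets flagged with update_source_dataset so each one is updated only once, however many ingredients refer to it. The function name `datasets_to_update` and the extra ingredient entries are hypothetical and not part of ddf_utils; the ingredients are plain dicts shaped like the YAML above.

```python
def datasets_to_update(ingredients):
    """Return the source datasets flagged for update, each listed once, in recipe order.

    Hypothetical helper: `ingredients` is a list of dicts shaped like the
    ingredient definitions in a recipe.
    """
    flagged = []
    for ingredient in ingredients:
        options = ingredient.get("options") or {}
        dataset = ingredient["dataset"]
        if options.get("update_source_dataset") and dataset not in flagged:
            flagged.append(dataset)
    return flagged


# Made-up ingredient list: two ingredients share one flagged source dataset.
ingredients = [
    {"id": "datapoint-ingredient", "dataset": "source_dataset", "key": "geo, time",
     "options": {"update_source_dataset": True}},
    {"id": "entity-ingredient", "dataset": "source_dataset", "key": "geo",
     "options": {"update_source_dataset": True}},
    {"id": "other-ingredient", "dataset": "other_dataset", "key": "geo, time"},
]

print(datasets_to_update(ingredients))
```

Chef could then run the recipe or ETL script of each returned dataset before loading any ingredient.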

On traceability, the to_graph() function (demonstrated here: https://github.com/semio/ddf_utils/blob/dev/notebook/Chef%20API.ipynb) gets us part of the way. But if we want to see the road of a datapoint from source to output, I think we need to build an inspector which can stop the chef at each step and show the data inside.
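
As a rough idea of what such an inspector could look like, the sketch below runs a list of steps lazily and hands back the intermediate data after each one. It does not use the Chef API; `inspect_steps`, the step names and the sample rows are all made up for illustration.

```python
def inspect_steps(data, steps):
    """Run each step in order, yielding (step_name, intermediate_data) after it.

    Because this is a generator, the caller decides when the next step runs,
    which is what lets an inspector pause between steps.
    """
    for name, func in steps:
        data = func(data)
        yield name, data


# Made-up datapoints, not real statistics.
rows = [
    {"geo": "swe", "time": 2000, "value": 1000},
    {"geo": "swe", "time": 2001, "value": 2000},
    {"geo": "nor", "time": 2000, "value": 3000},
]

# Hypothetical procedures standing in for recipe steps.
steps = [
    ("filter_geo", lambda rs: [r for r in rs if r["geo"] == "swe"]),
    ("to_thousands", lambda rs: [{**r, "value": r["value"] // 1000} for r in rs]),
]

trace = list(inspect_steps(rows, steps))
for name, snapshot in trace:
    print(name, snapshot)
```

Following one datapoint through the snapshots (for example geo "swe", time 2000) shows how its value changes from step to step.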

Another thing we might do is add a web page for the recipe, showing the recipe and its dependency recipes, with a button for each recipe to run it.

semio · 2017-09-04T02:30:01Z

Also, we can add ETL scripts as dependencies, so chef will run the ETL scripts before running the recipe.
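
One way to decide the run order, assuming the dependencies between recipes, datasets and ETL scripts are available as a simple mapping, is a topological sort. The sketch below uses the standard library's graphlib (Python 3.9+); the names in `deps` and the `run_order` helper are hypothetical.

```python
from graphlib import TopologicalSorter


def run_order(dependencies):
    """Return an order in which every item comes after the things it depends on.

    `dependencies` maps each item to the set of items that must run first.
    """
    return list(TopologicalSorter(dependencies).static_order())


# Hypothetical dependency graph: an output recipe built from two source
# datasets, each of which is produced by its own ETL script.
deps = {
    "output_recipe": {"source_dataset_a", "source_dataset_b"},
    "source_dataset_a": {"etl_a"},
    "source_dataset_b": {"etl_b"},
}

order = run_order(deps)
print(order)
```

The exact order among independent items is not fixed, but each ETL script always comes before its dataset, and both datasets come before the output recipe. A circular dependency raises graphlib.CycleError.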
