Allow Running kedro-mlflow projects with an MLflow orchestrator #358

takikadiri · 2022-09-13T19:41:57Z

Description

Kedro projects are mostly executed with kedro, and kedro-mlflow is responsible for starting a new MLflow run/session with the given configuration.
There are scenarios where the kedro project is instead executed by an orchestrator, such as an MLflow project or an Airflow pipeline. These orchestrators can start an MLflow run themselves to take control of the overall session. For example:

An MLflow project that starts an MLflow run, puts all the execution context in it, and then runs the kedro project
An Airflow job that starts a run, executes the kedro project, then gets the results from the run to register or deploy the model

Context

We want to use MLflow projects so that we can run the kedro project from a remote repo (for reproducibility) and package the Python environment alongside the fitted model (for accurate code dependencies).

This feature can also enable the integration of kedro-mlflow with more upstream tools

Possible Implementation

Maybe we can check here whether MLflow already has an active run; if it does, we can use it when starting the kedro-mlflow run.
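
A minimal sketch of that check, under stated assumptions: the helper name `start_or_reuse_run` and its `mlflow_module` parameter are hypothetical (the module is passed in so the logic can be exercised without a tracking server), and this is not necessarily how kedro-mlflow implements it. `active_run()` and `start_run()` are the standard MLflow fluent API calls.

```python
def start_or_reuse_run(mlflow_module, run_name=None):
    """Return (run, created), reusing the active MLflow run if there is one.

    `mlflow_module` is expected to be the imported `mlflow` package.
    """
    active = mlflow_module.active_run()  # None when no run is active
    if active is not None:
        # An orchestrator already opened a run: log into it.
        return active, False
    # No run yet: open a new one, as kedro-mlflow would normally do.
    return mlflow_module.start_run(run_name=run_name), True
```

With `import mlflow`, this would be called as `run, created = start_or_reuse_run(mlflow)`; the `created` flag tells the caller whether it owns the run.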

Galileo-Galilei · 2022-09-20T20:41:56Z

In such a situation, what is the expected behaviour at the end of the pipeline? Do we expect the run to be closed? The other problem is that if mlflow is not properly configured by the orchestrator, the active run may be stored under a different tracking_uri from the one specified in the configuration, which raises a mlflow.exceptions.MlflowException: Run 'xxx' not found error.

The easiest way to inject this behaviour would be to pass the tracking.run.id to the configuration, but that requires the orchestrator to modify the config...

Galileo-Galilei · 2022-10-02T21:04:22Z

So the final decision is:

if an active mlflow run exists, we ignore all configuration in mlflow.yml and use the configuration from the environment
the pipeline logs in this active run
the mlflow run is NOT closed at the end of the kedro run
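
The three rules above can be sketched as a context manager that closes a run only when it opened it. This is an illustration, not the code merged in #359: `kedro_mlflow_run`, `mlflow_module` and `config_run_name` are hypothetical names, and the module is injected so the behaviour can be checked without a tracking server. `active_run()`, `start_run()` and `end_run()` are the standard MLflow fluent API calls.

```python
from contextlib import contextmanager


@contextmanager
def kedro_mlflow_run(mlflow_module, config_run_name=None):
    """Yield the run the pipeline should log to.

    `mlflow_module` is expected to be the imported `mlflow` package.
    """
    inherited = mlflow_module.active_run()
    if inherited is not None:
        # The orchestrator owns the session: skip the mlflow.yml settings,
        # log into its run, and leave the run open on exit.
        yield inherited
        return
    # No active run: open one from the project configuration and close it.
    run = mlflow_module.start_run(run_name=config_run_name)
    try:
        yield run
    finally:
        mlflow_module.end_run()
```

Tying the `end_run()` call to run ownership leaves the orchestrator free to read metrics or register a model from the same run after the kedro pipeline finishes.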

takikadiri · 2023-01-08T15:24:39Z

That looks good to me. It makes sense to delegate the entire session to the entity that created the run in the first place.

Galileo-Galilei self-assigned this Sep 13, 2022

Galileo-Galilei added the enhancement label Sep 13, 2022

Galileo-Galilei added this to the 0.11.4 milestone Sep 13, 2022

Galileo-Galilei mentioned this issue Sep 27, 2022

Enable using an active run in kedro-mlflow #359

Merged

6 tasks

Galileo-Galilei added a commit that referenced this issue Jan 9, 2023

✨ Use active mlflow run if it was started before the kedro run to enable use within an orchestrator (#358)

38c217d

Galileo-Galilei closed this as completed in #359 Jan 9, 2023

Galileo-Galilei added a commit that referenced this issue Jan 9, 2023

✨ Use active mlflow run if it was started before the kedro run to enable use within an orchestrator (#358)

503f65c

Galileo-Galilei moved this to ✅ Done in kedro-mlflow roadmap Oct 29, 2024

Galileo-Galilei added this to kedro-mlflow roadmap Oct 29, 2024
