Kedro-mlflow with kedro 0.19.11 produces multiple runs #624
Comments
Thanks for the bug report. The problem is indeed coming from |
Thanks both, we'll look into this. Unless kedro-mlflow was using a private API, this regression shouldn't have happened. |
I've been able to narrow down which change caused the regression and it's kedro-org/kedro#4353 (cc @merelcht) |
I can reproduce the bug, even without the mlflow.yml and artifacts. The investigation is ongoing here: https://github.com/Galileo-Galilei/kedro_mlflow_624
@ankatiyar @astrojuanlu : I can confirm the bug appears even with a very simple hook that just starts mlflow (see above repo for code and results). My best guess so far is that the before_pipeline_run hook is triggered in a different thread from the one the nodes run in (even with the sequential runner). Since mlflow is thread-safe, the run started in the hook cannot be accessed from this other thread (and I cannot even close it within the hook, so the run keeps looping indefinitely!) |
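The hypothesis above can be reproduced in miniature without mlflow. The sketch below is only a toy model: it assumes the "active run" is stored per thread, and `_state`, `start_run`, `active_run` and `log_param` are hypothetical stand-ins, not mlflow's or kedro's API.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for a tracking library that keeps its
# "active run" in thread-local storage (an assumption for illustration).
_state = threading.local()
_created_runs = []  # every run ever created, across all threads


def start_run(name):
    """Create a run and mark it active for the calling thread only."""
    _created_runs.append(name)
    _state.run = name
    return name


def active_run():
    """Return the calling thread's active run, or None if it has none."""
    return getattr(_state, "run", None)


def log_param(key, value):
    """Log to the active run; implicitly create one if this thread has none."""
    run = active_run() or start_run("auto-named-run")
    return (run, key, value)


def demo():
    # The "hook" runs in the main thread and starts the intended run.
    start_run("default")
    # The "node" runs in a pool worker, which has its own thread-local state.
    with ThreadPoolExecutor(max_workers=1) as pool:
        logged = pool.submit(log_param, "alpha", 0.1).result()
    return logged, list(_created_runs)


result, runs = demo()
print(result)  # the parameter lands in the implicitly created run
print(runs)    # two runs exist although only one was started on purpose
```

In this toy model the worker thread does not see the run started by the hook, so the logging call creates a second run, which matches the symptom reported in the issue (one "default" run plus an extra one).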
I will work on fixing this. |
@Galileo-Galilei just to double check that what I've observed is correct: the behaviour described here with the extra run being created was already always the case when using |
Not with ThreadRunner |
I've tried all versions back to The issue here is that after my refactoring in kedro-org/kedro#4353, I fiddled around a bit and managed to get this working by saving the active run ID and passing it on at the point where the parameters are logged, by calling Another solution would be to revert part of my earlier refactoring and not create an executor pool for |
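A toy sketch of the first option mentioned above (saving the run ID and passing it along explicitly). It assumes per-thread "active run" state; every name here (`_state`, `start_run`, `log_param`, `run_id`) is hypothetical and does not reproduce the actual call used in the fix.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

_state = threading.local()  # hypothetical per-thread "active run" slot
_created_runs = []          # every run ever created, across all threads


def start_run(name):
    """Create a run and mark it active for the calling thread only."""
    _created_runs.append(name)
    _state.run = name
    return name


def log_param(key, value, run_id=None):
    """Log to run_id when given; otherwise fall back to the thread's
    active run, creating one implicitly if the thread has none."""
    run = run_id or getattr(_state, "run", None) or start_run("auto-named-run")
    return (run, key, value)


def demo_with_run_id():
    # The run ID is captured in the thread where the hook runs.
    run_id = start_run("default")
    with ThreadPoolExecutor(max_workers=1) as pool:
        # Passing the ID explicitly means the worker never needs
        # to look up its own (empty) thread-local state.
        logged = pool.submit(log_param, "alpha", 0.1, run_id=run_id).result()
    return logged, list(_created_runs)


fixed_result, fixed_runs = demo_with_run_id()
print(fixed_result)
print(fixed_runs)
```

The design choice is to stop relying on implicit per-thread state: the identifier travels with the call, so it does not matter which thread executes the logging.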
Hum, weird. Let me look at it this weekend and I'll get back to you so we decide what's best when we fully understand the issue. |
xref kedro-org/kedro#4486 |
@merelcht @astrojuanlu I finally fixed it on kedro-mlflow's side because it's mostly due to the recent development to make mlflow thread-safe. Before mlflow 2.18, this would not have been a problem. That said, I think it still needs some work on kedro's side:
AttributeError: The following datasets cannot be used with multiprocessing: ['model_input_table']
In order to utilize multiprocessing you need to make sure all datasets are serialisable, i.e. datasets should not make use of lambda functions, nested functions, closures etc.
If you are using custom decorators ensure they are correctly decorated using functools.wraps(). |
Thanks @Galileo-Galilei, that makes sense. I agree it's confusing |
Description
After upgrading kedro to 0.19.11, kedro-mlflow started producing multiple runs in MLFlow.
Context
Running
kedro run
creates additional runs in MLFlow, while only one is expected. After downgrading kedro back to 0.19.10, the problem does not occur.
Steps to Reproduce
mlflow.yml file.
kedro run
Expected Result
One run is produced in MLFlow.
Actual Result
MLFlow produces two runs: one named "default" and another with a random name.
Your Environment
kedro and kedro-mlflow version used (pip show kedro and pip show kedro-mlflow):
kedro: 0.19.11 (this problem does not occur in 0.19.10!)
kedro-mlflow: 0.14.0
Python version used (python -V): 3.11
Operating system: macOS Sonoma 14.4
Does the bug also happen with the last version on master?
yes