Allow users to opt-out `dbtRunner` during DAG parsing #1495
Codecov Report
All modified and coverable lines are covered by tests ✅

Additional details and impacted files

@@           Coverage Diff           @@
##             main    #1495   +/-   ##
=======================================
  Coverage   97.05%   97.05%
=======================================
  Files          77       77
  Lines        4483     4484    +1
=======================================
+ Hits         4351     4352    +1
  Misses        132      132
LGTM.
I guess it would be nice to highlight somewhere in our documentation that, starting with Cosmos 1.9, we're optimising to use dbtRunner by default for parsing, and also to show users how they can change the default with the corresponding param in their DAG. WDYT?
Thanks for the feedback, @pankajastro & @pankajkoti! I've added a breaking change notice in the changelog (14b0a2c), as well as docs in 53429d9.
Breaking changes

* When using ``LoadMode.DBT_LS``, Cosmos will now attempt to use the ``dbtRunner`` as opposed to subprocess to run ``dbt ls``. While this represents significant performance improvements (half the vCPU usage and some memory consumption improvement), this may not work in scenarios where users had multiple Python virtual environments to manage different versions of dbt and its adaptors. In those cases, please, set ``RenderConfig(invocation_mode=InvocationMode.SUBPROCESS)`` to have the same behaviour Cosmos had in previous versions. Additional information `here <https://astronomer.github.io/astronomer-cosmos/configuration/parsing-methods.html#dbt-ls>`_ and `here <https://astronomer.github.io/astronomer-cosmos/configuration/render-config.html#how-to-run-dbt-ls-invocation-mode>`_.

Features

* Use ``dbtRunner`` in the DAG Processor when using ``LoadMode.DBT_LS`` if ``dbt-core`` is available by @tatiana in #1484. Additional information `here <https://astronomer.github.io/astronomer-cosmos/configuration/parsing-methods.html#dbt-ls>`_.
* Allow users to opt-out of ``dbtRunner`` during DAG parsing with ``InvocationMode.SUBPROCESS`` by @tatiana in #1495. Check out the `documentation <https://astronomer.github.io/astronomer-cosmos/configuration/render-config.html#how-to-run-dbt-ls-invocation-mode>`_.
* Add structure to support multiple db for async operator execution by @pankajastro in #1483
* Support overriding the ``profile_config`` per dbt node or folder using config by @tatiana in #1492. More information `here <https://astronomer.github.io/astronomer-cosmos/profiles/#profile-customise-per-node>`_.
* Create and run accurate SQL statements when using ``ExecutionMode.AIRFLOW_ASYNC`` by @pankajkoti, @tatiana and @pankajastro in #1474
* Add AWS ECS task run execution mode by @CarlosGitto and @aoelvp94 in #1507
* Add support for running ``DbtSourceOperator`` individually by @victormacaubas in #1510
* Add setup task for async executions by @pankajastro in #1518
* Add teardown task for async executions by @pankajastro in #1529
* Add ``ProjectConfig.install_dbt_deps`` & change operator ``install_deps=True`` as default by @tatiana in #1521
* Extend Virtualenv operator and mock dbt adapters for setup & teardown tasks in ``ExecutionMode.AIRFLOW_ASYNC`` by @pankajkoti, @tatiana and @pankajastro in #1544

Bug Fixes

* Fix select complex intersection of three tag-based graph selectors by @tatiana in #1466
* Fix custom selector behaviour when the model name contains periods by @yakovlevvs and @60098727 in #1499
* Filter dbt and non-dbt kwargs correctly for async operator by @pankajastro in #1526

Enhancement

* Fix OpenLineage deprecation warning by @CorsettiS in #1449
* Move ``DbtRunner`` related functions into ``dbt/runner.py`` module by @tatiana in #1480
* Add ``on_warning_callback`` to ``DbtSourceKubernetesOperator`` and refactor previous operators by @LuigiCerone in #1501
* Gracefully error when users set incompatible ``RenderConfig.dbt_deps`` and ``operator_args`` ``install_deps`` by @tatiana in #1505
* Store compiled SQL as template field for ``ExecutionMode.AIRFLOW_ASYNC`` by @pankajkoti in #1534

Docs

* Improve ``RenderConfig`` arguments documentation by @tatiana in #1514
* Improve callback documentation by @tatiana in #1516
* Improve partial parsing docs by @tatiana in #1520
* Fix typo in selecting & excluding docs by @pankajastro in #1523
* Document ``async_py_requirements`` added in ``ExecutionConfig`` for ``ExecutionMode.AIRFLOW_ASYNC`` by @pankajkoti in #1545

Others

* Ignore dbt package tests when running Cosmos tests by @tatiana in #1502
* Refactor to consolidate async dbt adapter code by @pankajkoti in #1509
* Log elapsed time for sql file(s) upload/download by @pankajastro in #1536
* Remove the fallback operator for async task by @pankajastro in #1538
* GitHub Actions Dependabot: #1487
* Pre-commit updates: #1473, #1493, #1503, #1531
While speaking to a customer about #1484, they mentioned they have the following setup:
- `dbt-databricks` installed in the same Python virtualenv as Cosmos/Airflow
- `dbt-bigquery` installed in a separate Python virtualenv using Astro Dockerfile

And run DAGs using both with the same image. This means 1.9.0a3 breaks them since they use `LoadMode.DBT_LS` and only `dbt-databricks` can be parsed. This means that we have to add support to allow users to opt in to or out of using the `dbtRunner` during DAG parsing, similar to what was done for task execution in `ExecutionConfig`.
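To make the opt-in / opt-out behaviour described above concrete, here is a minimal standard-library sketch of the decision. It is not the Cosmos implementation: the `InvocationMode` enum below is a local stand-in that only mirrors the names used in this PR, and `dbt_core_importable` and `resolve_parsing_mode` are hypothetical helpers introduced for illustration. The assumption is that an explicit choice always wins, and that without one the `dbtRunner` is used only when `dbt-core` is importable in the same interpreter.

```python
from enum import Enum
from importlib.util import find_spec
from typing import Optional


class InvocationMode(Enum):
    """Local stand-in mirroring the Cosmos enum names (illustration only)."""

    SUBPROCESS = "subprocess"
    DBT_RUNNER = "dbt_runner"


def dbt_core_importable() -> bool:
    """Hypothetical helper: True if a top-level `dbt` package is importable here."""
    return find_spec("dbt") is not None


def resolve_parsing_mode(
    requested: Optional[InvocationMode], dbt_available: bool
) -> InvocationMode:
    """Hypothetical helper: pick how `dbt ls` would be invoked during DAG parsing.

    Assumed rule: an explicit request is honoured as-is; otherwise prefer
    the dbtRunner when dbt-core is available and fall back to a subprocess.
    """
    if requested is not None:
        return requested
    if dbt_available:
        return InvocationMode.DBT_RUNNER
    return InvocationMode.SUBPROCESS


if __name__ == "__main__":
    # A user with dbt installed in a separate virtualenv opts out explicitly.
    mode = resolve_parsing_mode(InvocationMode.SUBPROCESS, dbt_core_importable())
    print(mode.value)
```

In the customer setup above, the DAGs that rely on `dbt-bigquery` from the separate virtualenv would pass the explicit subprocess choice, while the `dbt-databricks` DAGs could keep the default.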