Store compiled SQL as template field for ExecutionMode.AIRFLOW_ASYNC #1534

Merged
@pankajkoti merged 6 commits into main from store-compile-sql-template-field on Feb 14, 2025

@pankajkoti (Contributor) commented Feb 12, 2025

This PR enhances the ExecutionMode.AIRFLOW_ASYNC execution flow by ensuring that the compiled SQL is stored as a template field.

Key Changes

  • Added _store_compiled_sql method to store the compiled SQL as a template field (compiled_sql).
  • Ensured that the compiled SQL is saved when enable_setup_async_task is enabled.
  • Overrode execute_complete in the extended operator. The parent class does not declare compiled_sql as a templated field, and any attempt to save the template field for a deferrable operator wipes the rendered field, so _store_compiled_sql is now called after execution completes.
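
The override described in the last bullet can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from this PR: `BaseDeferrableOperator`, `AsyncSqlOperator`, the `rendered_fields` key, and the whitespace-stripping "compilation" step are all hypothetical stand-ins, and no Airflow or Cosmos classes are imported. Only the names `compiled_sql`, `_store_compiled_sql`, and `execute_complete` come from the description above.

```python
from __future__ import annotations

from typing import Any


class BaseDeferrableOperator:
    """Hypothetical stand-in for the parent deferrable operator (not an Airflow class)."""

    # The parent does not declare compiled_sql as a templated field.
    template_fields: tuple[str, ...] = ("sql",)

    def __init__(self, sql: str) -> None:
        self.sql = sql

    def execute_complete(self, context: dict[str, Any], event: dict[str, Any] | None = None) -> str:
        # Pretend the deferred work has finished and report a result.
        return "done"


class AsyncSqlOperator(BaseDeferrableOperator):
    """Hypothetical extended operator that exposes compiled_sql as a template field."""

    # Extend the parent's template fields with compiled_sql.
    template_fields = BaseDeferrableOperator.template_fields + ("compiled_sql",)

    def __init__(self, sql: str) -> None:
        super().__init__(sql)
        self.compiled_sql = ""

    def _store_compiled_sql(self, context: dict[str, Any]) -> None:
        # Assumption: "compiling" is simulated here by stripping whitespace.
        self.compiled_sql = self.sql.strip()
        # Assumption: a plain dict entry stands in for wherever rendered fields are persisted.
        context.setdefault("rendered_fields", {})["compiled_sql"] = self.compiled_sql

    def execute_complete(self, context: dict[str, Any], event: dict[str, Any] | None = None) -> str:
        # Run the parent's completion logic first, then store the compiled SQL,
        # so that nothing run earlier in the deferral cycle can overwrite the stored value.
        result = super().execute_complete(context, event)
        self._store_compiled_sql(context)
        return result
```

In this sketch the store step deliberately runs last in `execute_complete`, mirroring the ordering the bullet describes.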

Why This Change?

  • Improves observability by making the compiled SQL easily accessible in the Airflow UI.
  • Ensures consistency with other execution modes.
  • Enables users to debug and track executed queries more effectively.
[Screenshot: 2025-02-12 at 12:02:43 PM]

closes: #1490

cloudflare-workers-and-pages bot commented Feb 12, 2025

Deploying astronomer-cosmos with Cloudflare Pages

Latest commit: 86eeb90
Status: ✅  Deploy successful!
Preview URL: https://e005a7a1.astronomer-cosmos.pages.dev
Branch Preview URL: https://store-compile-sql-template-f.astronomer-cosmos.pages.dev

netlify bot commented Feb 12, 2025

Deploy Preview for sunny-pastelito-5ecb04 ready!

Latest commit: 86eeb90
Latest deploy log: https://app.netlify.com/sites/sunny-pastelito-5ecb04/deploys/67aee0a5cd56cc000851f897
Deploy Preview: https://deploy-preview-1534--sunny-pastelito-5ecb04.netlify.app

codecov bot commented Feb 12, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 97.33%. Comparing base (745ed14) to head (86eeb90).
Report is 2 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1534      +/-   ##
==========================================
+ Coverage   97.32%   97.33%   +0.01%     
==========================================
  Files          80       80              
  Lines        4821     4846      +25     
==========================================
+ Hits         4692     4717      +25     
  Misses        129      129              


@pankajkoti pankajkoti marked this pull request as ready for review February 13, 2025 09:56
@dosubot dosubot bot added the size:L This PR changes 100-499 lines, ignoring generated files. label Feb 13, 2025
@dosubot dosubot bot added area:execution Related to the execution environment/mode, like Docker, Kubernetes, Local, VirtualEnv, etc profile:bigquery Related to BigQuery ProfileConfig labels Feb 13, 2025
@pankajkoti pankajkoti force-pushed the store-compile-sql-template-field branch from 4904287 to 909721c Compare February 13, 2025 10:15
@pankajkoti pankajkoti force-pushed the store-compile-sql-template-field branch from 259d00c to 86eeb90 Compare February 14, 2025 06:20
@pankajkoti pankajkoti merged commit 53dfcb7 into main Feb 14, 2025
66 checks passed
@pankajkoti pankajkoti deleted the store-compile-sql-template-field branch February 14, 2025 06:28
@pankajkoti pankajkoti mentioned this pull request Feb 14, 2025
pankajkoti added a commit that referenced this pull request Feb 20, 2025
Breaking changes

* When using ``LoadMode.DBT_LS``, Cosmos will now attempt to use
``dbtRunner`` instead of a subprocess to run ``dbt ls``.
While this significantly improves performance (half the vCPU usage and
some improvement in memory consumption), it may not work in scenarios
where users have multiple Python virtual environments to manage
different versions of dbt and its adapters. In those cases, please set
``RenderConfig(invocation_mode=InvocationMode.SUBPROCESS)``
to keep the behaviour Cosmos had in previous versions.
Additional information `here
<https://astronomer.github.io/astronomer-cosmos/configuration/parsing-methods.html#dbt-ls>`_
and `here
<https://astronomer.github.io/astronomer-cosmos/configuration/render-config.html#how-to-run-dbt-ls-invocation-mode>`_.

Features

* Use ``dbtRunner`` in the DAG Processor when using ``LoadMode.DBT_LS``
if ``dbt-core`` is available by @tatiana in #1484. Additional
information `here
<https://astronomer.github.io/astronomer-cosmos/configuration/parsing-methods.html#dbt-ls>`_.
* Allow users to opt-out of ``dbtRunner`` during DAG parsing with
``InvocationMode.SUBPROCESS`` by @tatiana in #1495. Check out the
`documentation
<https://astronomer.github.io/astronomer-cosmos/configuration/render-config.html#how-to-run-dbt-ls-invocation-mode>`_.
* Add structure to support multiple db for async operator execution by
@pankajastro in #1483
* Support overriding the ``profile_config`` per dbt node or folder using
config by @tatiana in #1492. More information `here
<https://astronomer.github.io/astronomer-cosmos/profiles/#profile-customise-per-node>`_.
* Create and run accurate SQL statements when using
``ExecutionMode.AIRFLOW_ASYNC`` by @pankajkoti, @tatiana and
@pankajastro in #1474
* Add AWS ECS task run execution mode by @CarlosGitto and @aoelvp94 in
#1507
* Add support for running ``DbtSourceOperator`` individually by
@victormacaubas in #1510
* Add setup task for async executions by @pankajastro in #1518
* Add teardown task for async executions by @pankajastro in #1529
* Add ``ProjectConfig.install_dbt_deps`` & change operator
``install_deps=True`` as default by @tatiana in #1521
* Extend Virtualenv operator and mock dbt adapters for setup & teardown
tasks in ``ExecutionMode.AIRFLOW_ASYNC`` by @pankajkoti, @tatiana and
@pankajastro in #1544

Bug Fixes

* Fix select complex intersection of three tag-based graph selectors by
@tatiana in #1466
* Fix custom selector behaviour when the model name contains periods by
@yakovlevvs and @60098727 in #1499
* Filter dbt and non-dbt kwargs correctly for async operator by
@pankajastro in #1526

Enhancement

* Fix OpenLineage deprecation warning by @CorsettiS in #1449
* Move ``DbtRunner`` related functions into ``dbt/runner.py`` module by
@tatiana in #1480
* Add ``on_warning_callback`` to ``DbtSourceKubernetesOperator`` and
refactor previous operators by @LuigiCerone in #1501
* Gracefully error when users set incompatible ``RenderConfig.dbt_deps``
and ``operator_args`` ``install_deps`` by @tatiana in #1505
* Store compiled SQL as template field for
``ExecutionMode.AIRFLOW_ASYNC`` by @pankajkoti in #1534

Docs

* Improve ``RenderConfig`` arguments documentation by @tatiana in #1514
* Improve callback documentation by @tatiana in #1516
* Improve partial parsing docs by @tatiana in #1520
* Fix typo in selecting & excluding docs by @pankajastro in #1523
* Document ``async_py_requirements`` added in ``ExecutionConfig`` for
``ExecutionMode.AIRFLOW_ASYNC`` by @pankajkoti in #1545

Others

* Ignore dbt package tests when running Cosmos tests by @tatiana in
#1502
* Refactor to consolidate async dbt adapter code by @pankajkoti in #1509
* Log elapsed time for sql file(s) upload/download by @pankajastro in
#1536
* Remove the fallback operator for async task by @pankajastro in #1538
* GitHub Actions Dependabot: #1487
* Pre-commit updates: #1473, #1493, #1503, #1531
@tatiana tatiana added this to the Cosmos 1.9.0 milestone Feb 28, 2025
Labels
  • area:execution: Related to the execution environment/mode, like Docker, Kubernetes, Local, VirtualEnv, etc
  • profile:bigquery: Related to BigQuery ProfileConfig
  • size:L: This PR changes 100-499 lines, ignoring generated files.
Development

Successfully merging this pull request may close these issues.

Display rendered SQL from async operators in the Airflow UI using the monkeypatched approach
3 participants