Fix labels used to find queued KubeExecutor pods #19904
Conversation
We need to use the job_id used to queue the TI, not the current scheduler's job_id. These can differ naturally with HA schedulers and with scheduler restarts (clearing "queued but not launched TIs" happens before adoption).
@@ -473,7 +472,7 @@ def clear_not_launched_queued_tasks(self, session=None) -> None:
     base_label_selector = (
         f"dag_id={pod_generator.make_safe_label_value(task.dag_id)},"
         f"task_id={pod_generator.make_safe_label_value(task.task_id)},"
-        f"airflow-worker={pod_generator.make_safe_label_value(str(self.scheduler_job_id))}"
+        f"airflow-worker={pod_generator.make_safe_label_value(str(task.queued_by_job_id))}"
What if the scenario is this:
pod is created, scheduler dies
scheduler comes back online
scheduler starts up the task again... I assume maybe it gets a new queued_by_job_id -- but the pod has the old id?
Does the queued_by_job_id even matter? Should we not just be looking at things that uniquely identify the TI?
Note: It's using the queued_by_job_id from the TI, which will match the label. Eventually the pod will be adopted by another (or the new) scheduler; however, this can run before that happens (it's on an interval, after all), AND this does run before adoption when a scheduler starts.
Now, does it even matter? For better or worse, we use the job_id to help determine which Airflow the task is part of. For example, we watch for events based on the job id as well:
kwargs = {'label_selector': f'airflow-worker={scheduler_job_id}'}
This becomes important if we consider a shared namespace with multiple Airflow worker pods in it. It becomes even more important if we have the same dags/tasks/scheduled runs. There are certainly still issues here, but this at least preserves the status quo for now until we can properly fix everything.
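As a rough sketch of that pattern (illustrative only, not the executor's actual watcher code), an event watch filtered on the airflow-worker label might look like the following, assuming in-cluster config and a placeholder scheduler_job_id:

```python
from kubernetes import client, config, watch

config.load_incluster_config()  # assumes the watcher runs inside the cluster
v1 = client.CoreV1Api()

scheduler_job_id = "123"  # placeholder; in Airflow this is the scheduler's job id

# Stream pod events, but only for pods carrying this scheduler's worker label.
w = watch.Watch()
for event in w.stream(
    v1.list_namespaced_pod,
    namespace="airflow",
    label_selector=f"airflow-worker={scheduler_job_id}",
):
    pod = event["object"]
    print(event["type"], pod.metadata.name, pod.status.phase)
```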
Here is another example in the adoption process:
kwargs = {'label_selector': f'airflow-worker={scheduler_job_id}'}
Thanks
> Note: It's using the queued_by_job_id from the TI, which will match the label. Eventually the pod will be adopted by another (or the new) scheduler; however, this can run before that happens (it's on an interval, after all), AND this does run before adoption when a scheduler starts.
OK, so you are saying that this process (i.e. read the TI from the DB, use it to build the labels, then search for the pod) will, generally speaking, happen before the task would e.g. be failed and retried (at which point it would presumably get a new queued_by_job_id, after which, if the pod were still out there, it would no longer be found this way). If I have you right, then that makes sense and looks good.
> This becomes important if we consider a shared namespace with multiple Airflow worker pods in it. It becomes even more important if we have the same dags/tasks/scheduled runs.

Makes sense. But that does not sound like a good idea!
Separate note... airflow-worker is a misnomer, right? That makes it sound like a Celery worker... though I have also seen it used to refer to k8s exec task pods... and it's not that either...
But yeah, now that you mention it... scheduler job ids are probably just incrementing integers, no? In that case it would not be the most rock-solid of selectors in the shared-namespace scenario. Though I'm sure collisions would generally be very rare, maybe there should be a random cluster identifier created somewhere in the DB with airflow db init that could be used as a more direct way of keeping things separate.
Yeah, that is exactly my plan eventually: generate a random identifier for the instance. Unfortunately, I don't think we have anything we can use for that purpose right now.
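Purely as a sketch of that idea (nothing like this exists in Airflow today; the identifier generation, storage, and label key are all hypothetical), it could be a UUID created once at database initialization and then mixed into the pod labels:

```python
import uuid

# Hypothetical: generate a random instance identifier once (e.g. during
# `airflow db init`) and persist it in the metadata DB; shown here in memory only.
instance_id = uuid.uuid4().hex

# Hypothetical label scheme: scoping selectors by instance id would keep two
# Airflow deployments sharing a namespace from matching each other's pods.
label_selector = f"airflow-instance={instance_id},airflow-worker=42"
print(label_selector)
```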
ti.refresh_from_db()
assert ti.state == State.SCHEDULED
assert mock_kube_client.list_namespaced_pod.call_count == 2
mock_kube_client.list_namespaced_pod.assert_any_call(
I tried assert_has_calls, where you can specify the exact call list (which would be ever so marginally more direct than the approach here), but apparently it does not work with MagicMock, only Mock.
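For reference, here is a small standalone illustration of the assert_any_call style used in the test above; the client and call arguments are placeholders, not the real test fixture.

```python
from unittest import mock

mock_kube_client = mock.MagicMock()

# Simulate the two list calls the code under test would make.
mock_kube_client.list_namespaced_pod("airflow", label_selector="airflow-worker=1")
mock_kube_client.list_namespaced_pod("airflow", label_selector="airflow-worker=2")

assert mock_kube_client.list_namespaced_pod.call_count == 2
# assert_any_call passes if any recorded call matches, regardless of order.
mock_kube_client.list_namespaced_pod.assert_any_call(
    "airflow", label_selector="airflow-worker=2"
)
```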
The PR most likely needs to run the full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly, please rebase it to the latest main at your convenience, or amend the last commit of the PR and push it with --force-with-lease.
We need to use the job_id used to queue the TI, not the current scheduler's job_id. These can differ naturally with HA schedulers and with scheduler restarts (clearing "queued but not launched TIs" happens before adoption). (cherry picked from commit b80084a)