
Permission error for error file when run_as_user is used (bug) #17735

Closed
kcphila opened this issue Aug 19, 2021 · 9 comments
Labels
area:core, kind:bug

Comments


kcphila commented Aug 19, 2021

Apache Airflow Version: 2.1.2

Environment:

  • OS: Linux

What Happens:

When the base_task_runner prepares to execute the task, it opens a temporary error file and passes the error file's name into the task. This temp file is created with default permissions, owned by the user running the Airflow worker, which is usually the airflow OS user.

When the task is then run as another user (via run_as_user='me') and fails, the task attempts to handle the failure by opening and writing to the error file within the subprocess, which runs as the subordinate user rather than the airflow user. The open attempt raises a permission error before the error email is sent out, which is particularly bad: when the task fails, no one is notified.
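For context, here is a minimal illustration (my own sketch, not code from Airflow) of why the subprocess cannot open the file: NamedTemporaryFile creates its backing file via mkstemp, which uses mode 0o600, so only the creating user can read or write it:

```python
import os
import stat
from tempfile import NamedTemporaryFile

# mkstemp (used under the hood by NamedTemporaryFile) creates the file
# with mode 0o600: readable/writable only by the user that created it.
error_file = NamedTemporaryFile(delete=True)
mode = stat.S_IMODE(os.stat(error_file.name).st_mode)
print(oct(mode))  # 0o600 -- a run_as_user subprocess cannot open this for writing
```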

What you expect to happen:

Airflow should handle error files without creating its own permission errors.

Options to Correct:

  1. We've temporarily gotten around this by inserting a chmod between lines 87 and 88. This is a temporary error file that is deleted when the task goes out of scope, so, while ugly and inelegant, this is a simple way to avoid the identified problem and probably not a security risk.

     ```diff
      self._error_file = NamedTemporaryFile(delete=True)
     +os.chmod(self._error_file.name, 0o666)
      self._cfg_path = cfg_path
     ```

  2. Alternatively, it looks like there are few uses of the error file by the worker, and the worker does not write to it. It is possible that the only use is in airflow.jobs.LocalTaskJob following task failure. An alternative solution would be to generate the temp error filename without actually creating the file, pass it to the subprocess as a parameter, and then open and read the file if it exists following task failure (sketched after the reference snippet below).

Reference Line of Code:

```python
self._error_file = NamedTemporaryFile(delete=True)
```
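A sketch of option 2, assuming the only consumer is the post-failure read; the helper names here are mine, not Airflow's:

```python
import os
import uuid
import tempfile
from typing import Optional

def make_error_file_path() -> str:
    # Reserve a path without creating the file, so the file is created
    # (and therefore owned) by the run_as_user subprocess itself.
    return os.path.join(tempfile.gettempdir(), f"airflow-task-error-{uuid.uuid4().hex}")

def read_error_file(path: str) -> Optional[str]:
    # After task failure, return the error details if the subprocess wrote any.
    # Cleanup would have to happen in the subprocess: with /tmp's sticky bit
    # set, only the file's owner can unlink it.
    if os.path.exists(path):
        with open(path) as f:
            return f.read()
    return None
```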


boring-cyborg bot commented Aug 19, 2021

Thanks for opening your first issue here! Be sure to follow the issue template!

@kcphila kcphila changed the title Permission error for error file when run_as_user is used Permission error for error file when run_as_user is used (bug) Aug 20, 2021
@jedcunningham jedcunningham added area:core kind:bug This is a clearly a bug labels Aug 23, 2021

Hrithiksomani commented Aug 29, 2021

Any update on this? We are also facing the same issue in Airflow 2.0.1.


kcphila commented Sep 1, 2021

> Any update on this? We are also facing the same issue in Airflow 2.0.1.

@Hrithiksomani, I haven't seen anything from the Apache Airflow team aside from the labels being added. When I have some free time, I plan to look into the code more deeply to see whether the second option would be the best solution. On a casual look, I didn't see the error file being used anywhere else, but it may be. Note also that we are using Airflow 2.1.2 and the bug is present there.


potiuk commented Sep 2, 2021

Yeah. We have not looked at it yet, but I think this problem calls for a "deployment" solution rather than an "airflow" one.

Simply put, the "other user" that you run as should be able to write to the file that is created by Airflow. Maybe we should make it clear in the "run as user" documentation that users should be configured this way. With Linux access rights it is not easy to do this in a "general" way from Python code securely, without prior knowledge about the users involved or without being sure that we can use Access Control Lists (which are not available on all file systems / not enabled in all Linux distros, versions, etc.). We should build a secure solution based on the standard POSIX access rights scheme (user, group, other).

A solution that I have in mind (and one that is compatible with the way the "Official Airflow Image" is implemented) is to base it on umask and groups. In the official image, umask is set to 0002, specifically to make any files and folders created accessible (including write access) to the primary group the creating user belongs to: https://github.com/apache/airflow/blob/main/scripts/in_container/prod/entrypoint_prod.sh#L252 - and we have a strict requirement that ANY user that runs airflow should have group 0 ('root') as its primary group.

The 'airflow' user in the image has group '0' (root) set as its default group. This is in order to make the image OpenShift compatible and allow the docker container to run as "any" user (it does not have to be "airflow" as long as it has group set to '0').
More about it here: https://airflow.apache.org/docs/docker-stack/entrypoint.html#allowing-arbitrary-user-to-run-the-container

My proposal is: let's make it a prerequisite that the airflow user and the me user both have the same primary group (which might be set to '0', as it is in the docker image) and that airflow sets umask to 0002, allowing anyone from its primary group to write to any folders/files created by the 'airflow' user.

It does not open security issues (all the necessary root user permissions are user-based, not group-based), and since the umask will be set only for airflow, it does not affect root-created files and folders; at the same time, all the logs/files/folders that airflow creates become writable for the me user.

WDYT? Would it be applicable to your cases? Any concerns with this approach, @kcphila @Hrithiksomani? While it's not a "super-generic" solution, I think it is workable and can be applied to your systems even now, without even upgrading Airflow.
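A minimal sketch of what the umask part of that proposal does (my illustration, not Airflow code) — note the caveat that mkstemp-created temp files stay at 0o600 regardless of umask, since a umask can only remove permission bits:

```python
import os
import stat

os.umask(0o002)  # as set in the official image's entrypoint

# Regular files created by the worker now come out group-writable,
# so a run_as_user process sharing the primary group can write to them.
path = "/tmp/airflow-umask-demo"
with open(path, "w") as f:  # open() requests 0o666, masked to 0o664
    f.write("hello")
print(oct(stat.S_IMODE(os.stat(path).st_mode)))  # 0o664
os.remove(path)

# Caveat: NamedTemporaryFile/mkstemp requests 0o600, and 0o600 & ~0o002
# is still 0o600, so the error file itself would not become group-writable.
```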


potiuk commented Sep 3, 2021

I plan to update the docs around "run_as_user" to make this a "recommended practice" soon. Please let me know if you see any problems with it :)


ashb commented Sep 3, 2021

Duplicate of #15947

@ashb ashb marked this as a duplicate of #15947 Sep 3, 2021
@ashb ashb closed this as completed Sep 3, 2021

ashb commented Sep 3, 2021

We can fix this by changing the permission on the file after it's opened -- by doing that, the parent process already has an open file handle so it can read it, and the child/sudo'd process can write to it.
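A sketch of that approach (my reading of the fix in #15947, not a verbatim copy of the merged change):

```python
import os
from tempfile import NamedTemporaryFile

# The parent (airflow) process creates and opens the temp file as usual...
error_file = NamedTemporaryFile(delete=True)

# ...then widens the permissions on the path. The parent keeps its already
# open handle for reading later, while the run_as_user child can now open
# the same path for writing.
os.chmod(error_file.name, 0o666)
```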


kcphila commented Sep 3, 2021

Hi @potiuk, thanks for the response. However, this would not work for us and is deeply problematic. It essentially requires us to completely change our entire user and group management policy... so that Airflow can create a temporary file to log error details for a few minutes at a time. Remember that this issue is solely about a temporary file. I just started setting up Airflow to replace existing cron jobs, and I can't believe this is a realistic expectation.

Here's a use case that may be useful.

User1 works with sensitive individual data about children in the foster care system. They need a task to run overnight to pull data from a data system only they have access to and put it somewhere on the filesystem that only their primary work group has access to.

User2 works with sensitive individual credit data. They need a task to run overnight to pull data from a data system only they have access to and put it somewhere on the filesystem that only their primary work group has access to. They do not share groups with User1 and should not be able to see anything owned by User1 or their group.

The data engineer who manages Airflow sets up one task to run as User1 and one task to run as User2. Neither User1 nor User2 is, or should be, managing Airflow, so they should not be in the airflow group.

Airflow handles the above model just fine, except for the one temporary error file highlighted by my post. So, I'm looking more closely at the code. The only other places this temporary file is used are to read its contents (line 105, which calls airflow.models.taskinstance.load_error_file, which just reads it once) and to close it, which deletes it (line 178).

..... and @ashb just commented, and I see that issue #15947 is indeed the same - and fixed there. Thank you!


potiuk commented Sep 3, 2021

Yeah. Better way indeed.
