-
Notifications
You must be signed in to change notification settings - Fork 14.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Applied permissions to self._error_file #15947
Conversation
Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about any anything please check our Contribution Guide (https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst)
|
Setting the execution bit looks extremely wrong to me. At the very least this should be |
Agree with @uranusjr . I believe this is quite wrong approach. The right solution should be setting the Umask properly for Airflow deployment. The user that you use with "run as user" should be in the same group as the "airflow user" and umask should be set so that the file is group readable (at least - possibly group writeable). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Proposed alternative solution with correcting airflow deployment, not airflow code.
Could this also be achieved by opening the file as airflow, before switching user and then changing the owner? That way the |
Hmm. Interesting idea :) I think indeed once you have filehandle, changing ownership should have no impact. Worth trying |
Thank you for the tip! So, instead of using os.chmod we would use os.chown, right? This would avoid every user having access to the file. I will change that. Many thanks! |
@kevinbsilva Does it work now? I think this should, but could you confirm? |
@ashb i've tested in our development cluster and it works fine. But it seems a check from GitHub actions did not succeed. It raised a 404 when trying to install the Airflow. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed os.chmod to os.chown
@kevinbsilva I think it received a 404 because the URL referenced @potiuk can you take a look and confirm whether your requested changes have been addressed? |
@kevinbsilva Could you rebase these changes to latest main branch please. |
Alternate alternate approach has been done.
The PR most likely needs to run full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly - please rebase it to the latest main at your convenience, or amend the last commit of the PR, and push it with --force-with-lease. |
@kevinbsilva looks like these are legit test failures that need to be addressed https://github.com/apache/airflow/pull/15947/checks?check_run_id=2794600747#step:6:14114 |
@jhtimmins i'm looking into it. Thank you, will update as soon as possible. |
@kevinbsilva please install pre-commit. That would be very helpful |
@ephraimbuddy my mistake. I've read about the static checks and after executing some of the pre-commits, it is ok (mainly black and isort). Thank you very much! I am getting errors from the MSSQL Build, but it don't seems that has any connection with the code update. |
MSSQL is a known issue that is being worked on, not connected to your PR |
try: | ||
os.chown(self._error_file.name, getpwnam(self.run_as_user).pw_uid, -1) | ||
except Exception: | ||
pass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should either bubble the exception or log that the chown failed. Personally I’d prefer not catching the exception for now and fix UX issues when someone complains.
Used os.chmod with 0o0777 umask to avoid Permission Denied errors when accessing the temporary error file.
Instead of using os.chmod and applying permissions to the error file, we are setting the file owner according to the variable self.run_as_user. This would avoid setting hardcoded permissions for the log file while maintaining the correct permissions to the user responsible for the DAG to manipulate the log file.
…other than a string
Is it just me, or this breaks run_as_user completely? How can airflow run chown without sudo? Am I missing something? |
It does look suspiciously wrong here. I think it will only work is airflow is run as root |
@kacpwoja - maybe you can add PR to fix it (unless someone explains why it WOULD work of course) |
Yes, exactly as you said, it only works when airflow is run as root. When it's not, the tasks fail before they even start. I think we can use subprocess.call to invoke chown with sudo instead of using os.chown, just like a few lines above. I'll make a PR to fix this |
Oh yeah, that is my bad. I entirely forgot that even if you own the file currently you can't change the owner unless you are root 🤦🏻 |
Airflow might not be able to sudo as root, but only sudo as a specific user :( |
I think the only solution is to create the file in a separate |
But airflow already needs sudo permissions for chown anyway, as it uses it 20 lines before in the same file: If I recall correctly, Airflow also needs sudo permissions for rm and kill. |
Ah yeah. I looked at the docs https://airflow.apache.org/docs/apache-airflow/2.2.2/security/workload.html and we even mention that those permissions to change ownership of files (and root equivalence) is needed. |
As explained at the issue #15946 the Web UI does not show the log for tasks that failed when the owner of the DAG is not airflow or the DAG is being executed by another user (run_as_user parameter). When airflow is dumping the error to the temporary file, it opens the file for writing and raises a Permission Denied error, since it's created with restricted permissions. Any default user cannot write into the file, only read.
The NamedTemporaryFile created at the base_task_runner creates the file with the umask 600 and that module does not allow to pass custom permissions when creating the object.
To avoid this problem, i've added a os.chmod after creating the NamedTemporaryFile at the base_task_runner module. After that, no Permission Denied is thrown when a task fails and the log is accessible through the Web UI.
No major changes were made, only the chmod being applied solved the problem.
Closes #15946