[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87ed68wz6j.fsf@email.froward.int.ebiederm.org>
Date: Wed, 28 Aug 2024 08:32:20 -0500
From: "Eric W. Biederman" <ebiederm@...ssion.com>
To: Theodore Dubois <tblodt@...oud.com>
Cc: linux-kernel@...r.kernel.org, Ryan Houdek <sonicadvance1@...il.com>,
"Guilherme G. Piccoli" <gpiccoli@...lia.com>, David Hildenbrand
<david@...hat.com>, Kees Cook <keescook@...omium.org>
Subject: Re: [PATCH] prctl: allow prctl_set_mm_exe_file without unmapping
old exe
Theodore Dubois <tblodt@...oud.com> writes:
> As far as I can tell, the original purpose of this check was simply as
> the easiest way to work with a quirk of /proc/self/exe at the time. From
> the original patch[1]:
>
> Note it allows to change /proc/$pid/exe iif there
> are no VM_EXECUTABLE vmas present for current process,
> simply because this feature is a special to C/R
> and mm::num_exe_file_vmas become meaningless after
> that.
>
> num_exe_file_vmas was created to preserve a quirk of the original
> /proc/self/exe implementation: if you unmapped all executable VMAs,
> /proc/self/exe would disappear (because it worked by scanning the
> address space for the first executable VMA.) Keeping the quirk after
> switching to just saving the executable on the mm worked by keeping a
> count of executable VMAs in num_exe_file_vmas, and zeroing exe_file when
> it reached zero. You can probably see how it would have been annoying to
> handle both num_exe_file_vmas and this prctl intending to change
> exe_file, and it's easier to only allow changing exe_file after
> num_exe_file_vmas has already gone to 0 and made exe_file null.
>
> However, num_exe_file_vmas no longer exists[2]. This quirk was taken out
> because it would save a bit in the vma flags, and it seems clear by now
> that nobody was relying on it. These days you can simply update exe_file
> with no interference.
>
> Recently a use case for this prctl has come up outside of
> checkpoint/restore, namely binfmt_misc based emulators such as FEX[3].
> Any program that uses /proc/self/exe will, of course, expect it to point
> to its own executable. But when executed through binfmt_misc, it will be
> the emulator, resulting in compatibility issues. Emulators currently
> have to attempting to intercept syscalls targeting /proc/self/exe to
> redirect the path, and this is not possible in the general case
> considering how flexible path resolution is. For more detail on this see
> [3].
>
> The above seems to me like a solid case for simply dropping the check.
> It's also worth noting that it is already possible to achieve the same
> result by the laborious and complex process of just unmapping all your
> code and remapping it again after the switch (just remember to put the
> code that does this in a .so!), so this is not strictly allowing
> anything that wasn't allowed before. It's just cutting red tape.
One of my original concerns is that allowing changing the /proc/self/exe
has the potential to make /proc/self/exe unreliable and specifically it
has the potential for a rouge program to hide itself by setting a false
/proc/self/exe.
That is part of the reason for the red tape.
Maybe I am wrong but I am concerned that your change may be making it
too easy to change /proc/self/exe, and would too easily allow setting a
false /proc/self/exe.
So it may make better sense to have a special case for interpreters,
so we don't have to worry about people setting a false /proc/self/exe.
Looking at the code I am a bit perplexed at the moment as I don't see
a check currently to ensure the new exe_file is actually mapped.
...
Beyond that your change reduces replace_mm_exe_file to set_mm_exe_file
so it would probably make sense to combine the two of them if you
are going to go this far.
Eric
> [1]: https://lore.kernel.org/lkml/20120316210343.925446961@openvz.org/
> [2]: https://lore.kernel.org/lkml/20120731104230.20515.72416.stgit@zurg/
> [3]: https://lore.kernel.org/lkml/CABnRqDdzqfB1_ixd-2JnfSocKvXNM+9ivM1hhd1C=ejLQyen8g@mail.gmail.com/
>
> Signed-off-by: Theodore Dubois <tblodt@...oud.com>
> Cc: Ryan Houdek <sonicadvance1@...il.com>
> Cc: Guilherme G. Piccoli <gpiccoli@...lia.com>
> Cc: David Hildenbrand <david@...hat.com>
> Cc: Eric W. Biederman <ebiederm@...ssion.com>
> Cc: Kees Cook <keescook@...omium.org>
> ---
> kernel/fork.c | 22 ----------------------
> 1 file changed, 22 deletions(-)
>
> diff --git a/kernel/fork.c b/kernel/fork.c
> index cc760491f..407e515b9 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -1430,30 +1430,8 @@ int set_mm_exe_file(struct mm_struct *mm, struct file *new_exe_file)
> */
> int replace_mm_exe_file(struct mm_struct *mm, struct file *new_exe_file)
> {
> - struct vm_area_struct *vma;
> - struct file *old_exe_file;
> int ret = 0;
>
> - /* Forbid mm->exe_file change if old file still mapped. */
> - old_exe_file = get_mm_exe_file(mm);
> - if (old_exe_file) {
> - VMA_ITERATOR(vmi, mm, 0);
> - mmap_read_lock(mm);
> - for_each_vma(vmi, vma) {
> - if (!vma->vm_file)
> - continue;
> - if (path_equal(&vma->vm_file->f_path,
> - &old_exe_file->f_path)) {
> - ret = -EBUSY;
> - break;
> - }
> - }
> - mmap_read_unlock(mm);
> - fput(old_exe_file);
> - if (ret)
> - return ret;
> - }
> -
> get_file(new_exe_file);
>
> /* set the new file */
Powered by blists - more mailing lists