linux-kernel - Re: [PATCH] allow execve'ing "/proc/self/exe" even if /proc is not mounted

Message-Id: <20090624162125.a3a9b2c4.akpm@linux-foundation.org>
Date:	Wed, 24 Jun 2009 16:21:25 -0700
From:	Andrew Morton <akpm@...ux-foundation.org>
To:	Denys Vlasenko <vda.linux@...glemail.com>
Cc:	linux-kernel@...r.kernel.org, vapier@...too.org
Subject: Re: [PATCH] allow execve'ing "/proc/self/exe" even if /proc is not
  mounted

On Thu, 25 Jun 2009 01:00:56 +0200
Denys Vlasenko <vda.linux@...glemail.com> wrote:

> In some circumstances a running process needs to re-execute
> its image.
> 
> Among other useful cases, it is _crucial_ for NOMMU arches.
> 
> They need it to perform daemonization. The classic sequence
> of "fork, parent dies, child continues" can't be used
> due to the lack of fork on NOMMU, and instead we have to do
> "vfork, child re-execs itself (with a flag to not daemonize)
> and thereby unblocks the parent, parent dies".
> 
> Another crucial use case on NOMMU is POSIX shell support.
> Imagine a shell command of the form "func1 | func2 | func3".
> This can be implemented on NOMMU by vforking thrice,
> re-executing the shell in every child in the form
> "<shell> -c 'body of funcN'", and letting parent wait and collect
> exitcodes and such. As far as I can see, it's the only way
> to implement it correctly on NOMMU.
> 
> The program may re-execute itself by name if it knows the name,
> but in general it cannot be sure of it. The binary may be renamed,
> or even deleted, while it is running.
> 
> A more elegant way is to execute /proc/self/exe.
> This works just fine as long as /proc is mounted.
> 
> But it breaks if /proc isn't mounted, and this can happen in real-world
> usage. For example, when a shell is invoked very early in initrd/initramfs.

Why can't userspace mount /proc before doing the daemonization?

> With this patch, it is possible to execute /proc/self/exe
> even if /proc is not mounted. In the below example,
> ./sh is a static shell binary:
> 
> # chroot . ./sh
> / # echo $0
> ./sh
> / # . /proc/self/exe
> hush: /proc/self/exe: No such file or directory
> / # /proc/self/exe   <==========
> / # echo $0
> /proc/self/exe
> / # exit
> / # exit
> #
> 
> On an unpatched kernel, the command marked with <=== would fail.
> 
> How the patch does it: when the execve syscall discovers that opening the
> binary image fails, a small bit of added code special-cases the
> "/proc/self/exe" string. If the binary name is *exactly* that string, and
> the error is ENOENT or EACCES, then exec still succeeds, using the current
> binary's image.
> 
> Please apply.
> 
>
> diff -urp ../linux-2.6.30.org/fs/exec.c linux-2.6.30/fs/exec.c
> --- ../linux-2.6.30.org/fs/exec.c	2009-06-10 05:05:27.000000000 +0200
> +++ linux-2.6.30/fs/exec.c	2009-06-25 00:20:13.000000000 +0200
> @@ -652,9 +652,25 @@ struct file *open_exec(const char *name)
>  	file = do_filp_open(AT_FDCWD, name,
>  				O_LARGEFILE | O_RDONLY | FMODE_EXEC, 0,
>  				MAY_EXEC | MAY_OPEN);
> -	if (IS_ERR(file))
> -		goto out;
> +	if (IS_ERR(file)) {
> +		if ((PTR_ERR(file) == -ENOENT || PTR_ERR(file) == -EACCES)
> +		 && strcmp(name, "/proc/self/exe") == 0
> +		) {
> +			struct file *sv = file;
> +			struct mm_struct *mm;
>  
> +			mm = get_task_mm(current);
> +			if (!mm)
> +				goto out;
> +			file = get_mm_exe_file(mm);
> +			mmput(mm);
> +			if (file)
> +				goto ok;
> +			file = sv;
> +		}
> +		goto out;
> +	}
> +ok:
>  	err = -EACCES;
>  	if (!S_ISREG(file->f_path.dentry->d_inode->i_mode))
>  		goto exit;

Oh geeze.  Hard-coded "/proc/self/exe" in the middle of the core exec
code?  You're a brave man.

Relatively minor observations:

- The code layout is weird

- This hack should be hidden in a separate function, not splattered
  all over the middle of open_exec().

- That function should be documented in a way which will permit
  readers to understand why it exists.


But don't do any of that yet.  This will be an unpopular patch and I
fear for its future ;)
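
To illustrate the "separate, documented function" suggestion above, the
decision part of the quoted hunk can be modelled in plain userspace C.
This is only a sketch under stated assumptions: should_reuse_current_exe()
is a made-up name, it takes the negative errno value the way PTR_ERR()
would return it, and it models just the condition, not the
get_task_mm()/get_mm_exe_file() lookup from the patch.

```c
#include <errno.h>
#include <stdbool.h>
#include <string.h>

/*
 * Userspace model (hypothetical name) of the condition in the quoted
 * patch: fall back to the current binary's image only when the open
 * failed with ENOENT or EACCES *and* the requested name is exactly
 * "/proc/self/exe". "err" is a negative errno value.
 */
bool should_reuse_current_exe(const char *name, int err)
{
	if (err != -ENOENT && err != -EACCES)
		return false;
	return strcmp(name, "/proc/self/exe") == 0;
}
```

Keeping the test in one small function makes the special case easy to
find and to document, which is the point of the review comment.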
