linux-kernel - Re: [PATCHv4 RESEND 0/3] syscalls,x86: Add execveat() system call

Message-ID: <CALCETrW1bWcRZxJcKS0tdCxy++Y=ndoxeh5HuZrLMLHnk+i5NA@mail.gmail.com>
Date:	Sun, 19 Oct 2014 16:35:05 -0700
From:	Andy Lutomirski <luto@...capital.net>
To:	Al Viro <viro@...iv.linux.org.uk>
Cc:	David Drysdale <drysdale@...gle.com>,
	"Eric W. Biederman" <ebiederm@...ssion.com>,
	Meredydd Luff <meredydd@...atehouse.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Ingo Molnar <mingo@...hat.com>,
	"H. Peter Anvin" <hpa@...or.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Kees Cook <keescook@...omium.org>,
	Arnd Bergmann <arnd@...db.de>, X86 ML <x86@...nel.org>,
	linux-arch <linux-arch@...r.kernel.org>,
	Linux API <linux-api@...r.kernel.org>
Subject: Re: [PATCHv4 RESEND 0/3] syscalls,x86: Add execveat() system call

On Sun, Oct 19, 2014 at 3:42 PM, Al Viro <viro@...iv.linux.org.uk> wrote:
> On Sun, Oct 19, 2014 at 03:16:03PM -0700, Andy Lutomirski wrote:
>
>> Oh, you mean that #!/usr/bin/make -f would turn into /usr/bin/make
>> /dev/fd/3?  That could be interesting, although I can imagine it
>> breaking things, especially if /dev/fd/3 isn't set up like that, e.g.
>> early in boot.
>
> Sigh...  What I mean is that fexecve(fd, ...) would have to put _something_
> into argv when it execs the interpreter of #! file.  Simply because the
> interpreter (which can be anything whatsoever) has no fscking idea what
> to do with some descriptor it has before execve().  Hell, it doesn't have
> any idea *which* descriptor had it been.
>
> You need to put some pathname that would yield your script upon open(2).
> If you bothered to read those patches, you'd see that they do supply
> one, generating it with d_path().  Which isn't particularly reliable.
>
> I'm not sure there's any point putting any of that in the kernel - if
> you *do* have that pathname, you can just pass it.

Hmm.

This issue certainly makes fexecve or execveat less attractive, at
least in cases where d_path won't work.

On the other hand, if you want to run a static binary on a cloexec fd
(or, for that matter, a dynamic binary if you trust the interpreter to
close the extra copy of the fd it gets) in a namespace or chroot where
the binary is invisible, then you need kernel help.
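
[Editor's illustration, not part of the patch set.] A minimal sketch of exec-by-descriptor from user space, assuming a Linux system where Python's os.execve accepts an open file descriptor (checked through os.supports_fd; this typically maps to fexecve(3)). The helper name exec_by_fd is hypothetical. Python opens descriptors close-on-exec by default, so this roughly matches the cloexec case described above.

```python
import os
import sys


def exec_by_fd(fd, argv):
    """Fork, exec the file behind `fd` in the child, return (exit_code, stdout)."""
    if os.execve not in os.supports_fd:
        raise NotImplementedError("os.execve cannot take a file descriptor here")
    r, w = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: route stdout into the pipe, then exec by descriptor.
        try:
            os.dup2(w, 1)
            os.execve(fd, argv, dict(os.environ))
        finally:
            # Only reached if the exec failed.
            os._exit(127)
    # Parent: collect the child's output until EOF, then reap it.
    os.close(w)
    chunks = []
    while True:
        data = os.read(r, 4096)
        if not data:
            break
        chunks.append(data)
    os.close(r)
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status), b"".join(chunks)


if __name__ == "__main__":
    # The binary is named only when it is opened; the exec itself uses the fd.
    fd = os.open(sys.executable, os.O_RDONLY)
    try:
        print(exec_by_fd(fd, [sys.executable, "-c", "print('ran via fd')"]))
    finally:
        os.close(fd)
```

This works for a binary because the kernel needs no pathname once it holds the file. It says nothing about the #! case, where the interpreter still has to be told what to open.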

It's too bad that script interpreters don't have a mechanism to open
their scripts by fd.
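
[Editor's illustration.] Since an interpreter only understands pathnames, the usual workaround is to hand it a /dev/fd/<n> path and make sure descriptor <n> stays open in the child. The sketch below assumes /dev/fd is available (on Linux that means procfs is mounted, which is the early-boot caveat raised earlier in the thread). The names run_script_by_fd and demo are hypothetical.

```python
import os
import subprocess
import sys
import tempfile


def run_script_by_fd(fd, interpreter=sys.executable):
    """Run `interpreter /dev/fd/<fd>`, keeping `fd` open in the child."""
    path = f"/dev/fd/{fd}"
    # pass_fds keeps the descriptor open under the same number in the child.
    return subprocess.run(
        [interpreter, path],
        pass_fds=(fd,),
        capture_output=True,
        text=True,
    )


def demo():
    # Write a tiny script, open it, then unlink it so that no ordinary
    # pathname for it remains; only the descriptor refers to the file.
    tmp_fd, tmp_path = tempfile.mkstemp(suffix=".py")
    with os.fdopen(tmp_fd, "w") as f:
        f.write('print("hello from fd")\n')
    fd = os.open(tmp_path, os.O_RDONLY)
    os.unlink(tmp_path)
    try:
        return run_script_by_fd(fd).stdout
    finally:
        os.close(fd)


if __name__ == "__main__":
    print(demo(), end="")
```

Note that the descriptor has to be inheritable for this to work, which is the opposite of the cloexec case above.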

>
>> Aside from the general scariness of allowing one process to actually
>> dup another process's fds, I feel like this is asking for trouble wrt
>> the various types of file locks.
>
> Who said anything about another process's fds?  That, indeed, would be
> a recipe for serious trouble.  It's a filesystem with one directory,
> not with one directory for each process...
>

This still has issues with locks if you pass an fd to a child process,
but I guess that you get what you ask for if you do that.

> FWIW, they (Plan 9) do have procfs and there they have /proc/<pid>/fd.
> Which is a regular file, with contents consisting of \n-terminated
> lines (one per descriptor in <pid>'s descriptor table) in the same
> format as in *ctl (they put descriptor number as the first field in
> those).
>
> Unlike our solution, they do not allow to get to any process' files via
> procfs.  They do allow /dev/stdin-style access to your own files via
> dupfs.  And yes, for /dev/stdin and friends dup-style semantics is better -
> you get consistent behaviours for pipes and redirects from file that way.
> See the example I've posted upthread.  Besides, for things like sockets
> our semantics simply fails - they really depend on having only one
> struct file for given socket; it's dup or nothing there.  The same goes
> for a lot of things like eventfd, etc.

Fair enough.
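
[Editor's illustration.] The difference between dup-style and reopen-style semantics can be observed from user space on Linux. This is a rough sketch under that assumption; other systems implement /dev/fd differently, and the function names are hypothetical. A dup'ed descriptor shares the file offset with the original, whereas opening /dev/fd/<n> on a regular file creates a new open file with its own offset. For a socket, the reopen is expected to fail outright.

```python
import os
import socket
import tempfile


def offsets_after_read(nbytes=4):
    """Return (offset seen through dup, offset seen through a /dev/fd reopen)."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"0123456789")
        os.lseek(fd, 0, os.SEEK_SET)
        os.read(fd, nbytes)  # advance the offset of the original descriptor
        dup_fd = os.dup(fd)
        reopened = os.open(f"/dev/fd/{fd}", os.O_RDONLY)
        try:
            return (
                os.lseek(dup_fd, 0, os.SEEK_CUR),
                os.lseek(reopened, 0, os.SEEK_CUR),
            )
        finally:
            os.close(dup_fd)
            os.close(reopened)
    finally:
        os.close(fd)
        os.unlink(path)


def socket_reopen_fails():
    """Return True if opening /dev/fd/<n> for a socket raises OSError."""
    a, b = socket.socketpair()
    try:
        try:
            fd = os.open(f"/dev/fd/{a.fileno()}", os.O_RDWR)
        except OSError:
            return True
        os.close(fd)
        return False
    finally:
        a.close()
        b.close()


if __name__ == "__main__":
    print(offsets_after_read())
    print(socket_reopen_fails())
```

On Linux the first call should report the dup'ed offset as advanced and the reopened one as zero, which is the inconsistency between pipes and redirected files mentioned above.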

--Andy