lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAHk-=whf9qLO8ipps4QhmS0BkM8mtWJhvnuDSdtw5gFjhzvKNA@mail.gmail.com>
Date: Tue, 9 Jan 2024 19:54:30 -0800
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Josh Triplett <josh@...htriplett.org>, Al Viro <viro@...iv.linux.org.uk>
Cc: Kees Cook <kees@...nel.org>, Kees Cook <keescook@...omium.org>, linux-kernel@...r.kernel.org, 
	Alexey Dobriyan <adobriyan@...il.com>
Subject: Re: [GIT PULL] execve updates for v6.8-rc1

On Tue, 9 Jan 2024 at 18:21, Josh Triplett <josh@...htriplett.org> wrote:
>
> Instead, here are some numbers from Linus's suggested benchmark
> (modified to use execvpe, and to count down rather than up so it doesn't
> need two arguments; modified version and benchmark driver script
> attached; compiled with `musl-gcc -Wall -O3 -s -static`):

Yeah, that's better. I had actually only benchmarked the success case.

And I see what the problem is: the "sane" way that only opens the
pathname once does so using

        file = do_filp_open(fd, name, &open_exec_flags);

and the "open path twice" does the first open with

        retval = filename_lookup(AT_FDCWD, filename, 0, &path, NULL);

and guess what the difference is?

The difference is that path_openat() starts out with

        file = alloc_empty_file(op->open_flag, current_cred());

and when the open fails, it will free the file with

        fput(file);

So if there are a lot of failures (because "." is at the end of the
path), it will have done a fair amount of those useless file
allocations and frees.

And - not surprisingly - the "open once" is then faster if there are
*not* a lot of failures, when the executable is found early in the
PATH.

Now, there's no fundamental *reason* to do that alloc_empty_file()
early, except for how the code is structured.

It partly makes the error handling simpler and since all the cases
want the filp in the end, doing it at the top means that it's only
done once.

And we occasionally do use the file pointer early (ie lookup_open()
will clear/set FMODE_CREATED in it even if it doesn't otherwise touch
the file pointer) even before the final lookup - and at creation time,
atomic_open() will actually want it for the final lookup.

Anyway, the real fix is definitely to just fix do_filp_open() to at
least not be *too* eager to allocate a file pointer.

In fact, that code is oddly non-optimal for another reason: it does
that "allocate and free file" not just when the path lookup fails, but
it does it for things like RCU lookup failures too.

So what happens is that if RCU lookup fails, do_filp_open() will call
path_openat() twice: first with LOOKUP_RCU, and then without it. And
path_openat() will allocate that "struct file *" twice.

On NFS - or other filesystems that can return ESTALE - it will in fact
do it three times.

That's pretty disgusting.

Al, comments? We *could* just special-case the execve() code not to
use do_filp_open() and avoid this issue that way, but it does feel
like even the regular open() case is pessimal with that whole RCU
situation.

                Linus

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ