lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <a6dbc5b3-b12e-36b4-0aef-f319264d6e8f@redhat.com>
Date:   Wed, 1 Sep 2021 10:28:00 +0200
From:   David Hildenbrand <david@...hat.com>
To:     "Eric W. Biederman" <ebiederm@...ssion.com>
Cc:     Andy Lutomirski <luto@...nel.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        David Laight <David.Laight@...lab.com>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        "H. Peter Anvin" <hpa@...or.com>,
        Al Viro <viro@...iv.linux.org.uk>,
        Alexey Dobriyan <adobriyan@...il.com>,
        Steven Rostedt <rostedt@...dmis.org>,
        "Peter Zijlstra (Intel)" <peterz@...radead.org>,
        Arnaldo Carvalho de Melo <acme@...nel.org>,
        Mark Rutland <mark.rutland@....com>,
        Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
        Jiri Olsa <jolsa@...hat.com>,
        Namhyung Kim <namhyung@...nel.org>,
        Petr Mladek <pmladek@...e.com>,
        Sergey Senozhatsky <sergey.senozhatsky@...il.com>,
        Andy Shevchenko <andriy.shevchenko@...ux.intel.com>,
        Rasmus Villemoes <linux@...musvillemoes.dk>,
        Kees Cook <keescook@...omium.org>,
        Greg Ungerer <gerg@...ux-m68k.org>,
        Geert Uytterhoeven <geert@...ux-m68k.org>,
        Mike Rapoport <rppt@...nel.org>,
        Vlastimil Babka <vbabka@...e.cz>,
        Vincenzo Frascino <vincenzo.frascino@....com>,
        Chinwen Chang <chinwen.chang@...iatek.com>,
        Michel Lespinasse <walken@...gle.com>,
        Catalin Marinas <catalin.marinas@....com>,
        "Matthew Wilcox (Oracle)" <willy@...radead.org>,
        Huang Ying <ying.huang@...el.com>,
        Jann Horn <jannh@...gle.com>, Feng Tang <feng.tang@...el.com>,
        Kevin Brodsky <Kevin.Brodsky@....com>,
        Michael Ellerman <mpe@...erman.id.au>,
        Shawn Anastasio <shawn@...stas.io>,
        Steven Price <steven.price@....com>,
        Nicholas Piggin <npiggin@...il.com>,
        Christian Brauner <christian.brauner@...ntu.com>,
        Jens Axboe <axboe@...nel.dk>,
        Gabriel Krisman Bertazi <krisman@...labora.com>,
        Peter Xu <peterx@...hat.com>,
        Suren Baghdasaryan <surenb@...gle.com>,
        Shakeel Butt <shakeelb@...gle.com>,
        Marco Elver <elver@...gle.com>,
        Daniel Jordan <daniel.m.jordan@...cle.com>,
        Nicolas Viennot <Nicolas.Viennot@...sigma.com>,
        Thomas Cedeno <thomascedeno@...gle.com>,
        Collin Fijalkovich <cfijalkovich@...gle.com>,
        Michal Hocko <mhocko@...e.com>,
        Miklos Szeredi <miklos@...redi.hu>,
        Chengguang Xu <cgxu519@...ernel.net>,
        Christian König <ckoenig.leichtzumerken@...il.com>,
        "linux-unionfs@...r.kernel.org" <linux-unionfs@...r.kernel.org>,
        Linux API <linux-api@...r.kernel.org>,
        the arch/x86 maintainers <x86@...nel.org>,
        linux-fsdevel@...r.kernel.org, Linux-MM <linux-mm@...ck.org>,
        Florian Weimer <fweimer@...hat.com>,
        Michael Kerrisk <mtk.manpages@...il.com>
Subject: Re: [PATCH v1 0/7] Remove in-tree usage of MAP_DENYWRITE

On 27.08.21 00:13, Eric W. Biederman wrote:
> David Hildenbrand <david@...hat.com> writes:
> 
>> On 26.08.21 19:48, Andy Lutomirski wrote:
>>> On Fri, Aug 13, 2021, at 5:54 PM, Linus Torvalds wrote:
>>>> On Fri, Aug 13, 2021 at 2:49 PM Andy Lutomirski <luto@...nel.org> wrote:
>>>>>
>>>>> I’ll bite.  How about we attack this in the opposite direction: remove the deny write mechanism entirely.
>>>>
>>>> I think that would be ok, except I can see somebody relying on it.
>>>>
>>>> It's broken, it's stupid, but we've done that ETXTBUSY for a _loong_ time.
>>>
>>> Someone off-list just pointed something out to me, and I think we should push harder to remove ETXTBSY.  Specifically, we've all been focused on open() failing with ETXTBSY, and it's easy to make fun of anyone opening a running program for write when they should be unlinking and replacing it.
>>>
>>> Alas, Linux's implementation of deny_write_access() is correct^Wabsurd, and deny_write_access() *also* returns ETXTBSY if the file is open for write.  So, in a multithreaded program, one thread does:
>>>
>>> fd = open("some exefile", O_RDWR | O_CREAT | O_CLOEXEC);
>>> write(fd, some stuff);
>>>
>>> <--- problem is here
>>>
>>> close(fd);
>>> execve("some exefile");
>>>
>>> Another thread does:
>>>
>>> fork();
>>> execve("something else");
>>>
>>> In between fork and execve, there's another copy of the open file description, and i_writecount is held, and the execve() fails.  Whoops.  See, for example:
>>>
>>> https://github.com/golang/go/issues/22315
>>>
>>> I propose we get rid of deny_write_access() completely to solve this.
>>>
>>> Getting rid of i_writecount itself seems a bit harder, since a handful of filesystems use it for clever reasons.
>>>
>>> (OFD locks seem like they might have the same problem.  Maybe we should have a clone() flag to unshare the file table and close close-on-exec things?)
>>>
>>
>> It's not like this issue is new (^2017) or relevant in practice. So no
>> need to hurry IMHO. One step at a time: it might make perfect sense to
>> remove ETXTBSY, but we have to be careful to not break other user
>> space that actually cares about the current behavior in practice.
> 
> It is an old enough issue that I agree there is no need to hurry.
> 
> I also ran into this issue not too long ago when I refactored the
> usermode_driver code.  My challenge was not being in userspace
> the delayed fput was not happening in my kernel thread.  Which meant
> that writing the file, then closing the file, then execing the file
> consistently reported -ETXTBSY.
> 
> The kernel code wound up doing:
> 	/* Flush delayed fput so exec can open the file read-only */
> 	flush_delayed_fput();
> 	task_work_run();
> 
> As I read the code the delay for userspace file descriptors is
> always done with task_work_add, so userspace should not hit
> that kind of silliness, and should be able to actually close
> the file descriptor before the exec.
> 
> 
> On the flip side, I don't know how anything can depend upon getting an
> -ETXTBSY.  So I don't think there is any real risk of breaking userspace
> if we remove it.

At least in LTP, we have two test cases testing exactly that behavior:

testcases/kernel/syscalls/creat/creat07.c
testcases/kernel/syscalls/execve/execve04.c


-- 
Thanks,

David / dhildenb

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ