Message-ID: <0ed69079-9e13-a0f4-776c-1f24faa9daec@redhat.com>
Date: Thu, 26 Aug 2021 23:47:07 +0200
From: David Hildenbrand <david@...hat.com>
To: Andy Lutomirski <luto@...nel.org>,
Linus Torvalds <torvalds@...ux-foundation.org>
Cc: "Eric W. Biederman" <ebiederm@...ssion.com>,
David Laight <David.Laight@...lab.com>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
"H. Peter Anvin" <hpa@...or.com>,
Al Viro <viro@...iv.linux.org.uk>,
Alexey Dobriyan <adobriyan@...il.com>,
Steven Rostedt <rostedt@...dmis.org>,
"Peter Zijlstra (Intel)" <peterz@...radead.org>,
Arnaldo Carvalho de Melo <acme@...nel.org>,
Mark Rutland <mark.rutland@....com>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
Jiri Olsa <jolsa@...hat.com>,
Namhyung Kim <namhyung@...nel.org>,
Petr Mladek <pmladek@...e.com>,
Sergey Senozhatsky <sergey.senozhatsky@...il.com>,
Andy Shevchenko <andriy.shevchenko@...ux.intel.com>,
Rasmus Villemoes <linux@...musvillemoes.dk>,
Kees Cook <keescook@...omium.org>,
Greg Ungerer <gerg@...ux-m68k.org>,
Geert Uytterhoeven <geert@...ux-m68k.org>,
Mike Rapoport <rppt@...nel.org>,
Vlastimil Babka <vbabka@...e.cz>,
Vincenzo Frascino <vincenzo.frascino@....com>,
Chinwen Chang <chinwen.chang@...iatek.com>,
Michel Lespinasse <walken@...gle.com>,
Catalin Marinas <catalin.marinas@....com>,
"Matthew Wilcox (Oracle)" <willy@...radead.org>,
Huang Ying <ying.huang@...el.com>,
Jann Horn <jannh@...gle.com>, Feng Tang <feng.tang@...el.com>,
Kevin Brodsky <Kevin.Brodsky@....com>,
Michael Ellerman <mpe@...erman.id.au>,
Shawn Anastasio <shawn@...stas.io>,
Steven Price <steven.price@....com>,
Nicholas Piggin <npiggin@...il.com>,
Christian Brauner <christian.brauner@...ntu.com>,
Jens Axboe <axboe@...nel.dk>,
Gabriel Krisman Bertazi <krisman@...labora.com>,
Peter Xu <peterx@...hat.com>,
Suren Baghdasaryan <surenb@...gle.com>,
Shakeel Butt <shakeelb@...gle.com>,
Marco Elver <elver@...gle.com>,
Daniel Jordan <daniel.m.jordan@...cle.com>,
Nicolas Viennot <Nicolas.Viennot@...sigma.com>,
Thomas Cedeno <thomascedeno@...gle.com>,
Collin Fijalkovich <cfijalkovich@...gle.com>,
Michal Hocko <mhocko@...e.com>,
Miklos Szeredi <miklos@...redi.hu>,
Chengguang Xu <cgxu519@...ernel.net>,
Christian König <ckoenig.leichtzumerken@...il.com>,
"linux-unionfs@...r.kernel.org" <linux-unionfs@...r.kernel.org>,
Linux API <linux-api@...r.kernel.org>,
the arch/x86 maintainers <x86@...nel.org>,
linux-fsdevel@...r.kernel.org, Linux-MM <linux-mm@...ck.org>,
Florian Weimer <fweimer@...hat.com>,
Michael Kerrisk <mtk.manpages@...il.com>
Subject: Re: [PATCH v1 0/7] Remove in-tree usage of MAP_DENYWRITE
On 26.08.21 19:48, Andy Lutomirski wrote:
> On Fri, Aug 13, 2021, at 5:54 PM, Linus Torvalds wrote:
>> On Fri, Aug 13, 2021 at 2:49 PM Andy Lutomirski <luto@...nel.org> wrote:
>>>
>>> I’ll bite. How about we attack this in the opposite direction: remove the deny write mechanism entirely.
>>
>> I think that would be ok, except I can see somebody relying on it.
>>
> >> It's broken, it's stupid, but we've done that ETXTBSY for a _loong_ time.
>
> Someone off-list just pointed something out to me, and I think we should push harder to remove ETXTBSY. Specifically, we've all been focused on open() failing with ETXTBSY, and it's easy to make fun of anyone opening a running program for write when they should be unlinking and replacing it.
>
> Alas, Linux's implementation of deny_write_access() is correct^Wabsurd, and deny_write_access() *also* returns ETXTBSY if the file is open for write. So, in a multithreaded program, one thread does:
>
> fd = open("some exefile", O_RDWR | O_CREAT | O_CLOEXEC);
> write(fd, some stuff);
>
> <--- problem is here
>
> close(fd);
> execve("some exefile");
>
> Another thread does:
>
> fork();
> execve("something else");
>
> In between fork and execve, there's another copy of the open file description, and i_writecount is held, and the execve() fails. Whoops. See, for example:
>
> https://github.com/golang/go/issues/22315
>
> I propose we get rid of deny_write_access() completely to solve this.
>
> Getting rid of i_writecount itself seems a bit harder, since a handful of filesystems use it for clever reasons.
>
> (OFD locks seem like they might have the same problem. Maybe we should have a clone() flag to unshare the file table and close close-on-exec things?)
>
This issue isn't new (it has been known since 2017), nor is it relevant
in practice, so there is no need to hurry IMHO. One step at a time: it
might make perfect sense to remove ETXTBSY, but we have to be careful
not to break other user space that actually relies on the current
behavior.
--
Thanks,
David / dhildenb