Message-Id: <b60e9bd1-7232-472d-9c9c-1d6593e9e85e@www.fastmail.com>
Date: Thu, 26 Aug 2021 10:48:55 -0700
From: "Andy Lutomirski" <luto@...nel.org>
To: "Linus Torvalds" <torvalds@...ux-foundation.org>
Cc: "Eric W. Biederman" <ebiederm@...ssion.com>,
"David Laight" <David.Laight@...lab.com>,
"David Hildenbrand" <david@...hat.com>,
"Linux Kernel Mailing List" <linux-kernel@...r.kernel.org>,
"Andrew Morton" <akpm@...ux-foundation.org>,
"Thomas Gleixner" <tglx@...utronix.de>,
"Ingo Molnar" <mingo@...hat.com>, "Borislav Petkov" <bp@...en8.de>,
"H. Peter Anvin" <hpa@...or.com>,
"Al Viro" <viro@...iv.linux.org.uk>,
"Alexey Dobriyan" <adobriyan@...il.com>,
"Steven Rostedt" <rostedt@...dmis.org>,
"Peter Zijlstra (Intel)" <peterz@...radead.org>,
"Arnaldo Carvalho de Melo" <acme@...nel.org>,
"Mark Rutland" <mark.rutland@....com>,
"Alexander Shishkin" <alexander.shishkin@...ux.intel.com>,
"Jiri Olsa" <jolsa@...hat.com>,
"Namhyung Kim" <namhyung@...nel.org>,
"Petr Mladek" <pmladek@...e.com>,
"Sergey Senozhatsky" <sergey.senozhatsky@...il.com>,
"Andy Shevchenko" <andriy.shevchenko@...ux.intel.com>,
"Rasmus Villemoes" <linux@...musvillemoes.dk>,
"Kees Cook" <keescook@...omium.org>,
"Greg Ungerer" <gerg@...ux-m68k.org>,
"Geert Uytterhoeven" <geert@...ux-m68k.org>,
"Mike Rapoport" <rppt@...nel.org>,
"Vlastimil Babka" <vbabka@...e.cz>,
"Vincenzo Frascino" <vincenzo.frascino@....com>,
"Chinwen Chang" <chinwen.chang@...iatek.com>,
"Michel Lespinasse" <walken@...gle.com>,
"Catalin Marinas" <catalin.marinas@....com>,
"Matthew Wilcox (Oracle)" <willy@...radead.org>,
"Huang Ying" <ying.huang@...el.com>,
"Jann Horn" <jannh@...gle.com>, "Feng Tang" <feng.tang@...el.com>,
"Kevin Brodsky" <Kevin.Brodsky@....com>,
"Michael Ellerman" <mpe@...erman.id.au>,
"Shawn Anastasio" <shawn@...stas.io>,
"Steven Price" <steven.price@....com>,
"Nicholas Piggin" <npiggin@...il.com>,
"Christian Brauner" <christian.brauner@...ntu.com>,
"Jens Axboe" <axboe@...nel.dk>,
"Gabriel Krisman Bertazi" <krisman@...labora.com>,
"Peter Xu" <peterx@...hat.com>,
"Suren Baghdasaryan" <surenb@...gle.com>,
"Shakeel Butt" <shakeelb@...gle.com>,
"Marco Elver" <elver@...gle.com>,
"Daniel Jordan" <daniel.m.jordan@...cle.com>,
"Nicolas Viennot" <Nicolas.Viennot@...sigma.com>,
"Thomas Cedeno" <thomascedeno@...gle.com>,
"Collin Fijalkovich" <cfijalkovich@...gle.com>,
"Michal Hocko" <mhocko@...e.com>,
"Miklos Szeredi" <miklos@...redi.hu>,
"Chengguang Xu" <cgxu519@...ernel.net>,
Christian König <ckoenig.leichtzumerken@...il.com>,
"linux-unionfs@...r.kernel.org" <linux-unionfs@...r.kernel.org>,
"Linux API" <linux-api@...r.kernel.org>,
"the arch/x86 maintainers" <x86@...nel.org>,
"<linux-fsdevel@...r.kernel.org>" <linux-fsdevel@...r.kernel.org>,
Linux-MM <linux-mm@...ck.org>,
"Florian Weimer" <fweimer@...hat.com>,
"Michael Kerrisk" <mtk.manpages@...il.com>
Subject: Re: [PATCH v1 0/7] Remove in-tree usage of MAP_DENYWRITE
On Fri, Aug 13, 2021, at 5:54 PM, Linus Torvalds wrote:
> On Fri, Aug 13, 2021 at 2:49 PM Andy Lutomirski <luto@...nel.org> wrote:
> >
> > I’ll bite. How about we attack this in the opposite direction: remove the deny write mechanism entirely.
>
> I think that would be ok, except I can see somebody relying on it.
>
> It's broken, it's stupid, but we've done that ETXTBUSY for a _loong_ time.
Someone off-list just pointed something out to me, and I think we should push harder to remove ETXTBSY. Specifically, we've all been focused on open() failing with ETXTBSY, and it's easy to make fun of anyone opening a running program for write when they should be unlinking and replacing it.
Alas, Linux's implementation of deny_write_access() is correct^Wabsurd, and deny_write_access() *also* returns ETXTBSY if the file is open for write. So, in a multithreaded program, one thread does:
    fd = open("some exefile", O_RDWR | O_CREAT | O_CLOEXEC);
    write(fd, some stuff);
    <--- problem is here
    close(fd);
    execve("some exefile");
Another thread does:
    fork();
    execve("something else");
In between fork and execve, the child holds another reference to the open file description, so i_writecount is still held after the first thread's close(), and the first thread's execve() fails with ETXTBSY. Whoops. See, for example:
https://github.com/golang/go/issues/22315
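For anyone who wants to poke at this without racing two threads, here is a rough, deterministic sketch of the same window. It is only an illustration under assumptions: the helper name exec_errno_while_fd_held() and the file contents are made up, a forked child that just sits in pause() stands in for the other thread's "between fork and execve" state, and the execve() runs in a second child so the caller survives either outcome. On a kernel that enforces deny_write_access() I would expect it to report ETXTBSY; a kernel without the check should report 0.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper, not kernel or libc API.
 * Returns the errno reported by execve(path), 0 if the exec succeeded,
 * or -1 if the setup itself failed.
 */
int exec_errno_while_fd_held(const char *path)
{
    static const char script[] = "#!/bin/sh\nexit 0\n";
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC | O_CLOEXEC, 0755);
    if (fd < 0)
        return -1;
    if (write(fd, script, sizeof(script) - 1) < 0) {
        close(fd);
        return -1;
    }

    /* Stand-in for the other thread's fork(): the child inherits fd and
     * keeps it open, as it would until its own execve(). */
    pid_t holder = fork();
    if (holder < 0) {
        close(fd);
        return -1;
    }
    if (holder == 0) {
        for (;;)
            pause();
    }

    close(fd); /* the writer is done, but the holder still has its copy */

    int result = -1;
    int pfd[2];
    if (pipe2(pfd, O_CLOEXEC) == 0) {
        pid_t runner = fork();
        if (runner == 0) {
            char *const argv[] = { (char *)path, NULL };
            char *const envp[] = { NULL };
            execve(path, argv, envp);
            int err = errno; /* only reached if execve() failed */
            if (write(pfd[1], &err, sizeof(err)) < 0) {
                /* nothing more we can do from here */
            }
            _exit(127);
        }
        close(pfd[1]);
        if (runner > 0) {
            int err = 0;
            ssize_t n = read(pfd[0], &err, sizeof(err));
            if (n == (ssize_t)sizeof(err))
                result = err; /* execve() failed with this errno */
            else if (n == 0)
                result = 0;   /* pipe closed by a successful exec */
            waitpid(runner, NULL, 0);
        }
        close(pfd[0]);
    }

    kill(holder, SIGKILL);
    waitpid(holder, NULL, 0);
    unlink(path);
    return result;
}
```

Calling exec_errno_while_fd_held("/tmp/etxtbsy_demo") from a small main() and printing strerror() of the result is enough to see it. The path needs to be on a filesystem that allows exec, otherwise you get EACCES instead.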
I propose we get rid of deny_write_access() completely to solve this.
Getting rid of i_writecount itself seems a bit harder, since a handful of filesystems use it for clever reasons.
(OFD locks seem like they might have the same problem. Maybe we should have a clone() flag to unshare the file table and close close-on-exec things?)
>
> But you are right that we have removed parts of it over time (no more
> MAP_DENYWRITE, no more uselib()) so that what we have today is a
> fairly weak form of what we used to do.
>
> And nobody really complained when we weakened it, so maybe removing it
> entirely might be acceptable.
>
> Linus
>