lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87eeay8pqx.fsf@disp2133>
Date:   Thu, 12 Aug 2021 13:47:02 -0500
From:   ebiederm@...ssion.com (Eric W. Biederman)
To:     Linus Torvalds <torvalds@...ux-foundation.org>
Cc:     Andy Lutomirski <luto@...nel.org>,
        David Hildenbrand <david@...hat.com>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        "H. Peter Anvin" <hpa@...or.com>,
        Al Viro <viro@...iv.linux.org.uk>,
        Alexey Dobriyan <adobriyan@...il.com>,
        Steven Rostedt <rostedt@...dmis.org>,
        "Peter Zijlstra \(Intel\)" <peterz@...radead.org>,
        Arnaldo Carvalho de Melo <acme@...nel.org>,
        Mark Rutland <mark.rutland@....com>,
        Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
        Jiri Olsa <jolsa@...hat.com>,
        Namhyung Kim <namhyung@...nel.org>,
        Petr Mladek <pmladek@...e.com>,
        Sergey Senozhatsky <sergey.senozhatsky@...il.com>,
        Andy Shevchenko <andriy.shevchenko@...ux.intel.com>,
        Rasmus Villemoes <linux@...musvillemoes.dk>,
        Kees Cook <keescook@...omium.org>,
        Greg Ungerer <gerg@...ux-m68k.org>,
        Geert Uytterhoeven <geert@...ux-m68k.org>,
        Mike Rapoport <rppt@...nel.org>,
        Vlastimil Babka <vbabka@...e.cz>,
        Vincenzo Frascino <vincenzo.frascino@....com>,
        Chinwen Chang <chinwen.chang@...iatek.com>,
        Michel Lespinasse <walken@...gle.com>,
        Catalin Marinas <catalin.marinas@....com>,
        "Matthew Wilcox \(Oracle\)" <willy@...radead.org>,
        Huang Ying <ying.huang@...el.com>,
        Jann Horn <jannh@...gle.com>, Feng Tang <feng.tang@...el.com>,
        Kevin Brodsky <Kevin.Brodsky@....com>,
        Michael Ellerman <mpe@...erman.id.au>,
        Shawn Anastasio <shawn@...stas.io>,
        Steven Price <steven.price@....com>,
        Nicholas Piggin <npiggin@...il.com>,
        Christian Brauner <christian.brauner@...ntu.com>,
        Jens Axboe <axboe@...nel.dk>,
        Gabriel Krisman Bertazi <krisman@...labora.com>,
        Peter Xu <peterx@...hat.com>,
        Suren Baghdasaryan <surenb@...gle.com>,
        Shakeel Butt <shakeelb@...gle.com>,
        Marco Elver <elver@...gle.com>,
        Daniel Jordan <daniel.m.jordan@...cle.com>,
        Nicolas Viennot <Nicolas.Viennot@...sigma.com>,
        Thomas Cedeno <thomascedeno@...gle.com>,
        Collin Fijalkovich <cfijalkovich@...gle.com>,
        Michal Hocko <mhocko@...e.com>,
        Miklos Szeredi <miklos@...redi.hu>,
        Chengguang Xu <cgxu519@...ernel.net>,
        Christian König 
        <ckoenig.leichtzumerken@...il.com>, linux-unionfs@...r.kernel.org,
        Linux API <linux-api@...r.kernel.org>,
        "the arch\/x86 maintainers" <x86@...nel.org>,
        linux-fsdevel <linux-fsdevel@...r.kernel.org>,
        Linux-MM <linux-mm@...ck.org>
Subject: Re: [PATCH v1 0/7] Remove in-tree usage of MAP_DENYWRITE

Linus Torvalds <torvalds@...ux-foundation.org> writes:

> On Thu, Aug 12, 2021 at 7:48 AM Eric W. Biederman <ebiederm@...ssion.com> wrote:
>>
>> Given that MAP_PRIVATE for shared libraries is our strategy for handling
>> writes to shared libraries perhaps we just need to use MAP_POPULATE or a
>> new related flag (perhaps MAP_PRIVATE_NOW)
>
> No. That would be horrible for the usual bloated GUI libraries. It
> might help some (dynamic page faults are not cheap either), but it
> would hurt a lot.

I wasn't aiming so much at the MAP_POPULATE part but something that
would trigger cow from writes to the file.  I see code that is close but
I don't see any code in the kernel that would implement that currently.

Upon reflection I think it will always be difficult to trigger cow from
the file write side of the kernel as code that would cow the page in
the page cache would cause problems with writable mmaps.

> This is definitely a "if you overwrite a system library while it's
> being used, you get to keep both pieces" situation.
>
> The kernel ETXTBUSY thing is purely a courtesy feature, and as people
> have noticed it only really works for the main executable because of
> various reasons. It's not something user space should even rely on,
> it's more of a "ok, you're doing something incredibly stupid, and
> we'll help you avoid shooting yourself in the foot when we notice".
>
> Any distro should make sure their upgrade tools don't just
> truncate/write to random libraries executables.

Yes.  I am trying to come up with advice on how userspace
implementations can implement their tools to use other mechanisms that
solve the overwriting shared libaries and executables problem that
are not broken by design.

For a little bit the way Florian Weirmer was talking and the fact that
uselib uses MAP_PRIVATE had me thinking that somehow MAP_PRIVATE could
be part of the solution.  I have now looked into the implementation of
MAP_PRIVATE and I since we don't perform the cow on filesystem writes
MAP_PRIVATE absolutely can not be part of the solution we recommend to
userspace.

So today the best advice I can give to userspace is to mark their
executables and shared libraries as read-only and immutable.  Otherwise
a change to the executable file can change what is mapped into memory.
MAP_PRIVATE does not help.

> And if they do, it's really not a kernel issue.

What is a kernel issue is giving people good advice on how to use kernel
features to solve real world problems.  I have seen the write to a
mapped exectuable/shared lib problem, and Florian has seen it.  So while
rare the problem is real and a pain to debug.

> This patch series basically takes this very historical error return,
> and simplifies and clarifies the implementation, and in the process
> might change some very subtle corner case (unmapping the original
> executable entirely?). I hope (and think) it wouldn't matter exactly
> because this is a "courtesy error" rather than anything that a sane
> setup would _depend_ on, but hey, insane setups clearly exist.

Oh yes.

I very much agree that the design of this patchset is perfectly fine.

I also see that MAP_DENYWRITE is unfortunately broken by design.  I
vaguely remember the discussion when MAP_DENYWRITE was made a noop
because of the denial-of-service aspect of MAP_DENYWRITE.

I very much agree that we should strongly encourage userspace not
to write to mmaped files.

As I am learning with my two year old, it helps to give a constructive
suggestion of alternative behavior instead of just saying no.
Florian reported that there remains a problem in userspace. So I am
coming up with a constructive suggestion.  My apologies for going off
into the weeds for a moment.

Eric

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ