linux-kernel - Re: [PATCH v2 0/2] Replace and improve "mcsafe" with copy

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CALCETrVAsppM5kRz0HicAQ8o_x06=7Nd0q64sEre3MEShWPaLw@mail.gmail.com>
Date:   Mon, 4 May 2020 13:26:05 -0700
From:   Andy Lutomirski <luto@...nel.org>
To:     "Luck, Tony" <tony.luck@...el.com>
Cc:     Linus Torvalds <torvalds@...ux-foundation.org>,
        "Williams, Dan J" <dan.j.williams@...el.com>,
        Andy Lutomirski <luto@...nel.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Borislav Petkov <bp@...en8.de>,
        stable <stable@...r.kernel.org>,
        "the arch/x86 maintainers" <x86@...nel.org>,
        "H. Peter Anvin" <hpa@...or.com>,
        Paul Mackerras <paulus@...ba.org>,
        Benjamin Herrenschmidt <benh@...nel.crashing.org>,
        "Tsaur, Erwin" <erwin.tsaur@...el.com>,
        Michael Ellerman <mpe@...erman.id.au>,
        Arnaldo Carvalho de Melo <acme@...nel.org>,
        linux-nvdimm <linux-nvdimm@...ts.01.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v2 0/2] Replace and improve "mcsafe" with copy_safe()

On Mon, May 4, 2020 at 1:05 PM Luck, Tony <tony.luck@...el.com> wrote:
>
> > When a copy function hits a bad page and the page is not yet known to
> > be bad, what does it do?  (I.e. the page was believed to be fine but
> > the copy function gets #MC.)  Does it unmap it right away?  What does
> > it return?
>
> I suspect that we will only ever find a handful of situations where the
> kernel can recover from memory that has gone bad that are worth fixing
> (got to be some code path that touches a meaningful fraction of memory,
> otherwise we get code complexity without any meaningful payoff).
>
> I don't think we'd want different actions for the cases of "we just found out
> now that this page is bad" and "we got a notification an hour ago that this
> page had gone bad". Currently we treat those the same for application
> errors ... SIGBUS either way[1].

Oh, I agree that the end result should be the same.  I'm thinking more
about the mechanism and the internal API.  As a somewhat silly example
of why there's a difference, the first time we try to read from bad
memory, we can expect #MC (I assume, on a sensibly functioning
platform).  But, once we get the #MC, I imagine that the #MC handler
will want to unmap the page to prevent a storm of additional #MC
events on the same page -- given the awful x86 #MC design, too many
all at once is fatal.  So the next time we copy_mc_to_user() or
whatever from the memory, we'll get #PF instead.  Or maybe that #MC
will defer the unmap?

So the point of my questions is that the overall design should be at
least somewhat settled before anyone tries to review just the copy
functions.