[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAPcyv4jQ3s_ZVRvw6jAmm3vcebc-Ucf7FHYP3_nTybwdfQeG8Q@mail.gmail.com>
Date: Sun, 19 Apr 2020 22:08:23 -0700
From: Dan Williams <dan.j.williams@...el.com>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Andy Lutomirski <luto@...capital.net>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, X86 ML <x86@...nel.org>,
stable <stable@...r.kernel.org>, Borislav Petkov <bp@...en8.de>,
"H. Peter Anvin" <hpa@...or.com>,
Peter Zijlstra <peterz@...radead.org>,
Tony Luck <tony.luck@...el.com>,
Erwin Tsaur <erwin.tsaur@...el.com>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
linux-nvdimm <linux-nvdimm@...ts.01.org>
Subject: Re: [PATCH] x86/memcpy: Introduce memcpy_mcsafe_fast
On Sat, Apr 18, 2020 at 1:52 PM Linus Torvalds
<torvalds@...ux-foundation.org> wrote:
>
> On Sat, Apr 18, 2020 at 1:30 PM Andy Lutomirski <luto@...capital.net> wrote:
> >
> > Maybe I’m missing something obvious, but what’s the alternative? The _mcsafe variants don’t just avoid the REP mess — they also tell the kernel that this particular access is recoverable via extable.
>
> .. which they could easily do exactly the same way the user space
> accessors do, just with a much simplified model that doesn't even care
> about multiple sizes, since unaligned accesses weren't valid anyway.
>
> The thing is, all of the MCS code has been nasty. There's no reason
> for it what-so-ever that I can tell. The hardware has been so
> incredibly broken that it's basically unusable, and most of the
> software around it seems to have been about testing.
>
> So I absolutely abhor that thing. Everything about that code has
> screamed "yeah, we completely mis-designed the hardware, we're pushing
> the problems into software, and nobody even uses it or can test it so
> there's like 5 people who care".
>
> And I'm pushing back on it, because I think that the least the code
> can do is to at least be simple.
>
> For example, none of those optimizations should exist. That function
> shouldn't have been inline to begin with. And if it really really
> matters from a performance angle that it was inline (which I doubt),
> it shouldn't have looked like a memory copy, it should have looked
> like "get_user()" (except without all the complications of actually
> having to test addresses or worry about different sizes).
>
>
> And it almost certainly shouldn't have been done in low-level asm
> either. It could have been a single "read aligned word" interface
> using an inline asm, and then everything else could have been done as
> C code around it.
Do we have examples of doing exception handling from C? I thought all
the exception handling copy routines were assembly routines?
>
> But no. The software side is almost as messy as the hardware side is.
> I hate it. And since nobody sane can test it, and the broken hardware
> is _so_ broken than nobody should ever use it, I have continually
> pushed back against this kind of ugly nasty special code.
>
> We know the writes can't fault, since they are buffered. So they
> aren't special at all.
The writes can mmu-fault now that memcpy_mcsafe() is also used by
_copy_to_iter_mcsafe(). This allows a clean bypass of the block layer
in fs/dax.c in addition to the pmem driver access of poisoned memory.
Now that the fallback is a sane rep; movs; it can be considered for
plain copy_to_iter() for other user copies so you get exception
handling on kernel access of poison outside of persistent memory. To
Andy's point I think a recoverable copy (for exceptions or faults) is
generally useful.
> We know the acceptable reads for the broken hardware basically boil
> down to a single simple word-size aligned read, so you need _one_
> special inline asm for that. The rest of the cases can be handled by
> masking and shifting if you really really need to - and done better
> that way than with byte accesses anyway.
>
> Then you have _one_ C file that implements everything using that
> single operation (ok, if people absolutely want to do sizes, I guess
> they can if they can just hide it in that one file), and you have one
> header file that exposes the interfaces to it, and you're done.
>
> And you strive hard as hell to not impact anything else, because you
> know that the hardware is unacceptable until all those special rules
> go away. Which they apparently finally have.
I understand the gripes about the mcsafe_slow() implementation, but
how do I implement mcsafe_fast() any better than how it is currently
organized given that, setting aside machine check handling,
memcpy_mcsafe() is the core of a copy_to_iter*() front-end that can
mmu-fault on either source or destination access?
Powered by blists - more mailing lists