Message-ID: <CAPcyv4idV+hp-W56gyQDN4p9SQsYz+xondgVJwQSYphUMxkYnw@mail.gmail.com>
Date: Tue, 1 May 2018 21:00:37 -0700
From: Dan Williams <dan.j.williams@...el.com>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: "linux-nvdimm@...ts.01.org" <linux-nvdimm@...ts.01.org>,
Tony Luck <tony.luck@...el.com>,
Peter Zijlstra <peterz@...radead.org>,
Borislav Petkov <bp@...en8.de>,
"the arch/x86 maintainers" <x86@...nel.org>,
Thomas Gleixner <tglx@...utronix.de>,
Andy Lutomirski <luto@...capital.net>,
Ingo Molnar <mingo@...hat.com>,
Al Viro <viro@...iv.linux.org.uk>,
Andrew Morton <akpm@...ux-foundation.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 0/6] use memcpy_mcsafe() for copy_to_iter()
On Tue, May 1, 2018 at 8:33 PM, Linus Torvalds
<torvalds@...ux-foundation.org> wrote:
> On Tue, May 1, 2018 at 8:22 PM Dan Williams <dan.j.williams@...el.com>
> wrote:
>
>> All that to say that having a typical RAM page covering poisoned pmem
>> would complicate the 'clear badblocks' implementation.
>
> Ugh, ok.
>
> I guess the good news is that your patches aren't so big, and don't really
> affect anything else.
>
> But can we at least take this to be the impetus for just getting rid of
> that disgusting unrolled memcpy? About half of the lines in the patch set
> come from that thing.
>
> Is anybody seriously going to use pmem with some in-order chip that can't
> even get something as simple as a memory copy loop right? "git blame"
> fingers Tony Luck, I think he may have been influenced by the fumes from
> Itanium.
>
> I have some dim memory of "rep movs doesn't work well for pmem", but does
> it *seriously* need unrolling to cacheline boundaries? And if it does, who
> designed it, and why is anybody using it?
>
I think this is an FAQ from the original submission. In fact, some guy
named "Linus Torvalds" asked [1]:
---
> - why does this use the complex - and slower, on modern machines -
> unrolled manual memory copy, when you might as well just use a single
>
> rep ; movsb
>
> which not only makes it smaller, but makes the exception fixup trivial.
Because current generation cpus don't give a recoverable machine
check if we consume with a "rep ; movsb" :-(
When we have that we can pick the best copy function based
on the capabilities of the cpu we are running on.
---
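
To make the last point of that reply concrete, here is a minimal
userspace sketch of choosing a copy routine once, based on a CPU
capability. Everything named in it is hypothetical and for illustration
only: struct cpu_caps, the rep_movsb_mc_recoverable flag,
select_mcsafe_copy() and the two copy stand-ins are not kernel symbols,
and the "returns bytes left uncopied" convention is an assumption of the
sketch. Neither stand-in does any real machine check handling.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical capability record. The flag stands for "a machine check
 * consumed inside 'rep ; movsb' is recoverable on this CPU". It is not
 * a real kernel feature bit.
 */
struct cpu_caps {
	bool rep_movsb_mc_recoverable;
};

/* Assumed convention for this sketch: return the bytes NOT copied. */
typedef size_t (*copy_fn)(void *dst, const void *src, size_t len);

/*
 * Stand-in for the manually unrolled copy. A real version would cover
 * each load with an exception table entry; this one just copies bytes.
 */
static size_t copy_unrolled(void *dst, const void *src, size_t len)
{
	unsigned char *d = dst;
	const unsigned char *s = src;
	size_t i;

	for (i = 0; i < len; i++)
		d[i] = s[i];
	return 0;
}

/* Stand-in for a single "rep ; movsb" style copy. */
static size_t copy_rep_movsb(void *dst, const void *src, size_t len)
{
	memcpy(dst, src, len);
	return 0;
}

/* Pick the copy routine once, from the capabilities of the running CPU. */
static copy_fn select_mcsafe_copy(const struct cpu_caps *caps)
{
	assert(caps != NULL);
	return caps->rep_movsb_mc_recoverable ? copy_rep_movsb
					      : copy_unrolled;
}
```

With this shape the unrolled variant only has to stick around for CPUs
that lack the capability, and callers go through whatever pointer
select_mcsafe_copy() returned.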
[1]: https://lkml.org/lkml/2016/2/18/608