[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAPcyv4gO0NWRLs0_A7L+2pX+v0ohoYRSgvtw_VXmBesgOnQyUw@mail.gmail.com>
Date: Mon, 10 Apr 2017 14:35:36 -0700
From: Dan Williams <dan.j.williams@...el.com>
To: "Kani, Toshimitsu" <toshi.kani@....com>
Cc: "linux-nvdimm@...ts.01.org" <linux-nvdimm@...ts.01.org>,
Jan Kara <jack@...e.cz>,
Matthew Wilcox <mawilcox@...rosoft.com>,
"x86@...nel.org" <x86@...nel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"stable@...r.kernel.org" <stable@...r.kernel.org>,
Christoph Hellwig <hch@....de>, Jeff Moyer <jmoyer@...hat.com>,
Ingo Molnar <mingo@...hat.com>,
Al Viro <viro@...iv.linux.org.uk>,
"H. Peter Anvin" <hpa@...or.com>,
Thomas Gleixner <tglx@...utronix.de>,
Ross Zwisler <ross.zwisler@...ux.intel.com>
Subject: Re: [PATCH v2] x86, pmem: fix broken __copy_user_nocache cache-bypass assumptions
On Mon, Apr 10, 2017 at 2:28 PM, Kani, Toshimitsu <toshi.kani@....com> wrote:
>> > Thanks for the update. I think the alignment check should be based on
>> > the following note in copy_user_nocache.
>> >
>> > * Note: Cached memory copy is used when destination or size is not
>> > * naturally aligned. That is:
>> > * - Require 8-byte alignment when size is 8 bytes or larger.
>> > * - Require 4-byte alignment when size is 4 bytes.
>> >
>> > So, I think the code may be something like this. I also made the following
>> changes:
>>
>> Thanks!
>>
>> > - Mask with 7, not 8.
>>
>> Yes, good catch.
>>
>> > - ALIGN with cacheline size, instead of 8.
>> > - Add (bytes > flushed) test since calculation with unsigned long still results
>> in a negative
>> > value (as a positive value).
>> >
>> > if (bytes < 8) {
>> > if ((dest & 3) || (bytes != 4))
>> > arch_wb_cache_pmem(addr, 1);
>> > } else {
>> > if (dest & 7) {
>> > dest = ALIGN(dest, boot_cpu_data.x86_clflush_size);
>>
>> Why align the destination to the next cacheline? As far as I can see
>> the ALIGN_DESTINATION macro in arch/x86/include/asm/asm.h only aligns
>> to the next 8-byte boundary.
>
> The clflush here flushes for the cacheline size. So, we do not need to flush
> the same cacheline again when the unaligned tail is in the same line.
Ok, makes sense. Last question, can't we reduce the check to be:
if ((bytes > flushed) && ((bytes - flushed) & 3))
...since if 'bytes' was 4-byte aligned we would have performed
non-temporal stores.
Can I add your Signed-off-by: on v3?
Powered by blists - more mailing lists