Message-ID: <bfa4fd38-0874-63b3-991a-1102af9f47a6@huawei.com>
Date: Tue, 13 Apr 2021 20:54:55 +0800
From: Kemeng Shi <shikemeng@...wei.com>
To: Borislav Petkov <bp@...en8.de>
CC: <tglx@...utronix.de>, <mingo@...hat.com>, <x86@...nel.org>,
<hpa@...or.com>, <linux-kernel@...r.kernel.org>,
<linux-nvdimm@...ts.01.org>
Subject: Re:Re: [PATCH] x86: Accelerate copy_page with non-temporal in X86
on 2021/4/13 19:01, Borislav Petkov wrote:
> + linux-nvdimm
>
> Original mail at https://lkml.kernel.org/r/3f28adee-8214-fa8e-b368-eaf8b193469e@huawei.com
>
> On Tue, Apr 13, 2021 at 02:25:58PM +0800, Kemeng Shi wrote:
>> I'm using AEP with the dax_kmem driver, and AEP is exported as a NUMA node in
>
> What is AEP?
>
AEP is a type of persistent memory produced by Intel. It's slower than
normal memory but is persistent.
>> my system. I move cold pages from the DRAM node to the AEP node with the
>> move_pages system call. With the old "rep movsq", it costs 2030ms to move
>> 1 GB of pages. With "movnti", it only costs about 890ms to move 1 GB of pages.
>
> So there's __copy_user_nocache() which does NT stores.
>
>> - ALTERNATIVE "jmp copy_page_regs", "", X86_FEATURE_REP_GOOD
>> + ALTERNATIVE_2 "jmp copy_page_regs", "", X86_FEATURE_REP_GOOD, \
>> + "jmp copy_page_nt", X86_FEATURE_XMM2
>
> This makes every machine which has sse2 do NT stores now. Which means
> *every* machine practically.
>
Yes. NT stores should be better for copy_page, especially when copying a lot
of pages, since only part of each copied page will be accessed soon afterwards.
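To illustrate the idea, here is a minimal user-space sketch of a page copy
using non-temporal stores. It is not the code from the patch, which is written
in assembly with movnti. The names copy_page_nt_sketch and SKETCH_PAGE_SIZE
are hypothetical, a 4 KiB page and 8-byte-aligned buffers are assumed, and the
sketch falls back to memcpy() where SSE2 on x86-64 is not available.

```c
#include <stddef.h>
#include <string.h>

#if defined(__x86_64__) && defined(__SSE2__)
#include <emmintrin.h>          /* _mm_stream_si64, _mm_sfence */
#define SKETCH_HAVE_NT 1
#endif

#define SKETCH_PAGE_SIZE 4096   /* assumed page size for this sketch */

/*
 * Copy one page from src to dst. Both pointers must be 8-byte aligned.
 * On x86-64 with SSE2, each 64-bit store is issued as movnti, which
 * bypasses the cache hierarchy instead of filling it with the
 * destination page.
 */
static void copy_page_nt_sketch(void *dst, const void *src)
{
#ifdef SKETCH_HAVE_NT
    long long *d = dst;
    const long long *s = src;
    size_t i;

    for (i = 0; i < SKETCH_PAGE_SIZE / sizeof(long long); i++)
        _mm_stream_si64(&d[i], s[i]);

    /* NT stores are weakly ordered; fence before others read dst. */
    _mm_sfence();
#else
    /* Portable fallback: ordinary cached copy. */
    memcpy(dst, src, SKETCH_PAGE_SIZE);
#endif
}
```

The trailing sfence matters: without it, a later reader of the destination
page is not guaranteed to see the non-temporal stores in order.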
> The folks on linux-nvdimm@ should be able to give you a better idea what
> to do.
>
> HTH.
>
Thanks for the response and help.