Message-ID: <20180910131800.GA41487@gmail.com>
Date:   Mon, 10 Sep 2018 15:18:00 +0200
From:   Ingo Molnar <mingo@...nel.org>
To:     Mikulas Patocka <mpatocka@...hat.com>
Cc:     Mike Snitzer <snitzer@...hat.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Dan Williams <dan.j.williams@...el.com>,
        device-mapper development <dm-devel@...hat.com>,
        X86 ML <x86@...nel.org>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3 RESEND] x86: optimize memcpy_flushcache


* Mikulas Patocka <mpatocka@...hat.com> wrote:

> Here I resend it:
> 
> 
> From: Mikulas Patocka <mpatocka@...hat.com>
> Subject: [PATCH] x86: optimize memcpy_flushcache
> 
> I use memcpy_flushcache in my persistent memory driver for metadata
> updates. There are many 8-byte and 16-byte updates, and it turns out that
> the overhead of memcpy_flushcache causes a 2% performance degradation
> compared to the "movnti" instruction explicitly coded using inline assembler.
> 
> The tests were done on a Skylake processor with persistent memory emulated
> using the "memmap" kernel parameter. dd was used to copy data to the
> dm-writecache target.
> 
> This patch recognizes memcpy_flushcache calls with constant short length
> and turns them into inline assembler - so that I don't have to use inline
> assembler in the driver.
> 
> Signed-off-by: Mikulas Patocka <mpatocka@...hat.com>
> 
> ---
>  arch/x86/include/asm/string_64.h |   20 +++++++++++++++++++-
>  arch/x86/lib/usercopy_64.c       |    4 ++--
>  2 files changed, 21 insertions(+), 3 deletions(-)
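
For readers without the diff at hand, the idea described above (recognize a
compile-time-constant short length and emit non-temporal stores inline,
otherwise fall back to the out-of-line routine) can be sketched roughly as
below. This is a user-space illustration, not the patch itself: the names
flushcache_sketch, flushcache_slow and store_nt64 are made up for this
example, only the 8- and 16-byte cases are shown, and the fallback is a plain
memcpy standing in for the generic path.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical out-of-line fallback; a plain memcpy stands in for the
 * generic routine here (assumption, for illustration only). */
static void flushcache_slow(void *dst, const void *src, size_t n)
{
	memcpy(dst, src, n);
}

/* Store one 64-bit value with a non-temporal hint (movnti) on x86-64.
 * On other targets this degrades to an ordinary store. */
static inline void store_nt64(void *dst, uint64_t v)
{
#if defined(__x86_64__) && defined(__GNUC__)
	__asm__ volatile("movnti %1, %0"
			 : "=m"(*(uint64_t *)dst)
			 : "r"(v)
			 : "memory");
#else
	memcpy(dst, &v, sizeof(v));
#endif
}

/* If the length is known at compile time and is 8 or 16, emit the
 * stores inline; everything else goes to the out-of-line fallback. */
static inline __attribute__((always_inline))
void flushcache_sketch(void *dst, const void *src, size_t n)
{
	if (__builtin_constant_p(n)) {
		uint64_t a, b;

		switch (n) {
		case 8:
			memcpy(&a, src, 8);
			store_nt64(dst, a);
			return;
		case 16:
			memcpy(&a, src, 8);
			memcpy(&b, (const char *)src + 8, 8);
			store_nt64(dst, a);
			store_nt64((char *)dst + 8, b);
			return;
		}
	}
	flushcache_slow(dst, src, n);
}
```

A call such as flushcache_sketch(dst, src, 8) with a literal length takes
the inline path when the compiler optimizes; without optimization
__builtin_constant_p() may evaluate to 0 and the call simply goes through the
fallback, so the result is the same either way. Non-temporal stores are
weakly ordered, so a caller that depends on ordering still needs a store
fence afterwards; the sketch leaves that out.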

Applied to tip:x86/asm, thanks!

I'll push it out later today after some testing.

Thanks,

	Ingo
