Message-ID: <alpine.LRH.2.02.1806211224470.19940@file01.intranet.prod.int.rdu2.redhat.com>
Date: Thu, 21 Jun 2018 21:19:27 -0400 (EDT)
From: Mikulas Patocka <mpatocka@...hat.com>
To: Ingo Molnar <mingo@...nel.org>
cc: Mike Snitzer <snitzer@...hat.com>,
Thomas Gleixner <tglx@...utronix.de>,
Dan Williams <dan.j.williams@...el.com>,
device-mapper development <dm-devel@...hat.com>,
X86 ML <x86@...nel.org>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 RESEND] x86: optimize memcpy_flushcache
On Thu, 21 Jun 2018, Ingo Molnar wrote:
>
> * Mike Snitzer <snitzer@...hat.com> wrote:
>
> > From: Mikulas Patocka <mpatocka@...hat.com>
> > Subject: [PATCH v2] x86: optimize memcpy_flushcache
> >
> > In the context of constant short length stores to persistent memory,
> > memcpy_flushcache suffers from a 2% performance degradation compared to
> > explicitly using the "movnti" instruction.
> >
> > Optimize 4, 8, and 16 byte memcpy_flushcache calls to explicitly use the
> > movnti instruction with inline assembler.
>
> Linus requested asm optimizations to include actual benchmarks, so it would be
> nice to describe how this was tested, on what hardware, and what the before/after
> numbers are.
>
> Thanks,
>
> Ingo
It was tested on a 4-core Skylake machine with persistent memory
emulated using the memmap kernel option. The dm-writecache target used the
emulated persistent memory as a cache and a SATA SSD as the backing device.
The patch results in 2% higher throughput when writing data using dd.
I don't have access to the machine anymore.
Mikulas
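
For readers without the patch at hand, here is a minimal userspace sketch of the idea in the quoted description: dispatch on a 4, 8, or 16 byte length and issue non-temporal stores (which compile to movnti) instead of going through a generic copy routine. This is not the kernel patch. The function name copy_nt_small is hypothetical, the compiler intrinsics stand in for the inline assembler the patch uses, the destination is assumed to be naturally aligned, and the trailing sfence is an addition for the userspace sketch.

```c
#include <emmintrin.h>  /* _mm_stream_si32, _mm_stream_si64 (SSE2); pulls in _mm_sfence */
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, NOT the kernel's memcpy_flushcache.
 * Assumes x86-64 and a destination aligned to the store size.
 */
static inline void copy_nt_small(void *dst, const void *src, size_t n)
{
    int d;
    long long q[2];

    switch (n) {
    case 4:
        memcpy(&d, src, 4);                       /* load without aliasing issues */
        _mm_stream_si32((int *)dst, d);           /* one 32-bit movnti */
        break;
    case 8:
        memcpy(&q[0], src, 8);
        _mm_stream_si64((long long *)dst, q[0]);  /* one 64-bit movnti */
        break;
    case 16:
        memcpy(q, src, 16);
        _mm_stream_si64((long long *)dst, q[0]);  /* two 64-bit movnti stores */
        _mm_stream_si64((long long *)dst + 1, q[1]);
        break;
    default:
        /* Fallback for other sizes: an ordinary cached copy, not flushed. */
        memcpy(dst, src, n);
        break;
    }

    /* Order the non-temporal stores before later stores (sketch-only addition). */
    _mm_sfence();
}
```

When the length is a compile-time constant, the compiler can fold the switch down to the one or two store instructions for that size, which is the case the patch description is targeting.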