Message-ID: <7ea44458b90b4d41a08ba9012818d273@AcuMS.aculab.com>
Date: Thu, 22 Nov 2018 17:36:37 +0000
From: David Laight <David.Laight@...LAB.COM>
To: 'Denys Vlasenko' <dvlasenk@...hat.com>,
Jens Axboe <axboe@...nel.dk>, "Ingo Molnar" <mingo@...nel.org>
CC: Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
"H. Peter Anvin" <hpa@...or.com>,
"the arch/x86 maintainers" <x86@...nel.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Andy Lutomirski <luto@...nel.org>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Brian Gerst <brgerst@...il.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"pabeni@...hat.com" <pabeni@...hat.com>
Subject: RE: [PATCH] x86: only use ERMS for user copies for larger sizes
From: Denys Vlasenko
> Sent: 21 November 2018 13:44
...
> I also tested this while working for string ops code in musl.
>
> I think at least 128 bytes would be the minimum where "REP insn"
> are more efficient. In my testing, it's more like 256 bytes...
What happens for misaligned copies?
I had a feeling that the ERMS 'rep movsb' code used some kind
of barrel shifter in that case.
The other problem with the ERMS copy is that it gets used
for copy_to/from_io() - and the 'rep movsb' on uncached
locations has to do byte copies.
Byte reads on PCIe are really horrid.
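
To illustrate the cost, here is a rough userspace sketch (not kernel code) of
the alternative: byte reads only until the source is 8-byte aligned, then one
aligned 64-bit read per 8 bytes. The name copy_from_io_sketch() and the
returned read count are made up for illustration; in the kernel this job
belongs to memcpy_fromio()/readq(), and the source would be a real __iomem
mapping rather than ordinary memory.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper, not a kernel API.
 * Copies len bytes from src (standing in for an uncached mapping) to dst,
 * using aligned 64-bit reads for the bulk of the transfer.
 * Returns the number of read accesses issued to src.
 */
static size_t copy_from_io_sketch(void *dst, const volatile void *src,
                                  size_t len)
{
    unsigned char *d = dst;
    const volatile unsigned char *s = src;
    size_t reads = 0;

    /* Head: byte reads until the source address is 8-byte aligned. */
    while (len && ((uintptr_t)s & 7)) {
        *d++ = *s++;
        len--;
        reads++;
    }

    /* Body: one aligned 64-bit read per 8 bytes of source. */
    while (len >= 8) {
        uint64_t v = *(const volatile uint64_t *)s;

        memcpy(d, &v, sizeof(v));   /* dst may still be misaligned */
        d += 8;
        s += 8;
        len -= 8;
        reads++;
    }

    /* Tail: remaining bytes, one read each. */
    while (len) {
        *d++ = *s++;
        len--;
        reads++;
    }

    return reads;
}
```

For a 61-byte copy starting 3 bytes past an 8-byte boundary this issues
12 reads (5 byte reads plus 7 word reads), where a pure byte copy would
issue 61.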
David
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)