Message-ID: <f4d24e8024e84ec5a20ab17b6c2d7f60@AcuMS.aculab.com>
Date: Thu, 23 Mar 2023 22:16:12 +0000
From: David Laight <David.Laight@...LAB.COM>
To: 'Mark Rutland' <mark.rutland@....com>,
Catalin Marinas <catalin.marinas@....com>
CC: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"agordeev@...ux.ibm.com" <agordeev@...ux.ibm.com>,
"aou@...s.berkeley.edu" <aou@...s.berkeley.edu>,
"bp@...en8.de" <bp@...en8.de>,
"dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>,
"davem@...emloft.net" <davem@...emloft.net>,
"gor@...ux.ibm.com" <gor@...ux.ibm.com>,
"hca@...ux.ibm.com" <hca@...ux.ibm.com>,
"linux-arch@...r.kernel.org" <linux-arch@...r.kernel.org>,
"linux@...linux.org.uk" <linux@...linux.org.uk>,
"mingo@...hat.com" <mingo@...hat.com>,
"palmer@...belt.com" <palmer@...belt.com>,
"paul.walmsley@...ive.com" <paul.walmsley@...ive.com>,
"robin.murphy@....com" <robin.murphy@....com>,
"tglx@...utronix.de" <tglx@...utronix.de>,
"torvalds@...ux-foundation.org" <torvalds@...ux-foundation.org>,
"viro@...iv.linux.org.uk" <viro@...iv.linux.org.uk>,
"will@...nel.org" <will@...nel.org>
Subject: RE: [PATCH v2 1/4] lib: test copy_{to,from}_user()
From: Mark Rutland
> Sent: 22 March 2023 14:05
....
> > IIUC, in such tests you only vary the destination offset. Our copy
> > routines in general try to align the source and leave the destination
> > unaligned for performance. It would be interesting to add some variation
> > on the source offset as well to spot potential issues with that part of
> > the memcpy routines.
>
> I have that on my TODO list; I had intended to drop that into the
> usercopy_params. The only problem is that the cross product of size,
> src_offset, and dst_offset gets quite large.
I thought that it was better to align the writes and do misaligned reads.
Although maybe copy_to/from_user() would be best aligning the user address
(to avoid page faults part way through a misaligned access).
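As a rough illustration of "align the writes, do misaligned reads", here is a minimal userspace sketch. It is not the kernel's copy_to/from_user() and not any architecture's memcpy; copy_align_dst() and sweep_offsets() are hypothetical names made up for this example, and the offset and length ranges are arbitrary. It also shows the size/src_offset/dst_offset cross product mentioned above in its smallest form.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy 'len' bytes so that all 8-byte stores are aligned. */
static void copy_align_dst(unsigned char *dst, const unsigned char *src, size_t len)
{
    /* Head: byte copies until the destination is 8-byte aligned. */
    while (len && ((uintptr_t)dst % sizeof(uint64_t)) != 0) {
        *dst++ = *src++;
        len--;
    }
    assert(len == 0 || ((uintptr_t)dst % sizeof(uint64_t)) == 0);

    /* Body: aligned 8-byte stores fed by possibly misaligned loads. */
    while (len >= sizeof(uint64_t)) {
        uint64_t w;

        memcpy(&w, src, sizeof(w));   /* portable way to do an unaligned load */
        memcpy(dst, &w, sizeof(w));   /* dst is 8-byte aligned at this point */
        dst += sizeof(w);
        src += sizeof(w);
        len -= sizeof(w);
    }

    /* Tail: remaining 0..7 bytes. */
    while (len--)
        *dst++ = *src++;
}

/* Arbitrary ranges chosen for the example. */
enum { MAX_OFF = 8, MAX_LEN = 64, BUF_LEN = MAX_OFF + MAX_LEN + 8 };

/* Sweep src offset x dst offset x length; return the number of mismatches. */
static int sweep_offsets(void)
{
    static unsigned char src[BUF_LEN], dst[BUF_LEN], ref[BUF_LEN];
    int failures = 0;
    size_t i, s_off, d_off, len;

    for (i = 0; i < BUF_LEN; i++)
        src[i] = (unsigned char)(i * 7 + 1);

    for (s_off = 0; s_off < MAX_OFF; s_off++) {
        for (d_off = 0; d_off < MAX_OFF; d_off++) {
            for (len = 0; len <= MAX_LEN; len++) {
                /* Poison both buffers so stray writes are detected. */
                memset(dst, 0xAA, BUF_LEN);
                memset(ref, 0xAA, BUF_LEN);
                copy_align_dst(dst + d_off, src + s_off, len);
                memcpy(ref + d_off, src + s_off, len);
                if (memcmp(dst, ref, BUF_LEN) != 0)
                    failures++;
            }
        }
    }
    return failures;
}
```

Even with offsets limited to 0..7 and lengths to 0..64 this is 8 * 8 * 65 = 4160 cases, which gives a feel for how quickly the cross product grows. Comparing the whole buffer against a memcpy() reference catches writes outside the requested range as well as wrong data.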
OTOH, on x86, is it even worth bothering at all?
I have measured a performance drop for misaligned reads, but it
was less than 1 clock per cache line in a test that was doing
2 misaligned reads in at least some of the clock cycles.
I think the memory read path can do two AVX reads each clock.
So doing two misaligned 64-bit reads isn't stressing it.
David
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)