Message-ID: <9fc41af45fcb40e3ae607eb4f52d7ef9@AcuMS.aculab.com>
Date: Tue, 8 Feb 2022 11:45:38 +0000
From: David Laight <David.Laight@...LAB.COM>
To: 'Hugh Dickins' <hughd@...gle.com>, Borislav Petkov <bp@...e.de>
CC: Peter Zijlstra <peterz@...radead.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"x86@...nel.org" <x86@...nel.org>
Subject: RE: x86: should clear_user() have alternatives?

From: Hugh Dickins
> Sent: 08 February 2022 05:46
>
> I've noticed that clear_user() is slower than it need be:
>
> dd if=/dev/zero of=/dev/null bs=1M count=1M
> 1099511627776 bytes (1.1 TB) copied, 45.9641 s, 23.9 GB/s
> whereas with the hacked patch below
> 1099511627776 bytes (1.1 TB) copied, 33.4 s, 32.9 GB/s
>
> That was on some Intel machine: IIRC an AMD went faster.
>
> It's because clear_user() lacks alternatives, and uses a
> nowadays suboptimal implementation; whereas clear_page()
> and copy_user() do support alternatives.
> ...
> +SYM_FUNC_START(__clear_user)
> +	ASM_STAC
> +	movl %esi,%ecx
> +	xorq %rax,%rax
> +1:	rep stosb
> +2:	movl %ecx,%eax
> +	ASM_CLAC
> +	ret

You only want to even consider that version for long copies
(and possibly only for aligned ones).

The existing code (I've not quoted) does look sub-optimal though.
It should be easy to obtain a write every clock.
But I suspect the loop is too long.
The code gcc generates might even be better!

Note that for copies longer than 8 bytes 'odd' lengths can
be handled by a single misaligned write to the end of the buffer.
No need for a byte copy loop.

I've not experimented with misaligned writes - they might take two clocks.
So it might be worth aligning them - but they may not happen
often enough for it to be an overall gain.
Misaligned reads usually don't make any difference.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
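
For illustration only, the "single misaligned write to the end of the
buffer" idea from the message can be sketched in plain user-space C.
This is not the kernel's __clear_user() and does no fault handling;
the names clear_buf() and store_zero64() are hypothetical, and the
sketch assumes a target such as x86-64 where the compiler turns the
fixed-size memcpy() into one 8-byte store.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: write 8 zero bytes at p, which may be misaligned.
 * memcpy() with a constant size avoids undefined behaviour from a
 * misaligned pointer cast; compilers typically emit a single store. */
static inline void store_zero64(unsigned char *p)
{
	const uint64_t zero = 0;

	memcpy(p, &zero, sizeof(zero));
}

/* Hypothetical user-space model: zero len bytes starting at dst. */
static void clear_buf(void *dst, size_t len)
{
	unsigned char *p = dst;
	size_t i;

	if (len < 8) {
		/* Too short for an 8-byte store; fall back to bytes. */
		for (i = 0; i < len; i++)
			p[i] = 0;
		return;
	}

	/* Whole 8-byte words from the start of the buffer. */
	for (i = 0; i + 8 <= len; i += 8)
		store_zero64(p + i);

	/* 'Odd' tail of 1..7 bytes: one store covering the last 8 bytes.
	 * It overlaps bytes that are already zero, so no byte loop is needed. */
	if (i < len)
		store_zero64(p + len - 8);
}
```

The final store rewrites up to 7 bytes that the loop already cleared,
which is harmless for zeroing. Whether that store costs an extra clock
when misaligned is exactly the open question raised above; the sketch
does not measure it.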