Message-ID: <20190914092915.GA17409@avx2>
Date: Sat, 14 Sep 2019 12:29:15 +0300
From: Alexey Dobriyan <adobriyan@...il.com>
To: bp@...en8.de
Cc: linux-kernel@...r.kernel.org, mingo@...nel.org, x86@...nel.org,
	linux@...musvillemoes.dk, torvalds@...ux-foundation.org
Subject: Re: [RFC] Improve memset
> Instead of calling memset:
>
> ffffffff8100cd8d: e8 0e 15 7a 00 callq ffffffff817ae2a0 <__memset>
>
> and having a JMP inside it depending on the feature supported, let's simply
> have the REP; STOSB directly in the code:
>
> ...
> ffffffff81000442: 4c 89 d7 mov %r10,%rdi
> ffffffff81000445: b9 00 10 00 00 mov $0x1000,%ecx
>
> <---- new memset
> ffffffff8100044a: f3 aa rep stos %al,%es:(%rdi)
> ffffffff8100044c: 90 nop
> ffffffff8100044d: 90 nop
> ffffffff8100044e: 90 nop
You can fit the entire "xor eax, eax; rep stosb" inside the call instruction:
that is 2 + 2 bytes of code against the 5-byte call.
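A minimal sketch of that idea, assuming x86-64 and GCC/Clang inline asm.
zero_bytes_inline() and clear_page_demo() are hypothetical names for
illustration, not memset0() itself, and the non-x86-64 branch is only there
so the example builds elsewhere:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: zero n bytes at p.
 * On x86-64 this emits "xor %eax,%eax; rep stosb" (31 c0 f3 aa, 4 bytes).
 * RDI and RCX are read-write operands; only RAX and flags are clobbered.
 */
static inline void zero_bytes_inline(void *p, size_t n)
{
#if defined(__x86_64__)
	__asm__ volatile(
		"xorl %%eax, %%eax\n\t"
		"rep stosb"
		: "+D" (p), "+c" (n)
		:
		: "rax", "cc", "memory");
#else
	/* Fallback so the sketch still compiles on other architectures. */
	memset(p, 0, n);
#endif
}

/* Usage: clear a 0x1000-byte buffer, the same count as in the quoted disassembly. */
static void clear_page_demo(void)
{
	static unsigned char page[0x1000];
	size_t i;

	memset(page, 0xaa, sizeof(page));
	zero_bytes_inline(page, sizeof(page));
	for (i = 0; i < sizeof(page); i++)
		assert(page[i] == 0);
}
```

Since the fill value is always zero, nothing has to be loaded into %al from
outside the sequence, and the clobber list stays short.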
> /* clobbers used by memset_orig() and memset_rep_good() */
> : "rsi", "rdx", "r8", "r9", "memory");
eh... I'd just drop it. These registers screw up everything.
Time to rebase memset0().