Message-Id: <20220804180805.9077-1-knscarlet@gnuweeb.org>
Date: Thu, 4 Aug 2022 18:08:05 +0000
From: Kanna Scarlet <knscarlet@...weeb.org>
To: Borislav Petkov <bp@...en8.de>
Cc: Kanna Scarlet <knscarlet@...weeb.org>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>,
Dave Hansen <dave.hansen@...ux.intel.com>,
"H. Peter Anvin" <hpa@...or.com>, x86@...nel.org,
Ard Biesheuvel <ardb@...nel.org>,
Bill Metzenthen <billm@...bpc.org.au>,
Brijesh Singh <brijesh.singh@....com>,
Joerg Roedel <jroedel@...e.de>,
Josh Poimboeuf <jpoimboe@...nel.org>,
"Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
Mark Rutland <mark.rutland@....com>,
Michael Roth <michael.roth@....com>,
Peter Zijlstra <peterz@...radead.org>,
Sean Christopherson <seanjc@...gle.com>,
Steven Rostedt <rostedt@...dmis.org>,
Ammar Faizi <ammarfaizi2@...weeb.org>,
GNU/Weeb Mailing List <gwml@...r.gnuweeb.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 1/1] x86: Change mov $0, %reg with xor %reg, %reg
On 8/4/22 10:53 PM, Borislav Petkov wrote:
> Bonus points if you find out what other advantage
>
> XOR reg,reg
>
> has when it comes to clearing integer registers.
Hello sir Borislav,
Thank you for your response. I searched for other advantages of
xor reg,reg and found this Stack Overflow answer:
https://stackoverflow.com/a/33668295/7275114
"xor (being a recognized zeroing idiom, unlike mov reg, 0) has some
obvious and some subtle advantages:
1. smaller code-size than mov reg,0. (All CPUs)
2. avoids partial-register penalties for later code.
(Intel P6-family and SnB-family).
3. doesn't use an execution unit, saving power and freeing up
execution resources. (Intel SnB-family)
4. smaller uop (no immediate data) leaves room in the uop cache-line
for nearby instructions to borrow if needed. (Intel SnB-family).
5. doesn't use up entries in the physical register file. (Intel
SnB-family (and P4) at least, possibly AMD as well since they use
a similar PRF design instead of keeping register state in the ROB
like Intel P6-family microarchitectures.)"
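To make point 1 concrete, here is a minimal sketch comparing the encodings. The byte sequences are what I believe GNU as emits by default for these three instructions; the array and function names are only illustrative and do not come from any patch.

```c
#include <stddef.h>

/* Hand-copied machine code, assuming default GNU as output (no size optimization). */
static const unsigned char mov_imm0_rax[] = {
    0x48, 0xc7, 0xc0, 0x00, 0x00, 0x00, 0x00   /* mov  $0, %rax   (REX.W + imm32) */
};
static const unsigned char mov_imm0_eax[] = {
    0xb8, 0x00, 0x00, 0x00, 0x00               /* movl $0, %eax   (opcode + imm32) */
};
static const unsigned char xor_eax_eax[] = {
    0x31, 0xc0                                 /* xorl %eax, %eax (opcode + ModRM) */
};

/* Bytes saved by replacing movl $0, %eax with xorl %eax, %eax. */
size_t bytes_saved_by_xor(void)
{
    return sizeof(mov_imm0_eax) - sizeof(xor_eax_eax);
}

/* Bytes saved by replacing mov $0, %rax with movl $0, %eax. */
size_t bytes_saved_by_movl(void)
{
    return sizeof(mov_imm0_rax) - sizeof(mov_imm0_eax);
}
```

Registers r8 to r15 need a REX prefix, so the xor form is one byte longer there, but it is still shorter than either mov form.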
Should I add all of these to the explanation, sir? I will send a v2
revision tomorrow.
We also found more files to patch with this command:
grep -rE "mov.?\s+\\$\\0\s*," arch/x86
It shows many moves of an immediate zero to a 64-bit register in
arch/x86/crypto/curve25519-x86_64.c, but the next instruction may depend
on the previous %rflags value. We are afraid to change these to xor
because xor modifies %rflags. Instead, we will try changing them to
movl $0, %r32 to reduce the code size; writing a 32-bit register
zero-extends into the full 64-bit register, so the result is the same.
For example, here cmovc depends on the carry flag in %rflags left by
the preceding adcx:
" adcx %1, %%r11;"
" movq %%r11, 24(%2);"
/* Step 3: Fold the carry bit back in; guaranteed not to carry at this point */
" mov $0, %%rax;"
" cmovc %%rdx, %%rax;"
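The hazard can be reproduced in isolation with a small sketch. It assumes x86-64 and GCC or Clang inline asm; stc stands in for an adcx that carried out, the function names are hypothetical, and the value passed in is arbitrary.

```c
/* mov $0 leaves CF alone, so cmovc still sees the carry set by stc. */
unsigned long long fold_with_mov(unsigned long long if_carry)
{
    unsigned long long out;
    __asm__ volatile(
        "stc\n\t"               /* pretend the previous adcx carried out */
        "mov $0, %0\n\t"
        "cmovc %1, %0"
        : "=&r"(out) : "r"(if_carry) : "cc");
    return out;
}

/* xor clears CF, so cmovc no longer fires and the carry is lost. */
unsigned long long fold_with_xor(unsigned long long if_carry)
{
    unsigned long long out;
    __asm__ volatile(
        "stc\n\t"
        "xor %k0, %k0\n\t"      /* %k0 names the 32-bit half of the register */
        "cmovc %1, %0"
        : "=&r"(out) : "r"(if_carry) : "cc");
    return out;
}

/* movl $0 to the 32-bit half keeps CF and still zeroes all 64 bits. */
unsigned long long fold_with_movl(unsigned long long if_carry)
{
    unsigned long long out;
    __asm__ volatile(
        "stc\n\t"
        "movl $0, %k0\n\t"
        "cmovc %1, %0"
        : "=&r"(out) : "r"(if_carry) : "cc");
    return out;
}
```

With CF set, the mov and movl variants return the value passed in, while the xor variant returns 0. That is why the shorter movl form looks like a safe replacement in this file and xor does not.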
Thanks.
Regards,
--
Kanna Scarlet