Message-ID: <8E8C8B78-2D92-4D34-BA89-909F7F2FEA55@zytor.com>
Date: Mon, 28 Apr 2025 06:41:09 -0700
From: "H. Peter Anvin" <hpa@...or.com>
To: Ingo Molnar <mingo@...nel.org>,
Linus Torvalds <torvalds@...ux-foundation.org>
CC: Andrew Cooper <andrew.cooper3@...rix.com>, Arnd Bergmann <arnd@...db.de>,
Arnd Bergmann <arnd@...nel.org>, Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
Dave Hansen <dave.hansen@...ux.intel.com>, x86@...nel.org,
Juergen Gross <jgross@...e.com>,
Boris Ostrovsky <boris.ostrovsky@...cle.com>,
Alexander Usyskin <alexander.usyskin@...el.com>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Mateusz Jończyk <mat.jonczyk@...pl>,
Mike Rapoport <rppt@...nel.org>, Ard Biesheuvel <ardb@...nel.org>,
Peter Zijlstra <peterz@...radead.org>, linux-kernel@...r.kernel.org,
xen-devel@...ts.xenproject.org
Subject: Re: [PATCH] bitops/32: Convert variable_ffs() and fls() zero-case handling to C
On April 28, 2025 12:14:40 AM PDT, Ingo Molnar <mingo@...nel.org> wrote:
>
>* Ingo Molnar <mingo@...nel.org> wrote:
>
>> And once we remove 486, I think we can do the optimization below to
>> just assume the output doesn't get clobbered by BS*L in the
>> zero-case, right?
>>
>> In the text size space it's a substantial optimization on x86-32
>> defconfig:
>>
>>    text       data     bss      dec       hex      filename
>>  16,577,728  7598826  1744896  25921450  18b87aa  vmlinux.vanilla      # CMOV+BS*L
>>  16,577,908  7598838  1744896  25921642  18b886a  vmlinux.linus_patch  # if()+BS*L
>>  16,573,568  7602922  1744896  25921386  18b876a  vmlinux.noclobber    # BS*L
>
>And BTW, *that* is a price that all of non-486 x86-32 was paying for
>486 support...
>
>And, just out of intellectual curiosity, I also tried to measure the
>code generation price of the +1 standards-quirk in the fls()/ffs()
>interface as well:
>
>    text       data     bss      dec       hex      filename
>  16,577,728  7598826  1744896  25921450  18b87aa  vmlinux.vanilla      # CMOV+BS*L
>  16,577,908  7598838  1744896  25921642  18b886a  vmlinux.linus_patch  # if()+BS*L
>  16,573,568  7602922  1744896  25921386  18b876a  vmlinux.noclobber    # BS*L
>  ..........
>  16,573,552  7602922  1744896  25921370  18b875a  vmlinux.broken       # BROKEN: 0 baseline instead of 1
>
>... and unless I messed up the patch, it seems to have a surprisingly
>low impact - maybe because the compiler can amortize its cost by
>adjusting all dependent code mostly at build time, so the +1 doesn't
>end up being generated most of the time?
>
>Thanks,
>
> Ingo
>
>===============================>
>
>This broken patch is broken: it intentionally breaks the ffs()/fls()
>interface in an attempt to measure the code generation effects of
>interface details.
>
>NOT-Signed-off-by: <anyone@...where.anytime>
>---
> arch/x86/include/asm/bitops.h | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
>diff --git a/arch/x86/include/asm/bitops.h b/arch/x86/include/asm/bitops.h
>index e3e94a806656..21707696bafe 100644
>--- a/arch/x86/include/asm/bitops.h
>+++ b/arch/x86/include/asm/bitops.h
>@@ -318,7 +318,7 @@ static __always_inline int variable_ffs(int x)
> : "=r" (r)
> : ASM_INPUT_RM (x), "0" (-1));
>
>- return r + 1;
>+ return r;
> }
>
> /**
>@@ -362,7 +362,7 @@ static __always_inline int fls(unsigned int x)
> : "=r" (r)
> : ASM_INPUT_RM (x), "0" (-1));
>
>- return r + 1;
>+ return r;
> }
>
> /**

My recollection is that you can't assume that even on 586; I believe it is only safe on 686, but it has been a long time...
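
For reference, here is a minimal sketch of the interface semantics under discussion, with the zero case handled explicitly in C so that nothing depends on what BSF/BSR leave in the destination register. The sketch_* names are hypothetical and this is not the kernel code: it uses the GCC/Clang builtins __builtin_ctz() and __builtin_clz() instead of inline asm, and it says nothing about how any particular CPU behaves.

```c
#include <limits.h>

/* Hypothetical helper names, for illustration only. */

/* 1-based ffs(): position of the lowest set bit, or 0 if x == 0. */
static inline int sketch_ffs(int x)
{
	if (!x)
		return 0;	/* zero case in C, no reliance on BSF */
	return __builtin_ctz((unsigned int)x) + 1;
}

/* 1-based fls(): position of the highest set bit, or 0 if x == 0. */
static inline int sketch_fls(unsigned int x)
{
	if (!x)
		return 0;	/* zero case in C, no reliance on BSR */
	return (int)(sizeof(x) * CHAR_BIT) - __builtin_clz(x);
}

/*
 * 0-based variant, matching the intentionally broken measurement
 * patch quoted above: the zero case yields -1 instead of 0.
 */
static inline int sketch_ffs0(int x)
{
	return sketch_ffs(x) - 1;
}
```

With the preloaded -1 in the quoted asm, "r + 1" yields 0 for a zero input only if the instruction leaves the destination untouched; the explicit test above is what the if()+BS*L variant pays for.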