Message-ID: <CAFULd4YNvz2rJEJDjacCeWak-JZNUfMB5LuM+qAwn_DCcn-CUg@mail.gmail.com>
Date: Wed, 5 Mar 2025 20:47:46 +0100
From: Uros Bizjak <ubizjak@...il.com>
To: Linus Torvalds <torvalds@...uxfoundation.org>
Cc: Borislav Petkov <bp@...en8.de>, Dave Hansen <dave.hansen@...el.com>, x86@...nel.org, 
	linux-kernel@...r.kernel.org, Peter Zijlstra <peterz@...radead.org>, 
	Thomas Gleixner <tglx@...utronix.de>, Ingo Molnar <mingo@...nel.org>, 
	Dave Hansen <dave.hansen@...ux.intel.com>, "H. Peter Anvin" <hpa@...or.com>
Subject: Re: [PATCH -tip] x86/locking/atomic: Use asm_inline for atomic
 locking insns

On Wed, Mar 5, 2025 at 6:04 PM Linus Torvalds
<torvalds@...uxfoundation.org> wrote:
>
> On Tue, 4 Mar 2025 at 22:54, Uros Bizjak <ubizjak@...il.com> wrote:
> >
> > Even to my surprise, the patch has some noticeable effects on the
> > performance, please see the attachment in [1] for LMBench data or [2]
> > for some excerpts from the data. So, I think the patch has potential
> > to improve the performance.
>
> I suspect some of the performance difference - which looks
> unexpectedly large - is due to having run them on a CPU with the
> horrendous indirect return costs, and then inlining can make a huge
> difference.
>
> Regardless, I absolutely think that using asm_inline here is the right
> thing for the locked instructions.

The CPU is an "Intel(R) Core(TM) i7-10710U".

> That said, I do want to bring up another issue: maybe it's time to
> just retire the LOCK_PREFIX thing entirely?
>
> It harkens back to Ye Olde Days when UP was the norm, and we didn't
> want to pay the cost of lock prefixes when the kernel was built for
> SMP but was run on an UP machine.
>
> And honestly, none of that makes sense any more. You can't buy a UP
> machine any more, and the only UP case would be some silly minimal
> virtual environment, and if people really care about that minimal
> case, they should just compile the kernel without SMP support.
> Because UP has gone from being the default to being irrelevant. At
> least for x86-64.
>
> So I think we should just get rid of LOCK_PREFIX_HERE and the
> smp_locks section entirely.

Please note that this functionality is shared with the i386 target, so
the removal proposed above would penalize 32-bit targets to some
degree. The situation w.r.t. UP vs. SMP is less clear there; some
distro may still provide i386 SMP kernels, which would then run
unoptimized on UP systems.

From the compiler's POV, now that the "lock; " prefix has lost its
semicolon, removing LOCK_PREFIX_HERE or using asm_inline would result
in exactly the same code. The problematic 31k code size increase (1.1%)
with -O2 is inevitable either way if we want to move forward.
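
To illustrate why the two approaches look the same to the compiler,
here is a minimal userspace sketch (not the kernel's actual code). GCC
estimates the size of an asm statement from the number of lines in its
template, so the section bookkeeping in front of the lock prefix makes
a single instruction look like several; the "inline" qualifier tells
the inliner to treat the statement as minimal size. The macro is
modeled loosely on LOCK_PREFIX_HERE, and all demo_* / DEMO_* names, the
.demo_locks section name, and the compiler version checks are
assumptions made for this example only.

```c
/* Use the "inline" asm qualifier where the compiler is assumed to support
 * it (version thresholds are conservative guesses); else plain asm. */
#if (defined(__GNUC__) && !defined(__clang__) && __GNUC__ >= 9) || \
    (defined(__clang__) && __clang_major__ >= 11)
#define demo_asm_inline __asm__ __inline
#else
#define demo_asm_inline __asm__
#endif

#if defined(__linux__) && (defined(__x86_64__) || defined(__i386__))
/* Hypothetical stand-in for LOCK_PREFIX_HERE: record the location of the
 * lock prefix in a side section. These extra template lines are what
 * inflate the compiler's size estimate for the asm statement. */
#define DEMO_LOCK_PREFIX_HERE \
	".pushsection .demo_locks,\"a\"\n" \
	".balign 4\n" \
	".long 671f - .\n" \
	".popsection\n" \
	"671:"
#define DEMO_LOCK_PREFIX DEMO_LOCK_PREFIX_HERE "\n\tlock "

static inline void demo_atomic_inc(int *v)
{
	/* One real instruction; the rest is assembler bookkeeping. */
	demo_asm_inline volatile(DEMO_LOCK_PREFIX "incl %0"
				 : "+m" (*v) : : "memory");
}
#else
/* Portable fallback so the sketch still builds on other targets. */
static inline void demo_atomic_inc(int *v)
{
	__atomic_fetch_add(v, 1, __ATOMIC_SEQ_CST);
}
#endif

/* Small caller whose inlining decision depends on the asm size estimate. */
static inline int demo_count_to(int n)
{
	int counter = 0;

	for (int i = 0; i < n; i++)
		demo_atomic_inc(&counter);
	return counter;
}
```

Calling demo_count_to(n) performs n locked increments. Comparing the
-O2 output with demo_asm_inline redefined as plain __asm__ is one way
to see whether the inlining decisions change on a given compiler.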

My proposal would be to use what a modern compiler offers. By using
asm_inline, we can keep the status quo (mostly for i386) for some more
time, and still move forward. And we get that -Os code size *decrease*
as a bonus for those that want to shave the last byte from the kernel.

Thanks,
Uros.