Message-ID: <CAFULd4ZzoW+vP_pa1hEF--gvsG8yaPLU8S7oBkJBZLP4Tirepw@mail.gmail.com>
Date: Sat, 29 Mar 2025 09:48:14 +0100
From: Uros Bizjak <ubizjak@...il.com>
To: Ingo Molnar <mingo@...nel.org>
Cc: x86@...nel.org, linux-kernel@...r.kernel.org, 
	Thomas Gleixner <tglx@...utronix.de>, Borislav Petkov <bp@...en8.de>, 
	Dave Hansen <dave.hansen@...ux.intel.com>, "H. Peter Anvin" <hpa@...or.com>
Subject: Re: [PATCH 2/2] x86/bitops: Fix false output register dependency of
 TZCNT insn

On Fri, Mar 28, 2025 at 11:28 PM Ingo Molnar <mingo@...nel.org> wrote:
>
>
> * Uros Bizjak <ubizjak@...il.com> wrote:
>
> > On Tue, Mar 25, 2025 at 10:43 PM Ingo Molnar <mingo@...nel.org> wrote:
> > >
> > >
> > > * Uros Bizjak <ubizjak@...il.com> wrote:
> > >
> > > > On Haswell and later Intel processors, the TZCNT instruction appears
> > > > to have a false dependency on the destination register. Even though
> > > > the instruction only writes to it, the instruction will wait until
> > > > destination is ready before executing. This false dependency
> > > > was fixed for Skylake (and later) processors.
> > > >
> > > > Fix false dependency by clearing the destination register first.
> > > >
> > > > The x86_64 defconfig object size increases by 4215 bytes:
> > > >
> > > >           text           data     bss      dec            hex filename
> > > >       27342396        4642999  814852 32800247        1f47df7 vmlinux-old.o
> > > >       27346611        4643015  814852 32804478        1f48e7e vmlinux-new.o
> > >
> > > Yeah, so Skylake was released in 2015, about a decade ago.
> > >
> > > So we'd be making the kernel larger for an unquantified
> > > micro-optimization for CPUs that almost nobody uses anymore.
> > > That's a bad trade-off.
> >
> > Yes, 4.2k seems a bit excessive. OTOH, I'd not say that the issue is
> > a micro-optimization, it is bordering on the hardware bug.
>
> Has this been quantified, and do we really care about the
> micro-performance of ~10-year old CPUs, especially at the
> expense of modern CPUs?

No, although the change would be a one-liner now. Without specially
crafted benchmark loops the impact is not noticeable, and typical
kernel usage of these instructions is not that sensitive to the
destination register dependency.

Thanks,
Uros.
