lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAHk-=whxPoFnZ4cLKh4X3m4qVcaak__G8+0iG-aOGO7YkS3LdA@mail.gmail.com>
Date: Mon, 12 May 2025 13:39:19 -0700
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Peter Zijlstra <peterz@...radead.org>
Cc: dan.j.williams@...el.com, linux-cxl@...r.kernel.org, 
	linux-kernel@...r.kernel.org, David Lechner <dlechner@...libre.com>, 
	Ingo Molnar <mingo@...nel.org>, 
	"Fabio M. De Francesco" <fabio.maria.de.francesco@...ux.intel.com>, 
	Davidlohr Bueso <dave@...olabs.net>, Jonathan Cameron <jonathan.cameron@...wei.com>, 
	Dave Jiang <dave.jiang@...el.com>, Alison Schofield <alison.schofield@...el.com>, 
	Vishal Verma <vishal.l.verma@...el.com>, Ira Weiny <ira.weiny@...el.com>
Subject: Re: [PATCH 1/7] cleanup: Introduce DEFINE_ACQUIRE() a CLASS() for
 conditional locking

On Mon, 12 May 2025 at 11:58, Peter Zijlstra <peterz@...radead.org> wrote:
>
> > GCC is 'stupid' and this generates atrocious code. I'll play with it.
>
> PRE:
>     bf9e:       48 85 db                test   %rbx,%rbx
>     bfa1:       74 1a                   je     bfbd <foo+0x5d>
>     bfa3:       48 81 fb 00 f0 ff ff    cmp    $0xfffffffffffff000,%rbx
>     bfaa:       77 11                   ja     bfbd <foo+0x5d>
>
> POST:
>     bf9e:       48 8d 43 ff             lea    -0x1(%rbx),%rax
>     bfa2:       48 3d ff ef ff ff       cmp    $0xffffffffffffefff,%rax
>     bfa8:       77 11                   ja     bfbb <foo+0x5b>

I'm not convinced that's actually an improvement.

Yes, it's one less instruction, and three bytes shorter. But it uses
an extra register, so now it might make surrounding code much worse by
making register allocation have a harder time.

If you *really* care about this, I think you should realize that the
non-error case is a valid kernel pointer.

And we could add some architecture-specific function to check for "is
this a valid non-NULL and non-error pointer" with a fallback to the
generic case.

Because then on a platform like x86, where kernel pointers are always
negative, but not *as* negative as the error pointers, you can check
for that with a single compare.

The logic is "add MAX_ERRNO, and if it's still negative, it wasn't
NULL and it wasn't ERR_PTR".

And while 'add' needs a destination register, 'sub' with the negated
value does not, and is called 'cmp'.

So I think you can do that with

        cmp $-MAX_ERRNO,...
        js ...

Sadly, I can't seem to get gcc to generate that code. But I didn't try
very hard.

              Linus

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ