[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAHk-=whxPoFnZ4cLKh4X3m4qVcaak__G8+0iG-aOGO7YkS3LdA@mail.gmail.com>
Date: Mon, 12 May 2025 13:39:19 -0700
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Peter Zijlstra <peterz@...radead.org>
Cc: dan.j.williams@...el.com, linux-cxl@...r.kernel.org,
linux-kernel@...r.kernel.org, David Lechner <dlechner@...libre.com>,
Ingo Molnar <mingo@...nel.org>,
"Fabio M. De Francesco" <fabio.maria.de.francesco@...ux.intel.com>,
Davidlohr Bueso <dave@...olabs.net>, Jonathan Cameron <jonathan.cameron@...wei.com>,
Dave Jiang <dave.jiang@...el.com>, Alison Schofield <alison.schofield@...el.com>,
Vishal Verma <vishal.l.verma@...el.com>, Ira Weiny <ira.weiny@...el.com>
Subject: Re: [PATCH 1/7] cleanup: Introduce DEFINE_ACQUIRE() a CLASS() for
conditional locking
On Mon, 12 May 2025 at 11:58, Peter Zijlstra <peterz@...radead.org> wrote:
>
> > GCC is 'stupid' and this generates atrocious code. I'll play with it.
>
> PRE:
> bf9e: 48 85 db test %rbx,%rbx
> bfa1: 74 1a je bfbd <foo+0x5d>
> bfa3: 48 81 fb 00 f0 ff ff cmp $0xfffffffffffff000,%rbx
> bfaa: 77 11 ja bfbd <foo+0x5d>
>
> POST:
> bf9e: 48 8d 43 ff lea -0x1(%rbx),%rax
> bfa2: 48 3d ff ef ff ff cmp $0xffffffffffffefff,%rax
> bfa8: 77 11 ja bfbb <foo+0x5b>
I'm not convinced that's actually an improvement.
Yes, it's one less instruction, and three bytes shorter. But it uses
an extra register, so now it might make surrounding code much worse by
making register allocation have a harder time.
If you *really* care about this, I think you should realize that the
non-error case is a valid kernel pointer.
And we could add some architecture-specific function to check for "is
this a valid non-NULL and non-error pointer" with a fallback to the
generic case.
Because then on a platform like x86, where kernel pointers are always
negative, but not *as* negative as the error pointers, you can check
for that with a single compare.
The logic is "add MAX_ERRNO, and if it's still negative, it wasn't
NULL and it wasn't ERR_PTR".
And while 'add' needs a destination register, 'sub' with the negated
value does not, and is called 'cmp'.
So I think you can do that with
cmp $-MAX_ERRNO,...
js ...
Sadly, I can't seem to get gcc to generate that code. But I didn't try
very hard.
Linus
Powered by blists - more mailing lists