[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250513070918.GB25763@noisy.programming.kicks-ass.net>
Date: Tue, 13 May 2025 09:09:18 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: dan.j.williams@...el.com, linux-cxl@...r.kernel.org,
linux-kernel@...r.kernel.org, David Lechner <dlechner@...libre.com>,
Ingo Molnar <mingo@...nel.org>,
"Fabio M. De Francesco" <fabio.maria.de.francesco@...ux.intel.com>,
Davidlohr Bueso <dave@...olabs.net>,
Jonathan Cameron <jonathan.cameron@...wei.com>,
Dave Jiang <dave.jiang@...el.com>,
Alison Schofield <alison.schofield@...el.com>,
Vishal Verma <vishal.l.verma@...el.com>,
Ira Weiny <ira.weiny@...el.com>
Subject: Re: [PATCH 1/7] cleanup: Introduce DEFINE_ACQUIRE() a CLASS() for
conditional locking
On Mon, May 12, 2025 at 01:39:19PM -0700, Linus Torvalds wrote:
> On Mon, 12 May 2025 at 11:58, Peter Zijlstra <peterz@...radead.org> wrote:
> >
> > > GCC is 'stupid' and this generates atrocious code. I'll play with it.
> >
> > PRE:
> > bf9e: 48 85 db test %rbx,%rbx
> > bfa1: 74 1a je bfbd <foo+0x5d>
> > bfa3: 48 81 fb 00 f0 ff ff cmp $0xfffffffffffff000,%rbx
> > bfaa: 77 11 ja bfbd <foo+0x5d>
> >
> > POST:
> > bf9e: 48 8d 43 ff lea -0x1(%rbx),%rax
> > bfa2: 48 3d ff ef ff ff cmp $0xffffffffffffefff,%rax
> > bfa8: 77 11 ja bfbb <foo+0x5b>
>
> I'm not convinced that's actually an improvement.
>
> Yes, it's one less instruction, and three bytes shorter. But it uses
> an extra register, so now it might make surrounding code much worse by
> making register allocation have a harder time.
I was going for the one less branch, but yeah, register pressure :/
Typically this is at the end of a scope, and I was hoping this is where
you have free regs etc.
> If you *really* care about this, I think you should realize that the
> non-error case is a valid kernel pointer.
>
> And we could add some architecture-specific function to check for "is
> this a valid non-NULL and non-error pointer" with a fallback to the
> generic case.
>
> Because then on a platform like x86, where kernel pointers are always
> negative, but not *as* negative as the error pointers, you can check
> for that with a single compare.
>
> The logic is "add MAX_ERRNO, and if it's still negative, it wasn't
> NULL and it wasn't ERR_PTR".
>
> And while 'add' needs a destination register, 'sub' with the negated
> value does not, and is called 'cmp'.
>
> So I think you can do that with
>
> cmp $-MAX_ERRNO,...
> js ...
>
> Sadly, I can't seem to get gcc to generate that code. But I didn't try
> very hard.
And so try I must :-)
Powered by blists - more mailing lists