Message-ID: <20250624093258.4906c0e0@pumpkin>
Date: Tue, 24 Jun 2025 09:32:58 +0100
From: David Laight <david.laight.linux@...il.com>
To: Christophe Leroy <christophe.leroy@...roup.eu>
Cc: Michael Ellerman <mpe@...erman.id.au>, Nicholas Piggin
<npiggin@...il.com>, Naveen N Rao <naveen@...nel.org>, Madhavan Srinivasan
<maddy@...ux.ibm.com>, Alexander Viro <viro@...iv.linux.org.uk>, Christian
Brauner <brauner@...nel.org>, Jan Kara <jack@...e.cz>, Thomas Gleixner
<tglx@...utronix.de>, Ingo Molnar <mingo@...hat.com>, Peter Zijlstra
<peterz@...radead.org>, Darren Hart <dvhart@...radead.org>, Davidlohr Bueso
<dave@...olabs.net>, Andre Almeida <andrealmeid@...lia.com>, Andrew Morton
<akpm@...ux-foundation.org>, Dave Hansen <dave.hansen@...ux.intel.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
linux-kernel@...r.kernel.org, linuxppc-dev@...ts.ozlabs.org,
linux-fsdevel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCH 0/5] powerpc: Implement masked user access
On Tue, 24 Jun 2025 07:27:47 +0200
Christophe Leroy <christophe.leroy@...roup.eu> wrote:
> On 22/06/2025 at 18:20, David Laight wrote:
> > On Sun, 22 Jun 2025 11:52:38 +0200
> > Christophe Leroy <christophe.leroy@...roup.eu> wrote:
> >
> >> Masked user access avoids the address/size verification by access_ok().
> >> Although its main purpose is to skip the speculation in the
> >> verification of user address and size hence avoid the need of spec
> >> mitigation, it also has the advantage to reduce the amount of
> >> instructions needed so it also benefits to platforms that don't
> >> need speculation mitigation, especially when the size of the copy is
> >> not known at build time.
> >
> > It also removes a conditional branch that is quite likely to be
> > statically predicted 'the wrong way'.
>
> But include/asm-generic/access_ok.h defines access_ok() as:
>
> #define access_ok(addr, size) likely(__access_ok(addr, size))
>
> So GCC uses the 'unlikely' variant of the branch instruction to force
> the correct prediction, doesn't it ?
Nope...
Most architectures don't have likely/unlikely variants of branches.
So all gcc can do is decide which path is the fall-through and
whether the branch is forwards or backwards.
Additionally, unless there is code in both the 'if' and 'else' clauses,
the [un]likely seems to have no effect.
So on simple CPUs that predict 'backwards branches taken' you can get
the desired effect - but it may need an 'asm comment' to force the
compiler to generate the required branches (e.g. a forwards branch
directly to a backwards unconditional jump).
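
To make the point about hints concrete, here is a minimal sketch of what
likely() boils down to with GCC or Clang. The demo_* names and DEMO_LIMIT
are made up for illustration, and the check is only loosely modelled on an
access_ok()-style range test. The hint can only influence which path the
compiler lays out as the fall-through; it does not change the result.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for likely()/unlikely(); GCC and Clang only. */
#define demo_likely(x)   __builtin_expect(!!(x), 1)
#define demo_unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical limit standing in for the end of the user address range. */
#define DEMO_LIMIT 4096u

static_assert(DEMO_LIMIT > 0, "an empty range would reject every access");

/* Returns 0 when [addr, addr + size) fits below DEMO_LIMIT, -1 otherwise. */
static int demo_check_range(size_t addr, size_t size)
{
	/*
	 * Checking size first keeps DEMO_LIMIT - size from wrapping.
	 * Both arms contain code, so the compiler has a fall-through
	 * path to choose; the hint asks for the success path.
	 */
	if (demo_likely(size <= DEMO_LIMIT && addr <= DEMO_LIMIT - size))
		return 0;
	return -1;
}
```

Whether the generated branch is then predicted correctly still depends on
the CPU, which is the whole problem.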
On x86 it is all more complicated.
I think the pre-fetch code is likely to assume 'not taken' (but might
use stale info on the cache line).
The predictor itself never does 'static prediction' - it is always
based on the referenced branch prediction data structure.
So, unless you are in a loop (eg running a benchmark!) there is pretty
much a 50% chance of a branch mispredict.
I've been trying to benchmark different versions of the u64 * u64 / u64
function - and I think mispredicted branches make a big difference.
I need to sit down and sequence the test cases so that I can see
the effect of each branch!
>
> >
> >> Unlike x86_64 which masks the address to 'all bits set' when the
> >> user address is invalid, here the address is set to an address in
> >> the gap. It avoids relying on the zero page to catch offset
> >> accesses. On book3s/32 it makes sure the opening remains on user
> >> segment. The overcost is a single instruction in the masking.
> >
> > That isn't true (any more).
> > Linus changed the check to (approx):
> > if (uaddr > TASK_SIZE)
> > uaddr = TASK_SIZE;
> > (Implemented with a conditional move)
>
> Ah ok, I overlooked that, I didn't know the cmove instruction, seems
> similar to the isel instruction on powerpc e500.
It got added for the Pentium Pro (P6) - I learnt 8086 :-)
I suspect x86 got there first...
Although called 'conditional move' I very much suspect the write is
actually unconditional.
So the hardware implementation is much the same as 'add carry' except
the ALU operation is a simple multiplex.
Which means it is unlikely to be speculative.
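
For reference, the clamp quoted above can be sketched in plain C as below.
DEMO_TASK_SIZE and demo_mask_user_address() are hypothetical names with an
arbitrary limit, not the kernel's definitions. A compiler may turn the
ternary into a conditional move (or isel), but nothing in C guarantees it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical limit for illustration; not the kernel's real TASK_SIZE. */
#define DEMO_TASK_SIZE UINT64_C(0x0000800000000000)

static_assert(DEMO_TASK_SIZE < UINT64_MAX,
	      "need at least one address above the limit to clamp");

/*
 * Clamp an untrusted address so it can never exceed DEMO_TASK_SIZE.
 * Valid addresses pass through unchanged; anything above the limit
 * becomes the limit itself, where an access is expected to fault.
 */
static inline uint64_t demo_mask_user_address(uint64_t uaddr)
{
	return uaddr > DEMO_TASK_SIZE ? DEMO_TASK_SIZE : uaddr;
}
```

The result is a data dependency on the compare rather than a branch, so
there is no branch to mispredict on that path.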
David
>
> Christophe
>