[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <83fb5685-a206-477c-bff3-03e0ebf4c40c@csgroup.eu>
Date: Thu, 26 Jun 2025 07:56:10 +0200
From: Christophe Leroy <christophe.leroy@...roup.eu>
To: David Laight <david.laight.linux@...il.com>,
Segher Boessenkool <segher@...nel.crashing.org>
Cc: Michael Ellerman <mpe@...erman.id.au>, Nicholas Piggin
<npiggin@...il.com>, Naveen N Rao <naveen@...nel.org>,
Madhavan Srinivasan <maddy@...ux.ibm.com>,
Alexander Viro <viro@...iv.linux.org.uk>,
Christian Brauner <brauner@...nel.org>, Jan Kara <jack@...e.cz>,
Thomas Gleixner <tglx@...utronix.de>, Ingo Molnar <mingo@...hat.com>,
Peter Zijlstra <peterz@...radead.org>, Darren Hart <dvhart@...radead.org>,
Davidlohr Bueso <dave@...olabs.net>, Andre Almeida <andrealmeid@...lia.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Dave Hansen <dave.hansen@...ux.intel.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
linux-kernel@...r.kernel.org, linuxppc-dev@...ts.ozlabs.org,
linux-fsdevel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCH 0/5] powerpc: Implement masked user access
Le 24/06/2025 à 23:08, David Laight a écrit :
> On Tue, 24 Jun 2025 13:25:05 -0500
> Segher Boessenkool <segher@...nel.crashing.org> wrote:
>
>>>> isel (which is base PowerPC, not something "e500" only) is a
>>>> computational instruction, it copies one of two registers to a third,
>>>> which of the two is decided by any bit in the condition register.
>>>
>>> Does that mean it could be used for all the ppc cpu variants?
>>
>> No, only things that implement architecture version of 2.03 or later.
>> That is from 2006, so essentially everything that is still made
>> implements it :-)
>>
>> But ancient things do not. Both 970 (Apple G5) and Cell BE do not yet
>> have it (they are ISA 2.01 and 2.02 respectively). And the older p5's
>> do not have it yet either, but the newer ones do.
For book3s64, GCC only use isel with -mcpu=power9 or -mcpu=power10
>>
>> And all classic PowerPC is ISA 1.xx of course. Medieval CPUs :-)
>
> That make more sense than the list in patch 5/5.
Sorry for the ambiguity. In patch 5/5 I was addressing only powerpc/32,
and as far as I know the only powerpc/32 supported by Linux that has
isel is the 85xx which has an e500 core.
For powerpc/64 we have less constraint than on powerpc32:
- Kernel memory starts at 0xc000000000000000
- User memory stops at 0x0010000000000000
So we can easily use 0x8000000000000000 as demarcation line, which
allows a 3 instructions sequence:
sradi r9,r3,63 => shirt right algebric: r9 = 0 when r3 >= 0; r9 = -1
when r3 < 0
andc r3,r3,r9 => and with complement: r3 unchanged when r9 = 0, r3 =
0 when r9 = -1
rldimi r3,r9,0,1 => Rotate left and mask insert: Insert highest bit of
r9 into r3
Whereas using isel requires a 4 instructions sequence:
li r9, 1 => load immediate: r9 = 1
rotldi r9, r9, 63 => rotate left: r9 = 0x8000000000000000
cmpld r3, r9 => compare logically
iselgt r3, r9, r3 => integer select greater than: select r3 when result
of above compare is <= ; select r9 when result of compare is >
>
>>
>>>> But sure, seen from very far off both isel and cmove can be used to
>>>> implement the ternary operator ("?:"), are similar in that way :-)
>>>
>>> Which is exactly what you want to avoid speculation.
>>
>> There are cheaper / simpler / more effective / better ways to get that,
>> but sure, everything is better than a conditional branch, always :-)
>
> Everything except a TLB miss :-)
>
> And for access_ok() avoiding the conditional is a good enough reason
> to use a 'conditional move' instruction.
> Avoiding speculation is actually free.
>
And on CPUs that are not affected by Spectre and Meltdown like powerpc
8xx or powerpc 603, masking the pointer is still worth it as it
generates better code, even if the masking involves branching.
That's the reason why I did:
- e500 (affected by Spectre v1) ==> Use isel ==> 3 instructions sequence:
lis r9, TASK_SIZE@h => load immediate shifted => r9 = (TASK_SIZE >> 16)
<< 16
cmplw r3, r9 => compare addr with TASK_SIZE
iselgt r3, r9, r3 => addr = TASK_SIZE when addr > TASK_SIZE; addr =
addr when <= TASK_SIZE
For others:
- If TASK_SIZE <= 0x80000000 and kernel >= 0x80000000 ==> 3 instructions
sequence similar to the 64 bits one above:
srawi r9,r3,63
andc r3,r3,r9
rlwimi r3,r9,0,0,0
- Otherwise, if speculation mitigation is required (e600), use the
6-instructions masking sequence (r3 contains the address to mask)
srwi r9,r3,17 => shift right: shift r3 by 17 to keep 15 bits (positive
16 bits)
subfic r9,r9,22527 => sub from immediate: r9 = (TASK_SIZE >> 17) - r9
=> r9 < 0 when r3 is greater
srawi r9,r9,31 => shift right algebric: r9 = 0 when r9 >= 0; r9 = -1
when r9 < 0
andis. r10,r9,45056 => and immediat shifted: r10 = r9 and upper part of
TASK_SIZE => r10 = TASK_SIZE when r3 > TASK_SIZE, r10 = 0 otherwise
andc r3,r3,r9 => and with complement: r3 is unchanged when r9 = 0 so
when r3 <= TASK_SIZE, r3 is cleared when r9 = -1 so when r3 > TASK_SIZE
or r3,r3,r10 => r3 = r3 | r10
- If no speculation mitigation is required, let GCC do as it wants
List of affected CPUs is available here:
https://docs.nxp.com/bundle/GUID-805CC0EA-4001-47AD-86CD-4F340751F6B7/page/GUID-17B5D04F-6471-4EC6-BEB9-DE4D0AFA034A.html
Powered by blists - more mailing lists