Message-ID: <d18611c7-9108-46f7-a5a5-6c8e0069de9b@os.amperecomputing.com>
Date: Thu, 23 May 2024 15:13:23 -0700
From: Yang Shi <yang@...amperecomputing.com>
To: Catalin Marinas <catalin.marinas@....com>,
"Christoph Lameter (Ampere)" <cl@...ux.com>
Cc: will@...nel.org, anshuman.khandual@....com, scott@...amperecomputing.com,
linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org
Subject: Re: [v2 PATCH] arm64: mm: force write fault for atomic RMW
instructions
On 5/23/24 2:34 PM, Catalin Marinas wrote:
> On Thu, May 23, 2024 at 12:43:34PM -0700, Christoph Lameter (Ampere) wrote:
>> On Thu, 23 May 2024, Catalin Marinas wrote:
>>>>> While this class includes all atomics that currently require write
>>>>> permission, there's some unallocated space in this range and we don't
>>>>> know what future architecture versions may introduce. Unfortunately we
>>>>> need to check each individual atomic op in this class (not sure what the
>>>>> overhead will be).
>>>> Can you tell us which bits or pattern is not allocated? Maybe we can exclude
>>>> that from the pattern.
>>> Yes, it may be easier to exclude those patterns. See the Arm ARM K.a
>>> section C4.1.94.29 (page 791).
>> Hmmm. We could consult an exception table once the pattern matches to reduce
>> the overhead.
> Yeah, check the atomic class first and then go into the finer-grained
> details. I think this would reduce the overhead for non-atomic
> instructions.
If I read the instruction encoding correctly, the unallocated
encodings are determined by the following fields:
- size
- VAR
- o3
- opc
To exclude them I think we can do something like:
if atomic instructions {
    if V == 1
        return false;
    if o3 opc == 111x
        return false;
    switch VAR {
    000
        check o3 and opc
    001
        check o3 and opc
    010
        check o3 and opc
    011
        check o3 and opc
    default
        if size != 11
            check o3 and opc
    }
}
So in the worst case it may take 4 branches plus one for each possible
unallocated combination of o3 and opc. I saw 5 different combinations of
o3 and opc, so 9 branches in the worst case.
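
A minimal C sketch of the two-stage check described above: match the atomic class first, then reject the unallocated patterns. Everything here is an assumption for illustration, not the patch itself. The bit positions (V at bit 26, A:R at bits 23:22, o3:opc at bits 15:12) and the class mask/value are my reading of the A64 atomic memory operations encoding. The function name and the `unalloc` table are hypothetical. The table has one 16-bit mask per A:R value, with bit n set when o3:opc == n should be treated as unallocated; its real contents would have to come from the Arm ARM. The size-dependent default arm is left out because the unallocated combinations are not spelled out here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed fixed bits of the class: [29:27]=111, [25:24]=00, [21]=1, [11:10]=00 */
#define ATOMIC_CLASS_MASK 0x3b200c00u
#define ATOMIC_CLASS_VAL  0x38200000u

/* Extract bits [hi:lo] of a 32-bit instruction word. */
static inline uint32_t insn_bits(uint32_t insn, unsigned hi, unsigned lo)
{
	return (insn >> lo) & ((1u << (hi - lo + 1)) - 1u);
}

/*
 * Hypothetical helper: returns true if insn falls in the atomic class and
 * is not rejected by any of the exclusion checks.
 */
static bool insn_in_allocated_atomic_class(uint32_t insn,
					    const uint16_t unalloc[4])
{
	uint32_t o3_opc, ar;

	assert(unalloc != NULL);

	/* Stage 1: cheap class match, so non-atomic instructions exit early. */
	if ((insn & ATOMIC_CLASS_MASK) != ATOMIC_CLASS_VAL)
		return false;

	/* Stage 2: finer-grained exclusions. */
	if (insn_bits(insn, 26, 26))		/* V == 1 */
		return false;

	o3_opc = insn_bits(insn, 15, 12);
	if ((o3_opc & 0xeu) == 0xeu)		/* o3:opc == 111x */
		return false;

	ar = insn_bits(insn, 23, 22);		/* A:R selects the table row */
	return !(unalloc[ar] & (1u << o3_opc));
}
```

With an all-zero table, a word such as 0xb8200041 (which I believe encodes LDADD w0, w1, [x2]) passes, while an ordinary load such as 0xf9400020 is rejected by the class match alone. A table lookup replaces the per-combination branches, so the worst case here is a fixed number of branches rather than 9.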
>
>> However, the harm done I think is acceptable even if we leave things as is.
>> In the worst case we create unnecessary write fault processing for an
>> "atomic op" that does not need write access. Also: Why would it need to be
>> atomic if it does not write???
> I'm thinking of some conditional instruction that states no write if
> condition fails. But it could be even worse if the architects decide to
> reuse that unallocated space for some instructions that have nothing to
> do with the atomic accesses.
Even if the condition fails, forcing a write fault still seems fine
IIUC. I suppose the read fault will happen regardless of the
condition, and then a page with all-zero content is installed. This is
guaranteed. We just end up with write permission instead of read-only
permission. We are also in this state transiently with the currently
supported atomic instructions.
But if they are allocated to non-atomic instructions, we will have to do
fine-grained decoding, though it may be easier since we can just filter
out those non-atomic instructions? Anyway, it depends on how they will
be used. Hopefully this won't happen.
>
> It's something we need to clarify with them but I'm about to go on
> holiday for a week, so I won't be able to check.
Have a good holiday.
>
>> The ultimate solution would be to change the spec so that arm processors can
>> skip useless read faults.
> I raised this already, waiting for feedback from the architects.
Thank you so much.
>