Message-ID: <Zl3qCajhEbC9pNAm@arm.com>
Date: Mon, 3 Jun 2024 17:06:33 +0100
From: Catalin Marinas <catalin.marinas@....com>
To: Yang Shi <yang@...amperecomputing.com>
Cc: "Christoph Lameter (Ampere)" <cl@...ux.com>, will@...nel.org,
	anshuman.khandual@....com, scott@...amperecomputing.com,
	linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org
Subject: Re: [v2 PATCH] arm64: mm: force write fault for atomic RMW
 instructions

On Thu, May 23, 2024 at 03:13:23PM -0700, Yang Shi wrote:
> On 5/23/24 2:34 PM, Catalin Marinas wrote:
> > On Thu, May 23, 2024 at 12:43:34PM -0700, Christoph Lameter (Ampere) wrote:
> > > On Thu, 23 May 2024, Catalin Marinas wrote:
> > > > > > While this class includes all atomics that currently require write
> > > > > > permission, there's some unallocated space in this range and we don't
> > > > > > know what future architecture versions may introduce. Unfortunately we
> > > > > > need to check each individual atomic op in this class (not sure what the
> > > > > > overhead will be).
> > > > > 
> > > > > Can you tell us which bits or pattern is not allocated? Maybe we can exclude
> > > > > that from the pattern.
> > > > 
> > > > Yes, it may be easier to exclude those patterns. See the Arm ARM K.a
> > > > section C4.1.94.29 (page 791).
> > > 
> > > Hmmm. We could consult an exception table once the pattern matches to reduce
> > > the overhead.
> > 
> > Yeah, check the atomic class first and then go into the finer-grained
> > details. I think this would reduce the overhead for non-atomic
> > instructions.
> 
> If I read the instruction encoding correctly, the unallocated instructions
> are decided by the below fields:
> 
>   - size
>   - VAR
>   - o3
>   - opc
> 
> To exclude them I think we can do something like:
> 
> if atomic instructions {
>     if V == 1
>         return false;
>     if o3 opc == 111x
>         return false;
>     switch VAR {
>         000
>             check o3 and opc
>         001
>             check o3 and opc
>         010
>             check o3 and opc
>         011
>             check o3 and opc
>         default
>             if size != 11
>                 check o3 and opc
>     }
> }
> 
> So the worst case may take 4 branches plus one for each possible
> unallocated combo of o3 and opc. I saw 5 different combos of o3 and opc,
> so 9 branches in the worst case.

Or we have a sorted table of exclusions and do a binary search. Not sure
which one is faster.
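
A minimal userspace sketch of the sorted-table idea, for illustration only. The
key layout (size, VAR, o3, opc packed into one integer), the names pack_key()
and key_is_excluded(), and the table entries are all assumptions; the entries
are placeholders, not the real unallocated encodings from the Arm ARM. It uses
the C library bsearch(); in-kernel code would need the kernel's own helper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical key: size(2 bits) | VAR(3 bits) | o3(1 bit) | opc(3 bits).
 * The caller is assumed to have extracted these fields already.
 */
static uint16_t pack_key(unsigned size, unsigned var, unsigned o3, unsigned opc)
{
	assert(size < 4 && var < 8 && o3 < 2 && opc < 8);
	return (uint16_t)((size << 7) | (var << 4) | (o3 << 3) | opc);
}

/*
 * PLACEHOLDER exclusions, sorted ascending so bsearch() works.
 * These are NOT the real unallocated encodings.
 */
static const uint16_t excluded_keys[] = {
	0x00e, 0x00f, 0x01e, 0x01f, 0x1ce,
};

/* Three-way comparison of two keys for bsearch(). */
static int cmp_key(const void *a, const void *b)
{
	uint16_t x = *(const uint16_t *)a;
	uint16_t y = *(const uint16_t *)b;

	return (x > y) - (x < y);
}

/* Return true if the packed key appears in the exclusion table. */
static bool key_is_excluded(uint16_t key)
{
	size_t n = sizeof(excluded_keys) / sizeof(excluded_keys[0]);

	return bsearch(&key, excluded_keys, n, sizeof(excluded_keys[0]),
		       cmp_key) != NULL;
}
```

With a handful of entries the search takes at most three comparisons, so
whether it beats a hand-written switch would have to be measured.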

> But if they will be allocated to non-atomic instructions, we have to do
> fine-grained decoding, but it may be easier since we can just filter out
> those non-atomic instructions? Anyway it depends on how they will be used.
> Hopefully this won't happen.

Actually, the atomics table already has LD64B and LDAPR, which are load
instructions and need no write permission. So we need to exclude these
and all the unallocated space in this range.
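
A rough sketch of the two-stage check: match the atomic class first, then
exclude the loads. The mask/value constants and the LDAPR pattern are my
reading of the encoding and should be verified against the Arm ARM; the
function names are hypothetical. LD64B, the V == 1 case and the unallocated
encodings are deliberately left out here and would need their own checks.

```c
#include <stdbool.h>
#include <stdint.h>

/* ASSUMED mask/value for the atomic memory operations class. */
#define ATOMIC_CLASS_MASK	0x3b200c00u
#define ATOMIC_CLASS_VAL	0x38200000u

/* ASSUMED pattern for LDAPR with the size field masked out. */
#define LDAPR_MASK		0x3ffffc00u
#define LDAPR_VAL		0x38bfc000u

/* Stage 1: cheap class test that rejects most instructions early. */
static bool insn_is_atomic_class(uint32_t insn)
{
	return (insn & ATOMIC_CLASS_MASK) == ATOMIC_CLASS_VAL;
}

/* Stage 2: loads in the atomic class that need no write permission. */
static bool insn_is_excluded_load(uint32_t insn)
{
	return (insn & LDAPR_MASK) == LDAPR_VAL;
}

/* True only for class members that are not on the exclusion list. */
static bool insn_wants_write_fault(uint32_t insn)
{
	if (!insn_is_atomic_class(insn))
		return false;
	return !insn_is_excluded_load(insn);
}
```

Non-atomic instructions only pay for the first mask-and-compare; the
finer-grained checks run for class members only.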

-- 
Catalin
