lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZokSpUr16xmfY8Z6@arm.com>
Date: Sat, 6 Jul 2024 10:47:17 +0100
From: Catalin Marinas <catalin.marinas@....com>
To: "Christoph Lameter (Ampere)" <cl@...two.org>
Cc: Yang Shi <yang@...amperecomputing.com>, will@...nel.org,
	anshuman.khandual@....com, david@...hat.com,
	scott@...amperecomputing.com, linux-arm-kernel@...ts.infradead.org,
	linux-kernel@...r.kernel.org
Subject: Re: [v5 PATCH] arm64: mm: force write fault for atomic RMW
 instructions

On Fri, Jul 05, 2024 at 11:51:33AM -0700, Christoph Lameter (Ampere) wrote:
> On Fri, 5 Jul 2024, Catalin Marinas wrote:
> > People will be happy until one enables execute-only ELF text sections in
> > a distro and all that opcode parsing will add considerable overhead for
> > many read faults (those with a writeable vma).
> 
> The opcode is in the l1 cache since we just faulted on it. There is no
> "considerable" overhead.

This has nothing to do with caches (and many Arm implementations still
have separate I and D caches). With the right linker flags (e.g.
--execute-only for lld), one can generate a PROT_EXEC only (no
PROT_READ) ELF text section. On newer Arm CPUs with FEAT_EPAN, the
kernel no longer forces PROT_READ on PROT_EXEC only mappings. The
get_user() in this patch to read the opcode will fault. So instead of
two faults you get now for an atomic instruction, you'd get three (the
additional one for opcode reading). What's worse, it affects standard
loads as well in the same way.

Yang Shi did test this scenario but for correctness only, not
performance. It would be good to recompile the benchmark with
--execute-only (or --rosegment I think for gnu ld) and see post the
results.

> > Just to be clear, there are still potential issues to address (or
> > understand the impact of) in this patch with exec-only mappings and
> > the performance gain _after_ the THP behaviour changed in the mm code.
> > We can make a call once we have more data but, TBH, my inclination is
> > towards 'no' given that OpenJDK already support madvise() and it's not
> > arm64 specific.
> 
> It is arm64 specific. Other Linux architectures have optimizations for
> similar issues in their arch code as mentioned in the patch or the
> processors will not double fault.
> 
> Is there a particular reason for ARM as processor manufacturer to oppose
> this patch? We have mostly hand waving and speculations coming from you
> here.

Arm Ltd has no involvement at all in this decision (and probably if you
ask the architects, they wouldn't see any problem). Even though I have
an arm.com email address, my opinions on the list are purely from a
kernel maintainer perspective.

There's no speculation but some real concerns here. Please see above.

> What the patch does is clearly beneficial and it is an established way of
> implementing read->write fault handling.

It is clearly beneficial for this specific case but, as I said, we still
need to address the execute-only mappings causing an additional fault on
the opcode reading. You may not find many such binaries now in the field
but there's a strong push from security people to enable it (it's a
user-space decisions, the kernel simply provides PROT_EXEC only
mappings).

In addition, there's a 24% performance overhead in one of Yang Shi's
results. This has not been explained.

-- 
Catalin

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ