linux-kernel - Re: [PATCH] arm64: mm: force write fault for atomic RMW instructions

Message-ID: <c1ba9ba3-b0d6-4c6c-d628-614751d737c2@gentwo.org>
Date: Wed, 8 May 2024 10:15:28 -0700 (PDT)
From: "Christoph Lameter (Ampere)" <cl@...two.org>
To: Anshuman Khandual <anshuman.khandual@....com>
cc: Yang Shi <yang@...amperecomputing.com>, catalin.marinas@....com, 
    will@...nel.org, scott@...amperecomputing.com, 
    linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] arm64: mm: force write fault for atomic RMW
 instructions

On Wed, 8 May 2024, Anshuman Khandual wrote:

>> The atomic RMW instructions, for example, ldadd, actually does load +
>> add + store in one instruction, it may trigger two page faults, the
>> first fault is a read fault, the second fault is a write fault.
>
> It may or it will definitely create two consecutive page faults. What
> if the second write fault never came about. In that case an writable
> page table entry would be created unnecessarily (or even wrongfully),
> thus breaking the CoW.

An atomic RMW will always perform a write, won't it? If there is a read
fault, then a write fault will follow.

>> Some applications use atomic RMW instructions to populate memory, for
>> example, openjdk uses atomic-add-0 to do pretouch (populate heap memory
>
> But why cannot normal store operation is sufficient for pre-touching
> the heap memory, why read-modify-write (RMW) is required instead ?

Sure, a regular write operation is sufficient, but you would have to modify
existing applications to get that done. x86 does not take a read fault on
atomics, so we have an issue there.
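
For illustration only, here is a minimal userspace sketch of the two
pretouch styles being compared. It is not taken from openjdk or from the
patch; the function names and the page_size parameter are hypothetical.
Whether the atomic add compiles to ldadd depends on the compiler and
target flags.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/*
 * Pretouch with an atomic add of 0: a read-modify-write that leaves the
 * existing contents unchanged, so it is safe on memory that may already
 * hold data. Returns the number of pages touched.
 * (Hypothetical helper, not the openjdk implementation.)
 */
static size_t pretouch_atomic_add0(void *base, size_t len, size_t page_size)
{
    size_t touched = 0;

    assert(page_size >= sizeof(int));
    for (size_t off = 0; off + sizeof(int) <= len; off += page_size) {
        _Atomic int *p = (_Atomic int *)((char *)base + off);

        atomic_fetch_add_explicit(p, 0, memory_order_relaxed);
        touched++;
    }
    return touched;
}

/*
 * Pretouch with a plain store: only ever a write access, but it
 * overwrites the first word of every page, so it is only usable on
 * memory known to hold no live data.
 */
static size_t pretouch_plain_store(void *base, size_t len, size_t page_size)
{
    size_t touched = 0;

    assert(page_size >= sizeof(int));
    for (size_t off = 0; off + sizeof(int) <= len; off += page_size) {
        volatile int *p = (volatile int *)((char *)base + off);

        *p = 0;
        touched++;
    }
    return touched;
}
```

Either function would typically be called on a freshly mmap()ed heap
range with the system page size. The sketch shows why switching an
existing application from the first style to the second is a source
change, and one that alters what happens to data already in the range.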

> If the memory address has some valid data, it must have already reached there
> via a previous write access, which would have caused initial CoW transition ?
> If the memory address has no valid data to begin with, why even use RMW ?

Because the application can reasonably assume that all uninitialized data 
is zero and therefore it is not necessary to have a prior write access.
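
As a small sketch of that assumption (hypothetical function name, Linux
userspace, not part of the patch): a fresh anonymous private mapping
reads back as zero, so the very first access to a page can be an atomic
RMW with no prior store.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Map fresh anonymous memory and make the first access to it an atomic
 * increment. Returns 1 if the counter started at zero and now reads 1,
 * 0 if not, and -1 if mmap() failed.
 */
static int fresh_mapping_counts_from_zero(size_t len)
{
    assert(len >= sizeof(long));

    void *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return -1;

    _Atomic long *counter = buf;
    long before = atomic_fetch_add(counter, 1); /* first touch is an RMW */
    long after = atomic_load(counter);

    munmap(buf, len);
    return before == 0 && after == 1;
}
```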

>> Some other architectures also have code inspection in page fault path,
>> for example, SPARC and x86.
>
> Okay, I was about to ask, but is not calling get_user() for all data
> read page faults increase the cost for a hot code path in general for
> some potential savings for a very specific use case. Not sure if that
> is worth the trade-off.

The instruction is cache hot, since it must already be present in the CPU
cache for the fault to have happened. So the overhead is minimal.
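
To give a rough idea of what such an inspection costs once the
instruction word has been fetched, here is a sketch of a mask-and-compare
check. It is not the check from the patch. The macro and function names
are hypothetical, and the mask is an assumption based on one reading of
the A64 "atomic memory operations" encoding, restricted to the LD<op>
forms (ldadd, ldclr, ldeor, ldset and the min/max variants); verify it
against the Arm Architecture Reference Manual before relying on it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumed layout: size[31:30] 111 V 00 A R 1 Rs o3 opc 00 Rn Rt.
 * Fixed bits checked here: [29:27]=111, V[26]=0, [25:24]=00, [21]=1,
 * o3[15]=0, [11:10]=00. Requiring o3=0 leaves out swp and the other
 * encodings that share this class.
 */
#define LD_OP_ATOMIC_MASK  0x3F208C00u
#define LD_OP_ATOMIC_VALUE 0x38200000u

static_assert((LD_OP_ATOMIC_VALUE & ~LD_OP_ATOMIC_MASK) == 0,
              "value bits must lie inside the mask");

/* Hypothetical helper: true if insn looks like an LD<op> atomic RMW. */
static bool insn_is_ld_op_atomic(uint32_t insn)
{
    return (insn & LD_OP_ATOMIC_MASK) == LD_OP_ATOMIC_VALUE;
}
```

Under these assumptions the classification is one AND and one compare;
the remaining cost in a fault handler would be reading the instruction
itself, which is the part argued above to be cache hot.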