linux-kernel - Re: [v5 PATCH] arm64: mm: force write fault for atomic RMW instructions

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <e8497b6b-c85a-4b11-9c0f-01528c82cb92@os.amperecomputing.com>
Date: Tue, 13 Aug 2024 10:09:04 -0700
From: Yang Shi <yang@...amperecomputing.com>
To: Catalin Marinas <catalin.marinas@....com>
Cc: "Christoph Lameter (Ampere)" <cl@...two.org>, will@...nel.org,
 anshuman.khandual@....com, david@...hat.com, scott@...amperecomputing.com,
 linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org
Subject: Re: [v5 PATCH] arm64: mm: force write fault for atomic RMW
 instructions



On 7/11/24 11:17 AM, Yang Shi wrote:
>
>
> On 7/11/24 10:43 AM, Catalin Marinas wrote:
>> On Wed, Jul 10, 2024 at 11:43:18AM -0700, Yang Shi wrote:
>>> On 7/10/24 2:22 AM, Catalin Marinas wrote:
>>>>> diff --git a/arch/arm64/mm/mmap.c b/arch/arm64/mm/mmap.c
>>>>> index 642bdf908b22..d30265d424e4 100644
>>>>> --- a/arch/arm64/mm/mmap.c
>>>>> +++ b/arch/arm64/mm/mmap.c
>>>>> @@ -19,7 +19,7 @@ static pgprot_t protection_map[16] 
>>>>> __ro_after_init = {
>>>>>           [VM_WRITE] = PAGE_READONLY,
>>>>>           [VM_WRITE | VM_READ] = PAGE_READONLY,
>>>>>           /* PAGE_EXECONLY if Enhanced PAN */
>>>>> -        [VM_EXEC]                                       = 
>>>>> PAGE_READONLY_EXEC,
>>>>> +        [VM_EXEC]                                       = 
>>>>> PAGE_EXECONLY,
>>>>>           [VM_EXEC | VM_READ] = PAGE_READONLY_EXEC,
>>>>>           [VM_EXEC | VM_WRITE] = PAGE_READONLY_EXEC,
>>>>>           [VM_EXEC | VM_WRITE | VM_READ] = PAGE_READONLY_EXEC,
>>>> In theory you'd need to change the VM_SHARED | VM_EXEC entry as well.
>>>> Otherwise it looks fine.
>>> Thanks. I just ran the same benchmark. Ran the modified 
>>> page_fault1_thread
>>> (trigger read fault) in 100 iterations with 160 threads on 160 
>>> cores. This
>>> should be the worst contention case and collected the max data (worst
>>> latency). It shows the patch may incur ~30% overhead for exec-only 
>>> case. The
>>> overhead should just come from the permission fault.
>>>
>>>      N           Min           Max        Median           Avg Stddev
>>> x 100        163840        219083        184471        183262 12593.229
>>> + 100        211198        285947        233608     238819.98 15253.967
>>> Difference at 95.0% confidence
>>>      55558 +/- 3877
>>>      30.3161% +/- 2.11555%
>>>
>>> This is a very extreme benchmark, I don't think any real life 
>>> workload will
>>> spend that much time (sys vs user) in page fault, particularly read 
>>> fault.
>>>
>>> With my atomic fault benchmark (populate 1G memory with atomic 
>>> instruction
>>> then manipulate the value stored in the memory in 100 iterations so 
>>> the user
>>> time is much longer than sys time), I saw around 13% overhead on sys 
>>> time
>>> due to the permission fault, but no noticeable change for user and real
>>> time.
>> Thanks for running these tests.
>>
>>> So the permission fault does incur noticeable overhead for read 
>>> fault on
>>> exec-only, but it may be not that bad for real life workloads.
>> So you are saying 3 faults is not that bad for real life workloads but
>> the 2 faults behaviour we have currently is problematic for OpenJDK. For
>> the OpenJDK case, I don't think the faulting is the main issue affecting
>> run-time performance but, over a longer run (or with more iterations in
>> this benchmark after the faulting in), you'd notice the breaking up of
>> the initial THP pages.

Gently ping...

>
> I meant the extra permission fault for exec-only should be ok since 
> the current implementation can't force write fault for exec-only 
> anyway. It does incur noticeable overhead for read fault, but I'm not 
> aware of any real life workloads are sensitive to read fault. The 
> benchmark is for a very extreme worst case.
>
>>
>> Do you have any OpenJDK benchmarks that show the problem? It might be
>> worth running them with this patch:
>>
>> https://lore.kernel.org/r/rjudrmg7nkkwfgviqeqluuww6wu6fdrgdsfimtmpjee7lkz2ej@iosd2f6pk4f7 
>>
>>
>> Or, if not, do you see any difference in the user time in your benchmark
>> with the above mm patch? In subsequent iterations, linear accesses are
>> not ideal for testing. Better to have some random accesses this 1GB of
>> memory (after the initial touching). That would mimic heap accesses a
>> bit better.
>
> I didn't try that patch. I think we discussed this before. This patch 
> can remove the THP shattering, we should be able to see some 
> improvement, but TBLI is still needed and its cost should be still 
> noticeable because the write protection fault still happens.
>
>>
>> Anyway, as it stands, I don't think we should merge this patch since it
>> does incur an additional penalty with exec-only mappings and it would
>> make things even worse for OpenJDK if distros change their default
>> toolchain flags at some point to generate exec-only ELF text sections.
>> While we could work around this by allowing the kernel to read the
>> exec-only user mapping with a new flavour of get_user() (toggling PAN as
>> well), I don't think it's worth it. Especially with the above mm change,
>> I think the benefits of this patch aren't justified. Longer term, I hope
>> that customers upgrade to OpenJDK v22 or, for proprietary tools, they
>> either follow the madvise() approach or wait for the Arm architects to
>> change the hardware behaviour.
>
> If the overhead for exec-only is a concern, I think we can just skip 
> exec-only segment for now, right? The exec-only is not that popular 
> yet. And if the users prefer security, typically they may be not that 
> sensitive to performance.
>
>>
>