Message-ID: <606fbf9a-c9ba-4f08-a708-db38fe6065ce@redhat.com>
Date: Mon, 9 Dec 2024 11:56:12 +0100
From: David Hildenbrand <david@...hat.com>
To: Mateusz Guzik <mjguzik@...il.com>
Cc: yuzhao@...gle.com, akpm@...ux-foundation.org, willy@...radead.org,
linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCH] mm: remove an avoidable load of page refcount in page_ref_add_unless

On 09.12.24 11:25, Mateusz Guzik wrote:
> On Mon, Dec 9, 2024 at 10:28 AM David Hildenbrand <david@...hat.com> wrote:
>>
>> On 07.12.24 09:29, Mateusz Guzik wrote:
>>> Explicitly pre-checking the count adds nothing as atomic_add_unless
>>> starts with doing the same thing. iow no functional changes.
>>
>> I recall that we added that check because with the hugetlb vmemmap
>> optimization, some of the tail pages we don't ever expect to be modified
>> (because they are fake-duplicated) might be mapped R/O.
>>
>> If the arch implementation of atomic_add_unless() would trigger an
>> unconditional write fault, we'd be in trouble. That would likely only be
>> the case if the arch provides a dedicated instruction.
>>
>> atomic_add_unless()->raw_atomic_add_unless()
>>
>> Nobody currently defines arch_atomic_add_unless().
>>
>> raw_atomic_fetch_add_unless()->arch_atomic_fetch_add_unless() is defined
>> on some architectures.
>>
>> I scanned some of the inline-asm, and I think most of them perform a
>> check first.
>>
>
> Huh.
>
> Some arch triggering a write fault despite not changing the value is
> not something I thought about. Sounds pretty broken to me if any arch
> was to do it, but then stranger things did happen.

Yeah, it really depends on what the architecture defines. For example,
on s390x for "COMPARE AND SWAP" the spec states something like

"When the result of the comparison is unequal, the second operand is
loaded at the first-operand location, and the second-operand location
remains unchanged. However, on some models, the contents may be fetched
and subsequently stored back unchanged at the second-operand location.
This update appears to be a block-concurrent interlocked-update
reference as observed by other CPUs."

So there might be an unconditional store on an instruction where one
would not expect it.

Something similar-but-different recently popped up on aarch64, which
does what one would expect:

"The atomic RMW instructions, for example, ldadd, actually does load +
add + store in one instruction, it will trigger two page faults per the
ARM64 architecture spec, the first fault is a read fault, the second
fault is a write fault.

Some applications use atomic RMW instructions to populate memory, for
example, openjdk uses atomic-add-0 to do pretouch (populate heap memory
at launch time) between v18 and v22 in order to permit use of memory
concurrently with pretouch." [1]

And Christoph commented:

"x86 does not do a read fault on atomics so we have an issue here." [2]

I did not check if that is actually true on x86.
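
To make the atomic-add-0 pretouch from [1] concrete, here is a minimal
userspace C11 sketch of the access pattern. The function name, the
typed-buffer interface and the page-size parameter are assumptions made
for illustration; this is not the OpenJDK code. A compiler may also be
allowed to simplify an add of zero, so treat it as a picture of the
idea rather than a guaranteed way to force a write fault.

```c
#include <stdatomic.h>
#include <stddef.h>

/*
 * Hypothetical sketch: touch one word per page with an atomic add of 0.
 * The stored value never changes, but the read-modify-write is what is
 * meant to populate the page for writing. Returns the number of pages
 * touched. Assumes 'base' is page-aligned and 'page_bytes' is a multiple
 * of sizeof(atomic_int).
 */
static size_t sketch_pretouch(atomic_int *base, size_t len_bytes,
			      size_t page_bytes)
{
	size_t step = page_bytes / sizeof(atomic_int);
	size_t n = len_bytes / sizeof(atomic_int);
	size_t touched = 0;

	if (step == 0)
		return 0;	/* page smaller than one word: nothing sensible to do */

	for (size_t i = 0; i < n; i += step) {
		/* add 0: contents stay the same, other threads may use the memory */
		atomic_fetch_add_explicit(&base[i], 0, memory_order_relaxed);
		touched++;
	}

	return touched;
}
```

Because the add leaves every value intact, the buffer can be used
concurrently while it is being pretouched, which is the property the
quoted text describes.
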
>
> However, if this is seen as a real concern, then I think the best way
> forward is to drop the patch (and maybe instead add a comment what's
> up with the extra load).

I assume we're currently fine because no architecture actually defines
a dedicated add_unless instruction that would trigger an unconditional
write fault.
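
For illustration, the load-then-cmpxchg shape that the generic
add_unless fallback follows (as far as I understand it) can be sketched
in userspace C11 like this. The sketch_* names are hypothetical and this
is not the kernel source. Note that the early exit only guarantees that
no compare-and-swap is attempted once the forbidden value has been
observed; whether a failed compare-and-swap itself stores is up to the
architecture, as discussed above.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Hypothetical sketch: add 'a' to *v unless *v == u.
 * Returns the value observed before any update.
 */
static int sketch_fetch_add_unless(atomic_int *v, int a, int u)
{
	int c = atomic_load_explicit(v, memory_order_relaxed);

	do {
		if (c == u)
			break;	/* forbidden value seen: leave without attempting a store */
		/* on failure, c is refreshed with the current value and we retry */
	} while (!atomic_compare_exchange_weak(v, &c, c + a));

	return c;
}

/* Returns true if the add happened, false if *v was equal to u. */
static bool sketch_add_unless(atomic_int *v, int a, int u)
{
	return sketch_fetch_add_unless(v, a, u) != u;
}
```

With u == 0 this has the semantics page_ref_add_unless() needs: a
refcount of zero is never modified, and the initial load plays the same
role as the explicit pre-check the patch wants to remove.
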
[1] https://lore.kernel.org/linux-arm-kernel/Zmw1jltdkMrTrT_l@arm.com/T/
[2] https://lore.kernel.org/linux-arm-kernel/c1ba9ba3-b0d6-4c6c-d628-614751d737c2@gentwo.org/
--
Cheers,
David / dhildenb