linux-kernel - Re: [PATCH 0/7] riscv: Memory Hot(Un)Plug support


Message-ID: <c7e3152b-4fdd-cb93-a4f0-06502eab59b1@redhat.com>
Date:   Mon, 22 May 2023 10:21:13 +0200
From:   David Hildenbrand <david@...hat.com>
To:     Björn Töpel <bjorn@...nel.org>,
        Paul Walmsley <paul.walmsley@...ive.com>,
        Palmer Dabbelt <palmer@...belt.com>,
        Albert Ou <aou@...s.berkeley.edu>,
        linux-riscv@...ts.infradead.org,
        Anshuman Khandual <anshuman.khandual@....com>
Cc:     Björn Töpel <bjorn@...osinc.com>,
        linux-kernel@...r.kernel.org, linux-mm@...ck.org,
        Oscar Salvador <osalvador@...e.de>,
        virtualization@...ts.linux-foundation.org, linux@...osinc.com,
        Alexandre Ghiti <alexghiti@...osinc.com>
Subject: Re: [PATCH 0/7] riscv: Memory Hot(Un)Plug support

On 21.05.23 11:15, Björn Töpel wrote:
> Hi David and Anshuman!
> 
> Björn Töpel <bjorn@...nel.org> writes:
> 
>> David Hildenbrand <david@...hat.com> writes:
>>
>>> On 12.05.23 16:57, Björn Töpel wrote:
>>>> From: Björn Töpel <bjorn@...osinc.com>
>>>>
>>>> Memory Hot(Un)Plug support for the RISC-V port
>>>> ==============================================
>>
>> [...]
>>
>>>
>>> Cool stuff! I'm fairly busy right now, so some high-level questions upfront:
>>
>> No worries, and no rush! I'd say the v1 series was mainly for the RISC-V
>> folks, and I've got tons of (offline) comments from Alex -- and with
>> your comments below some more details to figure out.
> 
> One of the major issues with my v1 patch is around init_mm page table
> synchronization, and that'll be part of the v2.
> 
> I've noticed there's quite a difference between x86-64 and arm64 in
> terms of locking when updating (add/remove) the init_mm table. x86-64
> uses the usual page table locking mechanisms (used by the generic
> kernel functions), whereas arm64 does not.
> 
> How does arm64 manage to mix the "lock-less" updates (READ/WRITE_ONCE,
> and fences in set_p?d+friends) with the generic kernel ones that use
> the regular page locking mechanism?
> 
> I'm obviously missing something about the locking rules for memory hot
> add/remove... I've been reading the arm64 memory hot add/remove
> series, but none the wiser! ;-)

In general, memory hot(un)plug is serialized at a high level using the
mem_hotplug_lock. For example, in pagemap_range() or in
add_memory_resource(), we grab that lock in write mode. So we'll never
see memory getting added to/removed from the direct map concurrently.

From what I recall, the locking on the arch level is required for
concurrent (direct mapping) page table modifications that target virtual 
address ranges adjacent to the ranges we hot(un)plug:
CONFIG_ARCH_HAS_SET_DIRECT_MAP and vmalloc come to mind.

For example, if a range were mapped using a large PUD, but we had to
unplug it partially (unplugging memory that is part of bootmem), we'd
have to replace the large PUD with a PMD table first. That change
(which could affect other concurrent page table walkers/operations)
has to be synchronized.
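
A toy sketch of that split, assuming 512 entries per table and 2 MiB
per PMD entry. All model_* names, the struct layout and the "bit 0 =
valid" encoding are made up for illustration and do not match any
real architecture's page table format.

```c
/* Toy model of splitting a large PUD mapping; not kernel code. */
#include <stdint.h>
#include <stdlib.h>

#define MODEL_PTRS_PER_PMD	512		/* assumption */
#define MODEL_PMD_SIZE		(2ULL << 20)	/* assumption: 2 MiB */

struct model_pud {
	int is_leaf;		/* 1: one large mapping, 0: uses pmd_table */
	uint64_t phys;		/* base physical address of the large mapping */
	uint64_t *pmd_table;	/* MODEL_PTRS_PER_PMD entries, 0 = unmapped */
};

/* Replace a large leaf by a table that maps exactly the same range. */
static int model_split_pud(struct model_pud *pud)
{
	uint64_t *table;
	unsigned int i;

	if (!pud->is_leaf)
		return 0;	/* already split */

	table = calloc(MODEL_PTRS_PER_PMD, sizeof(*table));
	if (!table)
		return -1;

	for (i = 0; i < MODEL_PTRS_PER_PMD; i++)
		table[i] = (pud->phys + i * MODEL_PMD_SIZE) | 1; /* bit 0: valid */

	/*
	 * In a kernel, this is the update a concurrent page table walker
	 * could observe, so it is the step that needs synchronization.
	 */
	pud->pmd_table = table;
	pud->is_leaf = 0;
	return 0;
}

/* Unmap PMD entries [first, first + count), splitting the PUD if needed. */
static int model_unmap_pmds(struct model_pud *pud, unsigned int first,
			    unsigned int count)
{
	unsigned int i;

	if (first > MODEL_PTRS_PER_PMD || count > MODEL_PTRS_PER_PMD - first)
		return -1;
	if (model_split_pud(pud))
		return -1;

	for (i = first; i < first + count; i++)
		pud->pmd_table[i] = 0;
	return 0;
}
```

Unmapping two entries out of a large mapping leaves the other 510
entries pointing at the same physical ranges as before, but through a
table that did not exist a moment earlier.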

I guess the degree to which this applies to riscv depends on the
virtual memory layout, direct mapping granularity, and features (e.g.,
CONFIG_ARCH_HAS_SET_DIRECT_MAP).

One trick that arm64 implements is that it only allows hotunplugging
memory that was hotplugged (see prevent_bootmem_remove_notifier()). That
might just rule out the problematic cases that require locking
completely, leaving the high-level mem_hotplug_lock sufficient.
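
The idea behind such a notifier can be modeled in a few lines. This
is a simplified userspace sketch, not the arm64 implementation: the
enums, struct model_block and its from_boot flag are hypothetical
stand-ins, and the real notifier handles more than this one check.

```c
/* Simplified model of "refuse to offline boot memory"; not kernel code. */
#include <stdbool.h>

enum model_action { MODEL_GOING_OFFLINE, MODEL_OFFLINE };
enum model_verdict { MODEL_NOTIFY_OK, MODEL_NOTIFY_BAD };

struct model_block {
	bool from_boot;		/* hypothetical: present since boot? */
};

/* Veto an offline request if the block was not hotplugged at runtime. */
static enum model_verdict
model_prevent_bootmem_remove(enum model_action action,
			     const struct model_block *blk)
{
	if (action != MODEL_GOING_OFFLINE)
		return MODEL_NOTIFY_OK;	/* only the request itself is vetoed */

	return blk->from_boot ? MODEL_NOTIFY_BAD : MODEL_NOTIFY_OK;
}
```

With a check like this, boot memory (which may be mapped with large
entries) never reaches the unmap path, so only ranges that were
mapped at hotplug time are ever torn down.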

-- 
Thanks,

David / dhildenb