Message-ID: <936f8adf-a8a5-45eb-b5a3-297773918f7c@redhat.com>
Date: Wed, 7 Aug 2024 17:23:48 +0200
From: David Hildenbrand <david@...hat.com>
To: Pasha Tatashin <pasha.tatashin@...een.com>
Cc: agordeev@...ux.ibm.com, akpm@...ux-foundation.org,
 alexghiti@...osinc.com, aou@...s.berkeley.edu, ardb@...nel.org,
 arnd@...db.de, bhe@...hat.com, bjorn@...osinc.com,
 borntraeger@...ux.ibm.com, bp@...en8.de, catalin.marinas@....com,
 chenhuacai@...nel.org, chenjiahao16@...wei.com, christophe.leroy@...roup.eu,
 dave.hansen@...ux.intel.com, dawei.li@...ngroup.cn,
 gerald.schaefer@...ux.ibm.com, gor@...ux.ibm.com, hca@...ux.ibm.com,
 hpa@...or.com, kent.overstreet@...ux.dev, kernel@...0n.name,
 linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
 linux-mm@...ck.org, linuxppc-dev@...ts.ozlabs.org,
 linux-riscv@...ts.infradead.org, linux-s390@...r.kernel.org,
 loongarch@...ts.linux.dev, luto@...nel.org, maobibo@...ngson.cn,
 mark.rutland@....com, mcgrof@...nel.org, mingo@...hat.com,
 mpe@...erman.id.au, muchun.song@...ux.dev, namcao@...utronix.de,
 naveen@...nel.org, npiggin@...il.com, osalvador@...e.de, palmer@...belt.com,
 paul.walmsley@...ive.com, peterz@...radead.org, philmd@...aro.org,
 rdunlap@...radead.org, rientjes@...gle.com, rppt@...nel.org,
 ryan.roberts@....com, souravpanda@...gle.com, svens@...ux.ibm.com,
 tglx@...utronix.de, tzimmermann@...e.de, will@...nel.org, x86@...nel.org
Subject: Re: [PATCH 2/2] mm: keep nid around during hot-remove

On 07.08.24 16:40, Pasha Tatashin wrote:
> On Wed, Aug 7, 2024 at 7:50 AM David Hildenbrand <david@...hat.com> wrote:
>>
>> On 07.08.24 13:32, David Hildenbrand wrote:
>>> On 07.08.24 00:14, Pasha Tatashin wrote:
>>>> nid is needed during memory hot-remove in order to account the
>>>> information about the memmap overhead that is being removed.
>>>>
>>>> In addition, we cannot use page_pgdat(pfn_to_page(pfn)) during
>>>> hotremove after remove_pfn_range_from_zone().
>>>>
>>>> We also cannot determine nid from walking through memblocks after
>>>> remove_memory_block_devices() is called.
>>>>
>>>> Therefore, pass nid down from the beginning of hotremove to where
>>>> it is used for the accounting purposes.
>>>
>>> I was happy to finally remove that nid parameter for good in:
>>>
>>> commit 65a2aa5f482ed0c1b5afb9e6b0b9e0b16bb8b616
>>> Author: David Hildenbrand <david@...hat.com>
>>> Date:   Tue Sep 7 19:55:04 2021 -0700
>>>
>>>        mm/memory_hotplug: remove nid parameter from arch_remove_memory()
>>>
>>> To ask the real question: Do we really need this counter per-nid at all?
>>>
>>> Seems to over-complicate things.
>>
>> Case in point: I think the handling is wrong?
>>
>> Just because some memory belongs to a nid doesn't mean that the vmemmap
>> was allocated from that nid?
> 
> I believe when we hot-add we use nid for the memory that is being
> added to account vmemmap, and when we do hot-remove we also use nid of
> the memory that is being removed. But, you are correct, this does not
> guarantee that the actual vmemmap memory is being allocated or removed
> from the given nid.

Right. For boot memory that we might want to unplug later it might be 
different. I recall that with "movable_node", we might end up allocating 
the vmemmap from remote nodes, such that all memory of a node stays 
movable. That's why __earlyonly_bootmem_alloc() ends up calling 
memblock_alloc_try_nid_raw(), to fall back to other nodes if required.

> 
>> Wouldn't we want to look at the actual nid the vmemmap page belongs to
>> that we are removing?
> 
> I am now looking into converting this counter to be system wide, i.e.
> vm_event, it is all done under hotplug lock, so there is no
> contention.

That would be easiest, assuming per-node information is not strictly 
required for now.

-- 
Cheers,

David / dhildenb

