Message-ID: <c424e8a2-a771-e738-396c-24ac907b557f@redhat.com>
Date: Thu, 12 May 2022 09:28:40 +0200
From: David Hildenbrand <david@...hat.com>
To: HORIGUCHI NAOYA(堀口 直也)
<naoya.horiguchi@....com>
Cc: Miaohe Lin <linmiaohe@...wei.com>,
Oscar Salvador <osalvador@...e.de>,
Naoya Horiguchi <naoya.horiguchi@...ux.dev>,
"linux-mm@...ck.org" <linux-mm@...ck.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Mike Kravetz <mike.kravetz@...cle.com>,
Yang Shi <shy828301@...il.com>,
Muchun Song <songmuchun@...edance.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [RFC PATCH v1 0/4] mm, hwpoison: improve handling workload
related to hugetlb and memory_hotplug
>>>>
>>>> Once the problematic DIMM would actually get unplugged, the memory block devices
>>>> would get removed as well. So when hotplugging a new DIMM in the same
>>>> location, we could online that memory again.
>>>
>>> What about PG_hwpoison flags? struct pages are also freed and reallocated
>>> in the actual DIMM replacement?
>>
>> Once memory is offline, the memmap is stale and is no longer
>> trustworthy. It gets reinitialized during memory onlining -- so any
>> previous PG_hwpoison is overridden at least there. In some setups, we
>> even poison the whole memmap via page_init_poison() during memory offlining.
>>
>> Apart from that, we should be freeing the memmap in all relevant cases
>> when removing memory. I remember there are a couple of corner cases, but
>> we don't really have to care about that.
>
> OK, so there seems to be no need to manipulate struct pages for hwpoison
> in all relevant cases.
Right. When offlining a memory block, all we have to do is check whether
we stumbled over a hwpoisoned page and remember that inside the memory
block. Refusing to online the block again is then easy.
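
To illustrate, here is a rough userspace sketch of that bookkeeping. It
is not kernel code: struct sketch_page, struct sketch_memory_block, the
saw_hwpoison field, both sketch_* functions and the -EIO return value are
made up for this example and only model the idea. The point is that the
flag lives in the memory block, so it survives even though the memmap is
stale after offlining and gets reinitialized on onlining.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page; only models PG_hwpoison. */
struct sketch_page {
	bool hwpoison;
};

/* Hypothetical stand-in for a memory block device. */
struct sketch_memory_block {
	struct sketch_page *pages;	/* models the block's memmap */
	size_t nr_pages;
	bool online;
	bool saw_hwpoison;		/* assumption: remembered per block */
};

/* Offline the block and remember whether any page was hwpoisoned. */
static int sketch_offline_block(struct sketch_memory_block *blk)
{
	assert(blk != NULL);
	if (!blk->online)
		return -EINVAL;

	for (size_t i = 0; i < blk->nr_pages; i++) {
		if (blk->pages[i].hwpoison) {
			blk->saw_hwpoison = true;
			break;
		}
	}
	blk->online = false;
	return 0;
}

/* Refuse to online a block that contained a hwpoisoned page. */
static int sketch_online_block(struct sketch_memory_block *blk)
{
	assert(blk != NULL);
	if (blk->online)
		return -EINVAL;
	if (blk->saw_hwpoison)
		return -EIO;	/* error code chosen for illustration only */

	/* Models memmap reinitialization: old per-page state is dropped. */
	for (size_t i = 0; i < blk->nr_pages; i++)
		blk->pages[i].hwpoison = false;
	blk->online = true;
	return 0;
}
```

In this model, replacing the DIMM corresponds to the old block object
going away and a fresh one (saw_hwpoison == false) being created, which
can then be onlined normally.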
--
Thanks,
David / dhildenb