Message-ID: <1662085A-4536-4020-957D-90FB262C6014@nvidia.com>
Date: Wed, 14 May 2025 11:49:29 -0400
From: Zi Yan <ziy@...dia.com>
To: David Hildenbrand <david@...hat.com>
Cc: linux-kernel@...r.kernel.org, linux-mm@...ck.org,
virtualization@...ts.linux.dev, "Michael S. Tsirkin" <mst@...hat.com>,
Jason Wang <jasowang@...hat.com>, Xuan Zhuo <xuanzhuo@...ux.alibaba.com>,
Eugenio Pérez <eperezma@...hat.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Oscar Salvador <osalvador@...e.de>, Vlastimil Babka <vbabka@...e.cz>,
Suren Baghdasaryan <surenb@...gle.com>, Michal Hocko <mhocko@...e.com>,
Brendan Jackman <jackmanb@...gle.com>, Johannes Weiner <hannes@...xchg.org>,
"Matthew Wilcox (Oracle)" <willy@...radead.org>
Subject: Re: [PATCH v1 0/2] mm/memory_hotplug: introduce and use
PG_offline_skippable
On 14 May 2025, at 10:12, David Hildenbrand wrote:
> On 14.05.25 15:45, Zi Yan wrote:
>> On 14 May 2025, at 7:15, David Hildenbrand wrote:
>>
>>> This is a requirement for making PageOffline pages not have a refcount
>>> in the long future ("frozen"), and for reworking non-folio page migration
>>> in the near future.
>>>
>>> I have patches mostly ready to go to handle the latter. For turning all
>>> PageOffline() pages frozen, the non-folio page migration and memory
>>> ballooning drivers will have to be reworked first, to no longer rely on
>>> the refcount of PageOffline pages.
>>>
>>> Introduce PG_offline_skippable that only applies to PageOffline() pages --
>>> of course, reusing one of the existing PG_ flags for now -- and convert
>>> virtio-mem to make use of the new way: to allow for skipping PageOffline
>>> pages during memory offlining, treating them as if they would not be
>>> allocated.
>>
>
> Thanks for taking a look!
>
>> IIUC, based on Documentation/admin-guide/mm/memory-hotplug.rst,
>> to offline a page, the page first needs to be set PageOffline() to be
>
> PageOffline is not mentioned in there. :)
Sorry, I was mixing the code with the documentation as I was reading
both.
>
> Note that PageOffline() is a bit confusing because it's "Memory block online but page is logically offline (e.g., has a memmap that can be touched, but the page content should not be touched)".
So PageOffline() applies before the memory block goes offline, which is the
first phase of memory hotunplug.
>
> (memory block offline -> all pages offline and have effectively no state because the memmap is stale)
What do you mean by the memmap being stale? When a memory block is offline, the
memmap is still present, so a pfn scanner can see these pages. The pfn scanner
checks the memmap to know that it should not touch these pages, right?
>
>> removed from page allocator.
>
> Usually, all pages are freed back to the buddy (isolated pageblock -> put onto the isolated list). Memory offlining code can then simply grab these "free" pages from the buddy -- no PageOffline involved.
>
> If something fails during memory offlining, these isolated pages are simply put back on the appropriate migratetype list and become ordinary free pages that can be allocated immediately.
I am familiar with this part. Then, when is PageOffline used?
From the comment in page-flags.h, I see two examples: pages inflated by a balloon driver
and pages not yet onlined when onlining the section. These are two different operations:
1) inflated pages are going to be offline, 2) not-onlined pages are going to be
online. But you mentioned above that the memory offlining code does not involve
PageOffline, so inflating pages in a balloon driver is not part of the memory offlining
code, but a different way of offlining pages. Am I getting it right?
I read a little bit more on memory ballooning and virtio-mem and understand
that memory ballooning still keeps the inflated page but the guest cannot allocate
and use it, whereas virtio-mem and memory hotunplug remove the page from
Linux completely (i.e., Linux no longer sees the memory).
It seems that I am mixing up memory offlining and memory hotunplug. IIUC,
memory offlining means no one can allocate and use the offlined memory, but
Linux still sees it; memory hotunplug means Linux no longer sees it (no related
memmap and other metadata). Am I getting it right?
>
> Some PageOffline pages can be migrated using the non-folio migration: this is done for memory ballooning (memory compaction). As they get migrated, they are freed back to the buddy, PageOffline() is cleared -- they become PageBuddy() -- and the above applies.
After a PageOffline page is migrated, the destination page becomes PageOffline, right?
OK, I see it in balloon_page_insert().
>
> Other PageOffline pages can be skipped during memory offlining (virtio-mem use case, what we are doing here). We don't want them to ever go through the buddy, especially because if memory offlining fails they must definitely not be treated like free pages that can be allocated immediately.
What do you mean by "skipped during memory offlining"? Are you implying that when
virtio-mem is offlining some pages by marking them PageOffline and PG_offline_skippable,
someone else can do memory offlining in parallel?
>
>> Next, the page is removed from its memory
>> block. When will PG_offline_skippable be used? The second phase when
>> the page is being removed from its memory block?
>
> PG_offline_skippable is used during memory offlining, while we look for any pages that are not PageBuddy (... or hwpoisoned ...), to migrate them off the memory so they get converted to PageBuddy.
>
> PageOffline + PageOfflineSkippable are checked on that phase, such that they don't require any migration.
Hmm, if you just do not want PageOffline pages to be migrated, not marking them
__PageMovable would work, right? PageOffline + __PageMovable is used by
ballooning, as these inflated pages can be migrated. PageOffline without
__PageMovable should be virtio-mem. Am I missing any other user?
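To make the cases discussed above concrete, here is a minimal user-space sketch of how an offlining scan could classify a page. It is only a model, not kernel code: the MODEL_* flags, enum scan_action, and classify_page() are hypothetical names invented for illustration. Two points are assumptions rather than something stated in this thread: that a PageOffline page which is neither skippable nor movable blocks offlining, and that every other in-use page is simply a migration candidate.

```c
#include <stdbool.h>

/* Hypothetical model flags; these are NOT the kernel's PG_* bits. */
enum model_flag {
    MODEL_BUDDY             = 1u << 0, /* free page sitting in the buddy */
    MODEL_OFFLINE           = 1u << 1, /* logically offline (PageOffline) */
    MODEL_OFFLINE_SKIPPABLE = 1u << 2, /* may be skipped while offlining */
    MODEL_MOVABLE           = 1u << 3, /* supports non-folio migration */
};

/* What the (modelled) offlining scan does with one page. */
enum scan_action {
    SCAN_TAKE_FREE, /* already free: grab it from the buddy */
    SCAN_SKIP,      /* treat as if not allocated, no migration needed */
    SCAN_MIGRATE,   /* migrate it off so the range ends up PageBuddy */
    SCAN_FAIL,      /* assumption: cannot be handled, offlining fails */
};

static enum scan_action classify_page(unsigned int flags)
{
    if (flags & MODEL_BUDDY)
        return SCAN_TAKE_FREE;

    if (flags & MODEL_OFFLINE) {
        /* virtio-mem case from this series: skip, never via the buddy. */
        if (flags & MODEL_OFFLINE_SKIPPABLE)
            return SCAN_SKIP;
        /* Balloon case: inflated pages can be migrated. */
        if (flags & MODEL_MOVABLE)
            return SCAN_MIGRATE;
        /* Assumption: neither skippable nor movable blocks offlining. */
        return SCAN_FAIL;
    }

    /* Simplification: every other in-use page is a migration candidate. */
    return SCAN_MIGRATE;
}
```

In this model, classify_page(MODEL_OFFLINE | MODEL_OFFLINE_SKIPPABLE) yields SCAN_SKIP, while classify_page(MODEL_OFFLINE | MODEL_MOVABLE) yields SCAN_MIGRATE. That is the distinction the question above is about: whether the absence of the movable marker alone could stand in for an explicit skippable flag.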
--
Best Regards,
Yan, Zi