Message-ID: <d0e94a6e-6296-495a-b10a-569d41a65adb@redhat.com>
Date: Wed, 14 May 2025 19:28:06 +0200
From: David Hildenbrand <david@...hat.com>
To: Zi Yan <ziy@...dia.com>
Cc: linux-kernel@...r.kernel.org, linux-mm@...ck.org,
 virtualization@...ts.linux.dev, "Michael S. Tsirkin" <mst@...hat.com>,
 Jason Wang <jasowang@...hat.com>, Xuan Zhuo <xuanzhuo@...ux.alibaba.com>,
 Eugenio Pérez <eperezma@...hat.com>,
 Andrew Morton <akpm@...ux-foundation.org>, Oscar Salvador
 <osalvador@...e.de>, Vlastimil Babka <vbabka@...e.cz>,
 Suren Baghdasaryan <surenb@...gle.com>, Michal Hocko <mhocko@...e.com>,
 Brendan Jackman <jackmanb@...gle.com>, Johannes Weiner <hannes@...xchg.org>,
 "Matthew Wilcox (Oracle)" <willy@...radead.org>
Subject: Re: [PATCH v1 0/2] mm/memory_hotplug: introduce and use
 PG_offline_skippable

>>
>> Note that PageOffline() is a bit confusing because it's "Memory block online but page is logically offline (e.g., has a memmap that can be touched, but the page content should not be touched)".
> 
> So PageOffline() is before memory block offline, which is the first phase of
> memory hotunplug.

Yes.

> 
>>
>> (memory block offline -> all pages offline and have effectively no state because the memmap is stale)
> 
> What do you mean by memmap is stale? When a memory block is offline, memmap is
> still present, so pfn scanner can see these pages. pfn scanner checks memmap
> to know that it should not touch these pages, right?

See pfn_to_online_page() for exactly that use case.

For an offline memory section (either because it was just added or 
because it was just offlined), the memmap is assumed to contain garbage 
and should not be touched.

See remove_pfn_range_from_zone() -> page_init_poison().
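
To illustrate the rule, here is a small userspace toy model (not kernel code). All toy_* names, the section size, and the 0xff poison value are assumptions made up for this sketch; only the idea that a pfn walker must check "is the section online?" before touching the memmap comes from the above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy model only: sizes and names are hypothetical, not the kernel's. */
#define TOY_PAGES_PER_SECTION 4
#define TOY_NR_SECTIONS       3

struct toy_page {
	unsigned long flags;
};

struct toy_section {
	bool online;	/* set when onlined, cleared when offlined */
	struct toy_page memmap[TOY_PAGES_PER_SECTION];
};

static struct toy_section toy_sections[TOY_NR_SECTIONS];

/*
 * Models the contract of pfn_to_online_page(): return the page only if
 * its section is online. For anything else return NULL without reading
 * the memmap, because it may contain garbage.
 */
static struct toy_page *toy_pfn_to_online_page(unsigned long pfn)
{
	unsigned long nr = pfn / TOY_PAGES_PER_SECTION;

	if (nr >= TOY_NR_SECTIONS || !toy_sections[nr].online)
		return NULL;
	return &toy_sections[nr].memmap[pfn % TOY_PAGES_PER_SECTION];
}

/* Models offlining a section: poison the memmap, then mark it offline. */
static void toy_offline_section(unsigned long nr)
{
	memset(toy_sections[nr].memmap, 0xff, sizeof(toy_sections[nr].memmap));
	toy_sections[nr].online = false;
}

/* A pfn scanner: counts the pages in [start, end) it is allowed to inspect. */
static size_t toy_scan(unsigned long start, unsigned long end)
{
	size_t inspected = 0;

	for (unsigned long pfn = start; pfn < end; pfn++) {
		if (toy_pfn_to_online_page(pfn))
			inspected++;
	}
	return inspected;
}
```

The scanner never looks at page flags to decide whether a page is usable; it relies on the section state alone, which is why stale memmap content is harmless.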

> 
>>
>>> removed from page allocator.
>>
>> Usually, all pages are freed back to the buddy (isolated pageblock -> put onto the isolated list). Memory offlining code can then simply grab these "free" pages from the buddy -- no PageOffline involved.
>>
>> If something fails during memory offlining, these isolated pages are simply put back on the appropriate migratetype list and become ordinary free pages that can be allocated immediately.
> 
> I am familiar with this part. Then, when is PageOffline used?
> 
> From the comment in page-flags.h, I see two examples: pages inflated by the balloon driver
> and not-yet-onlined pages when onlining the section. These are two different operations:
> 1) inflated pages are going to be offline, 2) not-onlined pages are going to be
> online. But you mentioned above that memory offlining code does not involve
> PageOffline, so inflating pages in the balloon driver is not part of the memory offlining
> code, but a different way of offlining pages. Am I getting it right?

Yes. PageOffline means logically offline, for whatever reason someone 
decides to turn pages logically offline.

Memory ballooning and virtio-mem are two users; there are more.

> 
> I read a little bit more on memory ballooning and virtio-mem and understand
> that memory ballooning still keeps the inflated page but guest cannot allocate
> and use it, whereas virtio-mem and memory hotunplug remove the page from
> Linux completely (i.e., Linux no longer sees the memory).

In virtio-mem terms, such pages are considered "fake offline" -- the 
memory behaves as if it had never been onlined, but there is a memmap 
for it. Like a (current) memory hole.

> 
> It seems that I am mixing memory offlining and memory hotunplug. IIUC,
> memory offlining means no one can allocate and use the offlined memory, but
> Linux still sees it; memory hotunplug means Linux no longer sees it (no related
> memmap and other metadata). Am I getting it right?

The doc has this "Phases of Memory Hotplug" description, where it is 
roughly divided into that, yes.

> 
>>
>> Some PageOffline pages can be migrated using the non-folio migration: this is done for memory ballooning (memory compaction). As they get migrated, they are freed back to the buddy, PageOffline() is cleared -- they become PageBuddy() -- and the above applies.
> 
> After a PageOffline page is migrated, the destination page becomes PageOffline, right?
> OK, I see it in balloon_page_insert().

Yes.

> 
>>
>> Other PageOffline pages can be skipped during memory offlining (virtio-mem use case, what we are doing here). We don't want them to ever go through the buddy, especially because if memory offlining fails they must definitely not be treated like free pages that can be allocated immediately.
> 
> What do you mean by "skipped during memory offlining"? Are you implying that when
> virtio-mem is offlining some pages by marking them PageOffline and PG_offline_skippable,
> someone else can do memory offlining in parallel?

It could happen (e.g., manually offline a Linux memory block using 
sysfs), but that is not the primary use case.

virtio-mem unplugs memory in the following sequence:

1) alloc_contig_range() small blocks (e.g., 2 MiB)

2) Report the blocks to the hypervisor

3) Mark them fake-offline: PageOffline (+ PageOfflineSkippable now)

Once all small blocks that comprise a Linux memory block (e.g., 128 MiB) 
are fake-offline, offline the memory block and remove the memory using 
offline_and_remove_memory().

In that operation -- offline_and_remove_memory() -- memory offlining 
code must be able to skip these PageOffline pages, otherwise 
offline_and_remove_memory() will just fail, saying that there are 
unmovable pages in there.
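
As a rough illustration of why the skippable marking matters, here is a userspace toy model (not kernel code). The toy_* names, the state enum, and the -1 error return are assumptions for this sketch; the 64 sub-blocks correspond to the example sizes above (128 MiB / 2 MiB).

```c
#include <stdbool.h>

/* Toy model only: every name here is hypothetical. */
#define TOY_SUBBLOCKS_PER_BLOCK 64	/* e.g., 128 MiB block / 2 MiB sub-blocks */

enum toy_state {
	TOY_FREE,		/* in the buddy, can be isolated */
	TOY_IN_USE_MOVABLE,	/* allocated, but can be migrated away */
	TOY_OFFLINE,		/* PageOffline only: unmovable for offlining */
	TOY_OFFLINE_SKIPPABLE,	/* PageOffline + PageOfflineSkippable */
};

struct toy_block {
	enum toy_state sb[TOY_SUBBLOCKS_PER_BLOCK];
};

/*
 * Steps 1-3 for one sub-block. In the real sequence, step 1 is
 * alloc_contig_range() and step 2 reports the range to the hypervisor;
 * this model only records the resulting page state of step 3.
 */
static void toy_fake_offline(struct toy_block *b, int i, bool skippable)
{
	b->sb[i] = skippable ? TOY_OFFLINE_SKIPPABLE : TOY_OFFLINE;
}

/*
 * Models the scan done while offlining the whole memory block:
 * returns 0 on success, -1 if an unmovable sub-block is found.
 */
static int toy_offline_block(const struct toy_block *b)
{
	for (int i = 0; i < TOY_SUBBLOCKS_PER_BLOCK; i++) {
		switch (b->sb[i]) {
		case TOY_FREE:			/* simply isolated */
		case TOY_IN_USE_MOVABLE:	/* migrated away first */
		case TOY_OFFLINE_SKIPPABLE:	/* skipped, no migration */
			continue;
		case TOY_OFFLINE:		/* unmovable: offlining fails */
			return -1;
		}
	}
	return 0;
}
```

With every sub-block fake-offlined but not skippable, toy_offline_block() fails; with the skippable marking set, the same block offlines without any page going through the buddy.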

> 
>>
>> Next, the page is removed from its memory
>>> block. When will PG_offline_skippable be used? The second phase when
>>> the page is being removed from its memory block?
>>
>> PG_offline_skippable is used during memory offlining, while we look for any pages that are not PageBuddy (... or hwpoisoned ...), to migrate them off the memory so they get converted to PageBuddy.
>>
>> PageOffline + PageOfflineSkippable are checked in that phase, such that they don't require any migration.
> 
> Hmm, if you just do not want PageOffline pages to get migrated, not setting
> __PageMovable on them would work, right? PageOffline + __PageMovable is used by
> ballooning, as these inflated pages can be migrated. PageOffline without
> __PageMovable should be virtio-mem. Am I missing any other user?

Sure. Just imagine !CONFIG_BALLOON_COMPACTION.

In summary, we have

1) Migratable PageOffline pages (balloon compaction)

2) Unmigratable PageOffline pages (e.g., XEN balloon, hyper-v balloon,
    memtrace, in the future likely some memory holes, ... )

3) Skippable PageOffline pages (virtio-mem)
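
The three cases can be written down as a small decision function. This is a userspace sketch, not kernel code: the struct, the enum, and toy_offlining_action() are hypothetical names, and the three booleans merely stand in for PageOffline, "movable" (balloon compaction), and PageOfflineSkippable.

```c
#include <stdbool.h>

/* Toy model only: names and layout are hypothetical. */
struct toy_page_flags {
	bool offline;	/* stands in for PageOffline */
	bool movable;	/* stands in for balloon-compaction movability */
	bool skippable;	/* stands in for PageOfflineSkippable */
};

enum toy_action {
	TOY_NOT_OFFLINE,	/* not a PageOffline page; handled elsewhere */
	TOY_MIGRATE,		/* case 1: migrate, source page becomes free */
	TOY_FAIL_UNMOVABLE,	/* case 2: memory offlining fails */
	TOY_SKIP,		/* case 3: skipped, never goes through the buddy */
};

/* What memory offlining does with a page, per the summary above. */
static enum toy_action toy_offlining_action(struct toy_page_flags f)
{
	if (!f.offline)
		return TOY_NOT_OFFLINE;
	if (f.skippable)
		return TOY_SKIP;
	if (f.movable)
		return TOY_MIGRATE;
	return TOY_FAIL_UNMOVABLE;
}
```

Case 2 is why "PageOffline without __PageMovable" cannot be taken to mean virtio-mem: with !CONFIG_BALLOON_COMPACTION, balloon pages land there too, and those must keep failing memory offlining.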

-- 
Cheers,

David / dhildenb

