Message-ID: <47a8b4d5-f4d2-4772-b1b2-ee96bc21e742@kernel.org>
Date: Wed, 3 Dec 2025 09:51:52 +0100
From: "David Hildenbrand (Red Hat)" <david@...nel.org>
To: Michal Hocko <mhocko@...e.com>, Gregory Price <gourry@...rry.net>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Aboorva Devarajan <aboorvad@...ux.ibm.com>, vbabka@...e.cz,
surenb@...gle.com, jackmanb@...gle.com, hannes@...xchg.org, ziy@...dia.com,
linux-mm@...ck.org, linux-kernel@...r.kernel.org,
Oscar Salvador <OSalvador@...e.com>
Subject: Re: [PATCH] mm/page_alloc: make percpu_pagelist_high_fraction reads lock-free
On 12/3/25 09:42, Michal Hocko wrote:
> On Wed 03-12-25 03:35:51, Gregory Price wrote:
>> On Wed, Dec 03, 2025 at 09:27:26AM +0100, Michal Hocko wrote:
>>> Let me add Oscar and David.
>>>
>>> On Mon 01-12-25 09:41:12, Andrew Morton wrote:
>>>> On Mon, 1 Dec 2025 11:30:09 +0530 Aboorva Devarajan <aboorvad@...ux.ibm.com> wrote:
>>>>
>>>>> When page isolation loops indefinitely during memory offline, reading
>>>>> /proc/sys/vm/percpu_pagelist_high_fraction blocks on pcp_batch_high_lock,
>>>>> causing hung task warnings.
>>>>
>>>> That's pretty bad behavior.
>>>>
>>>> I wonder if there are other problems which can be caused by this
>>>> lengthy hold time.
>>>
>>> pcp_batch_high_lock is not taken in any performance critical path. It is
>>> true that memory offlining can take long when memory is not free but I
>>> am not sure we can do much better. I guess we could check contention on
>>> the lock and drop it to make cpu hotplug events and
>>> sysctl_min_unmapped_ratio_sysctl_handler smoother. The question is
>>> whether this is a practical problem hit in real life.
>>>
>>
>> I just today hit a scenario where offlining was blocked on migration
>> failures that took an exceedingly long time to offline (many minutes)
>> even on a relatively small block (256MB).
>>
>> Now that I'm looking at the double-do-while loop in memory_hotplug.c
>>
>>     zone_pcp_disable(zone); /* (pcp_batch_high_lock) */
>>     ...
>>     do {
>>         do {
>>             ...
>>             cond_resched();
>>             ret = scan_movable_pages(pfn, end_pfn, &pfn);
>>             if (!ret) {
>>                 /*
>>                  * TODO: fatal migration failures should bail
>>                  * out
>>                  */
>>                 do_migrate_range(pfn, end_pfn);
>>             }
>>         } while (!ret);
>>     } while (ret);
>>     ...
>>     zone_pcp_enable(zone); /* (pcp_batch_high_lock) */
>>
>>
>> Maybe it's time to implement the bail out?
>
> That would be great but can we tell transient from permanent migration
> failures? Maybe long term pins could be treated as permanent failure.
Did we try to offline a ZONE_MOVABLE block or a ZONE_NORMAL block? In the
case of ZONE_MOVABLE, bailing out is not really the right thing to do.
--
Cheers
David