Message-ID: <382839fc-ea63-421a-8397-72cb35dd8052@redhat.com>
Date: Thu, 22 May 2025 19:21:38 +0200
From: David Hildenbrand <david@...hat.com>
To: Zi Yan <ziy@...dia.com>
Cc: Bharata B Rao <bharata@....com>, linux-kernel@...r.kernel.org,
 linux-mm@...ck.org, Jonathan.Cameron@...wei.com, dave.hansen@...el.com,
 gourry@...rry.net, hannes@...xchg.org, mgorman@...hsingularity.net,
 mingo@...hat.com, peterz@...radead.org, raghavendra.kt@....com,
 riel@...riel.com, rientjes@...gle.com, sj@...nel.org, weixugc@...gle.com,
 willy@...radead.org, ying.huang@...ux.alibaba.com, dave@...olabs.net,
 nifan.cxl@...il.com, joshua.hahnjy@...il.com, xuezhengchu@...wei.com,
 yiannis@...corp.com, akpm@...ux-foundation.org
Subject: Re: [RFC PATCH v0 2/2] mm: sched: Batch-migrate misplaced pages

On 22.05.25 18:38, Zi Yan wrote:
> On 22 May 2025, at 12:26, David Hildenbrand wrote:
> 
>> On 22.05.25 18:24, Zi Yan wrote:
>>> On 22 May 2025, at 12:11, David Hildenbrand wrote:
>>>
>>>> On 21.05.25 10:02, Bharata B Rao wrote:
>>>>> Currently the folios identified as misplaced by the NUMA
>>>>> balancing sub-system are migrated one by one from the NUMA
>>>>> hint fault handler as and when they are identified as
>>>>> misplaced.
>>>>>
>>>>> Instead of such single-folio migrations, batch them and
>>>>> migrate them at once.
>>>>>
>>>>> Identified misplaced folios are isolated and stored in
>>>>> a per-task list. A new task_work is queued from task tick
>>>>> handler to migrate them in batches. Migration is done
>>>>> periodically or if the number of pending isolated folios exceeds
>>>>> a threshold.
>>>>
>>>> That means that these pages are effectively unmovable for other purposes (CMA, compaction, long-term pinning, whatever) until that list is drained.
>>>>
>>>> Bad.
>>>
>>> Probably we can mark these pages and when others want to migrate the page,
>>> get_new_page() just looks at the page's target node and gets a new page from
>>> the target node.
>>
>> How do you envision that working when CMA needs to migrate this exact page to a different location?
>>
>> It cannot isolate it for migration because ... it's already isolated ... so it will give up.
>>
>> Marking might not be easy I assume ...
> 
> I guess you mean we do not have any extra bit to indicate this page is isolated,
> but it can be migrated. My point is that if this page is going to be migrated
> due to other reasons, like CMA, compaction, why not migrate it to the target
> node instead of moving it around within the same node.

I think we'd have to identify that

a) This page is isolated for migration (could be isolated for other
    reasons)

b) The one responsible for the isolation is NUMA code (could be someone
    else)

c) We're allowed to grab that page from that list (IOW sync against
    others, and especially also against), to essentially "steal" the
    isolated page.
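
To make the three conditions concrete, here is a rough userspace toy
model in C. It is only a sketch of the checks, not kernel code: the
"owner" field, the toy_* names, and the per-list spinlock are all
hypothetical, and nothing like an "isolated by NUMA" marker exists on
struct page/folio today, which is exactly the open problem.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical marker: who isolated the page. Not a real kernel field. */
enum toy_isolator { TOY_ISOLATOR_NONE, TOY_ISOLATOR_NUMA, TOY_ISOLATOR_OTHER };

struct toy_page {
	bool isolated;              /* (a) isolated for migration */
	enum toy_isolator owner;    /* (b) who is responsible for the isolation */
	int target_nid;             /* node NUMA balancing wanted to migrate to */
	struct toy_page *next;      /* link in the per-task pending list */
};

/* Models the per-task list of isolated, not yet migrated pages. */
struct toy_pending_list {
	atomic_flag lock;           /* (c) syncs the drainer against stealers */
	struct toy_page *head;
	size_t nr;
};

static void toy_lock(struct toy_pending_list *l)
{
	while (atomic_flag_test_and_set_explicit(&l->lock, memory_order_acquire))
		; /* spin */
}

static void toy_unlock(struct toy_pending_list *l)
{
	atomic_flag_clear_explicit(&l->lock, memory_order_release);
}

static void toy_list_init(struct toy_pending_list *l)
{
	atomic_flag_clear(&l->lock);
	l->head = NULL;
	l->nr = 0;
}

/* NUMA hint fault path: isolate the page and queue it for batch migration. */
static void toy_numa_isolate(struct toy_pending_list *l, struct toy_page *p,
			     int target_nid)
{
	toy_lock(l);
	p->isolated = true;
	p->owner = TOY_ISOLATOR_NUMA;
	p->target_nid = target_nid;
	p->next = l->head;
	l->head = p;
	l->nr++;
	toy_unlock(l);
}

/*
 * Another migration user (think CMA) tries to "steal" the page.
 * Succeeds only if (a) the page is isolated, (b) NUMA code isolated it,
 * and (c) we can unlink it from the list under the list lock.
 * On success the page stays isolated but now belongs to the caller.
 */
static bool toy_try_steal(struct toy_pending_list *l, struct toy_page *p)
{
	bool stolen = false;
	struct toy_page **pp;

	toy_lock(l);
	if (p->isolated && p->owner == TOY_ISOLATOR_NUMA) {
		for (pp = &l->head; *pp; pp = &(*pp)->next) {
			if (*pp != p)
				continue;
			*pp = p->next;
			p->next = NULL;
			p->owner = TOY_ISOLATOR_OTHER;
			assert(l->nr > 0);
			l->nr--;
			stolen = true;
			break;
		}
	}
	toy_unlock(l);
	return stolen;
}
```

In this model, a page queued via toy_numa_isolate() can be stolen
exactly once, while a page isolated by anyone else is refused. The
stealer still sees target_nid, so it could pick the NUMA target node as
the migration destination. The linear list walk and the single lock are
simplifications; how the real per-task list would be synchronized is
not something this sketch settles.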

-- 
Cheers,

David / dhildenb

