Message-ID: <a84ede7d-60ad-3df9-77ae-cd0dbc545b85@gentwo.org>
Date: Wed, 24 Sep 2025 16:09:47 -0700 (PDT)
From: "Christoph Lameter (Ampere)" <cl@...two.org>
To: Joshua Hahn <joshua.hahnjy@...il.com>
cc: Andrew Morton <akpm@...ux-foundation.org>, 
    Johannes Weiner <hannes@...xchg.org>, Chris Mason <clm@...com>, 
    Kiryl Shutsemau <kirill@...temov.name>, 
    Brendan Jackman <jackmanb@...gle.com>, Michal Hocko <mhocko@...e.com>, 
    Suren Baghdasaryan <surenb@...gle.com>, Vlastimil Babka <vbabka@...e.cz>, 
    Zi Yan <ziy@...dia.com>, linux-kernel@...r.kernel.org, linux-mm@...ck.org, 
    kernel-team@...a.com
Subject: Re: [PATCH v2 2/4] mm/page_alloc: Perform appropriate batching in
 drain_pages_zone

On Wed, 24 Sep 2025, Joshua Hahn wrote:

> drain_pages_zone completely drains a zone of its pcp free pages by
> repeatedly calling free_pcppages_bulk until pcp->count reaches 0.
> In this loop, it already performs batched calls to ensure that
> free_pcppages_bulk isn't called to free too many pages at once, and
> relinquishes & reacquires the lock between each call to prevent
> lock starvation from other processes.


drain_pages_zone() operates on a lock in a percpu area. The lock is
specific to a cpu and should not be contended in normal operation unless
there is significant remote access to the per cpu queues.

The contention then seems to come from __drain_all_pages() running on
multiple cpus frequently. There is no point in concurrently draining the
per cpu pages of all processors from multiple remote processors, and we
have pcpu_drain_mutex to prevent that from happening.

So we need an explanation as to why there is such high contention on the
lock first before changing the logic here.

The current logic seems to be designed to prevent the lock contention you
are seeing.
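
For reference, the batching described in the quoted text (free at most one
batch per lock hold, drop the lock, retake it, repeat until the count is 0)
can be sketched in user space roughly as below. This is an analogy only and
not the kernel code: struct pcp_sketch, free_bulk_sketch() and
drain_sketch() are hypothetical names, a pthread mutex stands in for the
pcp spinlock, and the batch size is assumed to be greater than zero.

```c
/* User-space analogy only; this is not the kernel implementation. */
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for the per-cpu page list and its lock. */
struct pcp_sketch {
	pthread_mutex_t lock;
	int count;	/* pages currently queued */
	int batch;	/* upper bound on pages freed per lock hold, > 0 */
};

/* Hypothetical stand-in for free_pcppages_bulk(); caller holds the lock. */
static void free_bulk_sketch(struct pcp_sketch *pcp, int to_free)
{
	assert(to_free > 0 && to_free <= pcp->count);
	pcp->count -= to_free;
}

/*
 * Drain the whole list in bounded chunks, releasing the lock between
 * chunks so another locker can get in. Returns the number of batches
 * that were freed.
 */
static int drain_sketch(struct pcp_sketch *pcp)
{
	int batches = 0;
	int remaining;

	do {
		pthread_mutex_lock(&pcp->lock);
		remaining = pcp->count;
		if (remaining > 0) {
			int to_free = remaining < pcp->batch ?
				      remaining : pcp->batch;

			free_bulk_sketch(pcp, to_free);
			remaining -= to_free;
			batches++;
		}
		/* Window in which a waiting locker can take the lock. */
		pthread_mutex_unlock(&pcp->lock);
	} while (remaining > 0);

	return batches;
}
```

With count = 10 and batch = 4 the sketch takes the lock three times and
frees 4, 4 and 2 entries. It only bounds how long one drain holds the lock;
it says nothing about who else is competing for it, which is the open
question above.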

