Message-ID: <jtjtb7sn7kxl7rw7tfdo2sn73rlre4w3iuvbk5hrolyimq7ixx@mo4k6r663tx2>
Date: Thu, 19 Jun 2025 14:10:34 +0200
From: Michal Koutný <mkoutny@...e.com>
To: Zhongkun He <hezhongkun.hzk@...edance.com>
Cc: Tejun Heo <tj@...nel.org>, Waiman Long <llong@...hat.com>,
cgroups@...r.kernel.org, linux-kernel@...r.kernel.org, muchun.song@...ux.dev
Subject: Re: [External] Re: [PATCH] cpuset: introduce non-blocking cpuset.mems setting option
On Thu, Jun 19, 2025 at 11:49:58AM +0800, Zhongkun He <hezhongkun.hzk@...edance.com> wrote:
> In our scenario, when we shrink the allowed cpuset.mems (for example,
> from nodes 1, 2, 3 to just nodes 2, 3), there may still be a large number of pages
> residing on node 1. Currently, modifying cpuset.mems triggers synchronous memory
> migration, which results in prolonged and unacceptable service downtime under
> cgroup v2. This behavior has become a major blocker for us in adopting
> cgroup v2.
>
> Tejun suggested adding an interface to control the migration rate, and
> I plan to try that later.
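
The downtime described above comes from the cpuset.mems write itself: under
cgroup v2, the write returns only after existing pages have been migrated off
the removed node. A minimal shell sketch of the scenario (assumes root, a
cgroup v2 mount at /sys/fs/cgroup, and a hypothetical "demo" cgroup; the node
numbers are the ones from the example above):

```
# Enable the cpuset controller for children and create a cgroup.
echo "+cpuset" > /sys/fs/cgroup/cgroup.subtree_control
mkdir -p /sys/fs/cgroup/demo
echo "1-3" > /sys/fs/cgroup/demo/cpuset.mems

# ... workload in "demo" allocates pages on node 1 ...

# Shrinking the allowed set: on cgroup v2 this write blocks until the
# pages residing on node 1 have been migrated to nodes 2-3.
echo "2-3" > /sys/fs/cgroup/demo/cpuset.mems
```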
A rate limit sounds unnecessarily non-work-conserving; in principle, adding
cond_resched()s (or eventually having a preemptible kernel) should
achieve the same. Or how would rate limiting project onto service metrics?
(But I'm not familiar with this migration path, thus I was asking about
the contention points.)
> However, we believe that the cpuset.migrate interface in cgroup v1 is
> also sufficient for our use case and is easier to work with. :)
Too easy, I think; it'd make cpuset.mems an "advisory" constraint only. (I
know that could be justified too, but perhaps not as a solution to costly
migrations.)
Michal