Message-ID: <CACSyD1MvwPT7i5_PnEp32seeb7X_svdCeFtN6neJ0=QPY1hDsw@mail.gmail.com>
Date: Thu, 22 May 2025 11:37:44 +0800
From: Zhongkun He <hezhongkun.hzk@...edance.com>
To: Tejun Heo <tj@...nel.org>
Cc: Waiman Long <llong@...hat.com>, hannes@...xchg.org, cgroups@...r.kernel.org, 
	linux-kernel@...r.kernel.org, muchun.song@...ux.dev
Subject: Re: [External] Re: [PATCH] cpuset: introduce non-blocking cpuset.mems
 setting option

On Thu, May 22, 2025 at 1:14 AM Tejun Heo <tj@...nel.org> wrote:
>
> On Wed, May 21, 2025 at 10:35:57AM +0800, Zhongkun He wrote:
> > On Tue, May 20, 2025 at 9:35 PM Waiman Long <llong@...hat.com> wrote:
> > >
> > > On 5/19/25 11:15 PM, Zhongkun He wrote:
> > > > Setting the cpuset.mems in cgroup v2 can trigger memory
> > > > migrate in cpuset. This behavior is fine for newly created
> > > > cgroups but it can cause issues for the existing cgroups.
> > > > In our scenario, modifying the cpuset.mems setting during
> > > > peak times frequently leads to noticeable service latency
> > > > or stuttering.
> > > >
> > > > It is important to have a consistent set of behavior for
> > > > both cpus and memory. But it does cause issues at times,
> > > > so we would hope to have a flexible option.
> > > >
> > > > This idea is from the non-blocking limit setting option in
> > > > memory control.
> > > >
> > > > https://lore.kernel.org/all/20250506232833.3109790-1-shakeel.butt@linux.dev/
> > > >
> > > > Signed-off-by: Zhongkun He <hezhongkun.hzk@...edance.com>
> > > > ---
> > > >   Documentation/admin-guide/cgroup-v2.rst |  7 +++++++
> > > >   kernel/cgroup/cpuset.c                  | 11 +++++++++++
> > > >   2 files changed, 18 insertions(+)
> > > >
> > > > diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
> > > > index 1a16ce68a4d7..d9e8e2a770af 100644
> > > > --- a/Documentation/admin-guide/cgroup-v2.rst
> > > > +++ b/Documentation/admin-guide/cgroup-v2.rst
> > > > @@ -2408,6 +2408,13 @@ Cpuset Interface Files
> > > >       a need to change "cpuset.mems" with active tasks, it shouldn't
> > > >       be done frequently.
> > > >
> > > > +     If cpuset.mems is opened with O_NONBLOCK then the migration is
> > > > +     bypassed. This is useful for admin processes that need to adjust
> > > > +     the cpuset.mems dynamically without blocking. However, there is
> > > > +     a risk that previously allocated pages are not within the new
> > > > +     cpuset.mems range, which may be altered by move_pages syscall or
> > > > +     numa_balance.
>
> I don't think this is a good idea. O_NONBLOCK means "don't wait", not "skip
> this".

Yes, I agree.  However, we have been experiencing this issue for a long time,
so we hope to have an option to disable memory migration in v2.
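
For reference, a minimal sketch of how an admin process might use the
O_NONBLOCK option from the quoted patch. The helper name set_mems_nonblock()
and the cgroup path in the usage note are hypothetical, and the skip-migration
behavior exists only with the proposed patch applied; on an unpatched kernel
the same write would presumably still trigger migration as usual.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: write a node list such as "0-1" to a cpuset.mems
 * file opened with O_NONBLOCK. With the proposed patch, the flag would ask
 * the kernel to skip page migration for this update (assumption taken from
 * the patch description, not from mainline behavior).
 *
 * Returns 0 on success, a negative errno value on failure.
 */
int set_mems_nonblock(const char *mems_path, const char *nodes)
{
	size_t len = strlen(nodes);
	ssize_t n;
	int err = 0;
	int fd;

	fd = open(mems_path, O_WRONLY | O_NONBLOCK);
	if (fd < 0)
		return -errno;

	n = write(fd, nodes, len);
	if (n < 0)
		err = errno;		/* the kernel rejected the value */
	else if ((size_t)n != len)
		err = EIO;		/* short write, treat as failure */

	close(fd);
	return -err;
}
```

A caller would pass the cgroup's own file, for example
set_mems_nonblock("/sys/fs/cgroup/<group>/cpuset.mems", "0-1"), where <group>
is a placeholder. Pages already allocated outside the new node set would stay
where they are unless moved later, e.g. by move_pages or NUMA balancing.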

Would it be possible to bring back the v1 cpuset.memory_migrate interface
in v2, with memory migration disabled by default?

Alternatively, could we introduce an option in cpuset.mems to explicitly
indicate that memory migration should not occur?

Please feel free to share any suggestions you might have.

>
> Thanks.

>
> --
> tejun
