Message-Id: <20250922165509.3fe07892054bb9e149e7cc06@linux-foundation.org>
Date: Mon, 22 Sep 2025 16:55:09 -0700
From: Andrew Morton <akpm@...ux-foundation.org>
To: Shakeel Butt <shakeel.butt@...ux.dev>
Cc: Tejun Heo <tj@...nel.org>, Johannes Weiner <hannes@...xchg.org>, Michal
Hocko <mhocko@...nel.org>, Roman Gushchin <roman.gushchin@...ux.dev>,
Muchun Song <muchun.song@...ux.dev>, Alexei Starovoitov <ast@...nel.org>,
Peilin Ye <yepeilin@...gle.com>, Kumar Kartikeya Dwivedi
<memxor@...il.com>, bpf@...r.kernel.org, linux-mm@...ck.org,
cgroups@...r.kernel.org, linux-kernel@...r.kernel.org, Meta kernel team
<kernel-team@...a.com>, Michal Hocko <mhocko@...e.com>
Subject: Re: [PATCH v2] memcg: skip cgroup_file_notify if spinning is not allowed

On Mon, 22 Sep 2025 16:39:53 -0700 Shakeel Butt <shakeel.butt@...ux.dev> wrote:
> On Mon, Sep 22, 2025 at 04:04:43PM -0700, Andrew Morton wrote:
> > On Mon, 22 Sep 2025 15:02:03 -0700 Shakeel Butt <shakeel.butt@...ux.dev> wrote:
> >
> > > Generally memcg charging is allowed from all the contexts including NMI
> > > where even spinning on spinlock can cause locking issues. However one
> > > call chain was missed during the addition of memcg charging from any
> > > context support. That is try_charge_memcg() -> memcg_memory_event() ->
> > > cgroup_file_notify().
> > >
> > > The possible function call tree under cgroup_file_notify() can acquire
> > > many different spin locks in spinning mode. Some of them are
> > > cgroup_file_kn_lock, kernfs_notify_lock, pool_workqueue's lock. So, let's
> > > just skip cgroup_file_notify() from memcg charging if the context does
> > > not allow spinning.
> > >
> > > An alternative approach was also explored where instead of skipping
> > > cgroup_file_notify(), we defer the memcg event processing to irq_work
> > > [1]. However it adds complexity and it was decided to keep things simple
> > > until we need more memcg events with !allow_spinning requirement.
> > >
> > > Link: https://lore.kernel.org/all/5qi2llyzf7gklncflo6gxoozljbm4h3tpnuv4u4ej4ztysvi6f@x44v7nz2wdzd/ [1]
> > > Signed-off-by: Shakeel Butt <shakeel.butt@...ux.dev>
> > > Acked-by: Michal Hocko <mhocko@...e.com>
> >
> > Fixes a possible kernel deadlock, yes?
> >
> > Is a cc:stable appropriate and can we identify a Fixes: target?
> >
> > Thanks.
> >
> > (Did it ever generate lockdep warnings?)
>
> The report is here:
> https://lore.kernel.org/all/20250905061919.439648-1-yepeilin@google.com/
>
> I am not sure about the Fixes tag though, or rather which commit to put
> in it, as we only recently started supporting memcg charging for NMI
> context or allowing bpf programs to do memcg charged allocations in
> recursive context (see the above report for this recursive call chain).
> There is no single commit which can be blamed here.

I tend to view the Fixes: tag as our suggestion for which kernel versions
should be patched. I suspect that's 6.16+, so using the final relevant
patch in that release as a Fixes: target would work.
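
For illustration, the approach described in the changelog (skip the notification when the context does not allow spinning) can be sketched in user-space C roughly as follows. This is not the kernel code: `memcg_sketch`, `notify_sketch()` and `memory_event_sketch()` are hypothetical stand-ins for the memcg structure, cgroup_file_notify() and memcg_memory_event(), and the sketch assumes the event counter is still updated even when the notification is skipped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the memcg state; not the kernel's definition. */
struct memcg_sketch {
	unsigned long events;        /* event counter (assumed always updated) */
	unsigned long notifications; /* how many times the notify path ran */
};

/*
 * Stand-in for cgroup_file_notify(). In the kernel this path may take
 * several spinlocks, which is why it is unsafe when spinning is not allowed.
 */
static void notify_sketch(struct memcg_sketch *memcg)
{
	memcg->notifications++;
}

/*
 * Stand-in for memcg_memory_event() with an allow_spinning argument:
 * record the event, but only notify when the caller may spin on locks.
 */
static void memory_event_sketch(struct memcg_sketch *memcg, bool allow_spinning)
{
	assert(memcg != NULL);

	memcg->events++;

	if (!allow_spinning)
		return; /* e.g. NMI or recursive context: skip the notification */

	notify_sketch(memcg);
}
```

Calling memory_event_sketch() once with allow_spinning set and once without would record two events but run the notify path only once. The trade-off is that a listener on the events file may miss a wakeup for events raised from such contexts.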