Message-Id: <20250922160308.524be6ba4d418886095ab223@linux-foundation.org>
Date: Mon, 22 Sep 2025 16:03:08 -0700
From: Andrew Morton <akpm@...ux-foundation.org>
To: Shakeel Butt <shakeel.butt@...ux.dev>
Cc: Tejun Heo <tj@...nel.org>, Johannes Weiner <hannes@...xchg.org>, Michal
Hocko <mhocko@...nel.org>, Roman Gushchin <roman.gushchin@...ux.dev>,
Muchun Song <muchun.song@...ux.dev>, Alexei Starovoitov <ast@...nel.org>,
Peilin Ye <yepeilin@...gle.com>, Kumar Kartikeya Dwivedi
<memxor@...il.com>, bpf@...r.kernel.org, linux-mm@...ck.org,
cgroups@...r.kernel.org, linux-kernel@...r.kernel.org, Meta kernel team
<kernel-team@...a.com>, Michal Hocko <mhocko@...e.com>
Subject: Re: [PATCH v2] memcg: skip cgroup_file_notify if spinning is not
allowed
On Mon, 22 Sep 2025 15:02:03 -0700 Shakeel Butt <shakeel.butt@...ux.dev> wrote:
> Generally memcg charging is allowed from all the contexts including NMI
> where even spinning on spinlock can cause locking issues. However one
> call chain was missed during the addition of memcg charging from any
> context support. That is try_charge_memcg() -> memcg_memory_event() ->
> cgroup_file_notify().
>
> The possible function call tree under cgroup_file_notify() can acquire
> many different spin locks in spinning mode. Some of them are
> cgroup_file_kn_lock, kernfs_notify_lock, pool_workqueue's lock. So, let's
> just skip cgroup_file_notify() from memcg charging if the context does
> not allow spinning.
>
> Alternative approach was also explored where instead of skipping
> cgroup_file_notify(), we defer the memcg event processing to irq_work
> [1]. However it adds complexity and it was decided to keep things simple
> until we need more memcg events with !allow_spinning requirement.
What are the downsides here? Inaccurate charging obviously, but how
might this affect users?
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -2307,12 +2307,13 @@ static int try_charge_memcg(struct mem_cgroup *memcg, gfp_t gfp_mask,
> bool drained = false;
> bool raised_max_event = false;
> unsigned long pflags;
> + bool allow_spinning = gfpflags_allow_spinning(gfp_mask);
>
Does this affect only the problematic call chain which you have
identified, or might other callers be undesirably affected?
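
For readers following the thread, the gating described in the quoted changelog can be modelled in user space roughly as below. This is only a sketch under stated assumptions, not the patch itself: every demo_* name and DEMO_MAY_SPIN are hypothetical stand-ins, and it assumes the event counter is still updated while only the notification is skipped, which the diff excerpt quoted above does not show.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a gfp-style bit: set when the caller may spin. */
#define DEMO_MAY_SPIN 0x1u

/* Hypothetical, simplified stand-in for per-cgroup event state. */
struct demo_memcg {
	unsigned long max_events;    /* event counter (assumed always updated) */
	unsigned long notifications; /* wakeups actually delivered to watchers */
};

/* Models deriving "may this context spin on a lock?" from the flags. */
static bool demo_allow_spinning(unsigned int flags)
{
	return (flags & DEMO_MAY_SPIN) != 0;
}

/* Models the notify path; in the kernel this is where spinlocks are taken. */
static void demo_file_notify(struct demo_memcg *memcg)
{
	memcg->notifications++;
}

/* Record the event, but only notify when spinning is allowed. */
static void demo_memory_event(struct demo_memcg *memcg, bool allow_spinning)
{
	assert(memcg != NULL);
	memcg->max_events++;
	if (allow_spinning)
		demo_file_notify(memcg);
}

/* Models a charge attempt that hits the limit and raises an event. */
static void demo_charge_over_limit(struct demo_memcg *memcg, unsigned int flags)
{
	demo_memory_event(memcg, demo_allow_spinning(flags));
}
```

In this model, a call with DEMO_MAY_SPIN set bumps both fields, while a call with flags of 0 bumps only max_events, so a watcher would learn of that event only when a later notification arrives.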