Message-ID: <20220602114705.GB21320@blackbody.suse.cz>
Date: Thu, 2 Jun 2022 13:47:05 +0200
From: Michal Koutný <mkoutny@...e.com>
To: Tadeusz Struk <tadeusz.struk@...aro.org>
Cc: Tejun Heo <tj@...nel.org>, cgroups@...r.kernel.org,
linux-kernel@...r.kernel.org, Zefan Li <lizefan.x@...edance.com>,
Johannes Weiner <hannes@...xchg.org>,
Bui Quang Minh <minhquangbui99@...il.com>
Subject: Re: [PATCH 2/2] cgroup: Use separate work structs on css release path
On Wed, Jun 01, 2022 at 05:40:51PM -0700, Tadeusz Struk <tadeusz.struk@...aro.org> wrote:
> css_killed_ref_fn() will be called regardless of the value of refcnt (via percpu_ref_kill_and_confirm())
> and it will only enqueue the css_killed_work_fn() to be called later.
> Then css_put()->css_release() will be called before the css_killed_work_fn() will even
> get a chance to run, and it will also *only* enqueue css_release_work_fn() to be called later.
> The problem happens on the second enqueue. So there need to be something in place that
> will make sure that css_killed_work_fn() is done before css_release() can enqueue
> the second job.
IIUC, here you describe the same scenario I broke down at [1].
> Does it sound right?
I added a parameter A there (the sum of the base and percpu references
before kill_css()).
I thought it fails because A == 1 (i.e. killing the base reference),
however, that seems an unlikely situation (because cgroup code uses a
"fuse" reference to pin css for offline_css()).
So the remaining option (at least I find it more likely now) is that
A == 0 (A < 0 would trigger the warning in
percpu_ref_switch_to_atomic_rcu()), aka the ref imbalance. I hope we can
get to the bottom of this with detailed enough tracing of gets/puts.
Splitting the work struct contradicts the existing approach with
the "fuse" reference.
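To make the accounting concrete, here is a minimal userspace sketch, not kernel code. It assumes a single counter standing in for the base plus percpu references, and it assumes the release path fires as soon as that counter hits zero. Every `model_*` name is hypothetical; only the commented kernel function names come from this thread.

```c
#include <stdbool.h>

/* Toy model only: one counter stands in for base + percpu references. */
struct model_css {
    int refs;             /* starts at A, the count before kill_css() */
    bool killed_pending;  /* css_killed_work_fn() queued on destroy_work */
    bool double_enqueue;  /* release reused destroy_work while it was pending */
};

/* Models css_put(): hitting zero stands for css_release() queueing work. */
static void model_put(struct model_css *css)
{
    if (--css->refs == 0 && css->killed_pending)
        css->double_enqueue = true;
}

/*
 * Models kill_css() for a css that holds `a` references beforehand.
 * Returns true if the release would be queued while the killed work
 * is still pending, i.e. the same work struct is enqueued twice.
 */
static bool model_kill_css(int a, bool use_fuse)
{
    struct model_css css = { .refs = a };
    int others = a > 0 ? a - 1 : 0;  /* holders besides the base ref */

    if (use_fuse)
        css.refs++;                  /* "fuse" ref pins css for offline_css() */
    css.killed_pending = true;       /* css_killed_ref_fn() queues the work */
    model_put(&css);                 /* the kill drops the base reference */
    while (others--)
        model_put(&css);             /* remaining holders drop their refs */

    css.killed_pending = false;      /* css_killed_work_fn() has run */
    if (use_fuse)
        model_put(&css);             /* fuse ref dropped after offlining */
    return css.double_enqueue;
}
```

In this model, `model_kill_css(1, true)` reports no double enqueue, because the fuse reference keeps the count above zero until the killed work has run. `model_kill_css(0, true)` does report one, which is the ref-imbalance case, and `model_kill_css(1, false)` shows what the fuse reference protects against.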
(BTW you also wrote On Wed, Jun 01, 2022 at 05:00:44PM -0700, Tadeusz Struk <tadeusz.struk@...aro.org> wrote:
> The fact the css_release() is called (via cgroup_kn_unlock()) just after
> kill_css() causes the css->destroy_work to be enqueued twice on the same WQ
> (cgroup_destroy_wq), just with different function. This results in the
> BUG: corrupted list in insert_work issue.
Where do you see a critical css_release() call coming from
cgroup_kn_unlock()? I have always observed css_release() being called via
percpu_ref_call_confirm_rcu() (in the original and subsequent syzbot
logs).)
Thanks,
Michal
[1] https://lore.kernel.org/r/Yo7KfEOz92kS2z5Y@blackbook/