Message-ID: <1e4fe5c8-4bac-087b-ae32-6fa2fa5b44f6@linaro.org>
Date: Thu, 2 Jun 2022 07:28:11 -0700
From: Tadeusz Struk <tadeusz.struk@...aro.org>
To: Michal Koutný <mkoutny@...e.com>
Cc: Tejun Heo <tj@...nel.org>, cgroups@...r.kernel.org,
linux-kernel@...r.kernel.org, Zefan Li <lizefan.x@...edance.com>,
Johannes Weiner <hannes@...xchg.org>,
Bui Quang Minh <minhquangbui99@...il.com>
Subject: Re: [PATCH 2/2] cgroup: Use separate work structs on css release path
On 6/2/22 04:47, Michal Koutný wrote:
> On Wed, Jun 01, 2022 at 05:40:51PM -0700, Tadeusz Struk <tadeusz.struk@...aro.org> wrote:
>> css_killed_ref_fn() will be called regardless of the value of refcnt (via percpu_ref_kill_and_confirm())
>> and it will only enqueue css_killed_work_fn() to be called later.
>> Then css_put()->css_release() will be called before css_killed_work_fn() even
>> gets a chance to run, and it will also *only* enqueue css_release_work_fn() to be called later.
>> The problem happens on the second enqueue. So there needs to be something in place that
>> makes sure that css_killed_work_fn() is done before css_release() can enqueue
>> the second job.
> IIUC, here you describe the same scenario I broke down at [1].
Right, except for the last css_put(), which I think is called from cgroup_kn_unlock().
See below.
>> Does it sound right?
> I added a parameter A there (that is the sum of the base and percpu references
> before kill_css()).
> I thought it fails because A == 1 (i.e. killing the base reference),
> however, that seems an unlikely situation (because cgroup code uses a
> "fuse" reference to pin css for offline_css()).
>
> So the remaining option (at least I find it more likely now) is that
> A == 0 (A < 0 would trigger the warning in
> percpu_ref_switch_to_atomic_rcu()), aka the ref imbalance. I hope we can
> get to the bottom of this with detailed enough tracing of gets/puts.
>
> Splitting the work struct is contradictory to the existing approach with
> the "fuse" reference.
>
> (BTW you also wrote On Wed, Jun 01, 2022 at 05:00:44PM -0700, Tadeusz Struk <tadeusz.struk@...aro.org> wrote:
>> The fact that css_release() is called (via cgroup_kn_unlock()) just after
>> kill_css() causes css->destroy_work to be enqueued twice on the same WQ
>> (cgroup_destroy_wq), just with a different function. This results in the
>> BUG: corrupted list in insert_work issue.
> Where do you see a critical css_release called from cgroup_kn_unlock()?
> I always observed the css_release() being called via
> percpu_ref_call_confirm_rcu() (in the original and subsequent syzbot
It goes like this:
cgroup_kn_unlock(kn)->cgroup_put(cgrp)->css_put(&cgrp->self), which
brings the refcnt to zero and triggers css_release().
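To make the suspected sequence concrete, here is a minimal userspace sketch. It is not kernel code: the model_* names, struct work, and the plain int refcnt are simplified stand-ins invented for illustration, and the list insert is unchecked. It assumes the last put arrives after the kill path has queued its work but before that work has run, and that re-initializing the shared work item clears its pending state.

```c
#include <stdbool.h>
#include <stddef.h>

/* Minimal circular doubly linked list, in the style of a kernel list head. */
struct list_node {
    struct list_node *prev, *next;
};

static void list_init(struct list_node *n)
{
    n->prev = n->next = n;
}

/* Unchecked tail insert (no debug validation of the neighbours). */
static void list_add_tail(struct list_node *n, struct list_node *head)
{
    struct list_node *last = head->prev;

    n->prev = last;
    n->next = head;
    last->next = n;
    head->prev = n;
}

/* True if every visited node's neighbours point back at it.
 * The walk is bounded so a corrupted list cannot loop forever. */
static bool list_is_consistent(const struct list_node *head)
{
    const struct list_node *n = head;
    int steps;

    for (steps = 0; steps < 16; steps++) {
        if (n->next->prev != n || n->prev->next != n)
            return false;
        n = n->next;
        if (n == head)
            return true;
    }
    return false;
}

/* Hypothetical stand-in for a work item: one list entry, one function. */
struct work {
    struct list_node entry;
    bool pending;
    void (*fn)(struct work *);
};

/* Re-initializing resets the entry and the pending flag,
 * even if the item is still linked into a queue. */
static void model_init_work(struct work *w, void (*fn)(struct work *))
{
    list_init(&w->entry);
    w->pending = false;
    w->fn = fn;
}

static bool model_queue_work(struct list_node *wq, struct work *w)
{
    if (w->pending)
        return false;
    w->pending = true;
    list_add_tail(&w->entry, wq);
    return true;
}

/* Hypothetical, heavily simplified css: one refcount, one shared work item. */
struct model_css {
    int refcnt;
    struct work destroy_work;
    struct list_node *wq;
};

static void killed_work_fn(struct work *w) { (void)w; }
static void release_work_fn(struct work *w) { (void)w; }

/* Kill path: only enqueues the "killed" work for later. */
static void model_kill_css(struct model_css *css)
{
    model_init_work(&css->destroy_work, killed_work_fn);
    model_queue_work(css->wq, &css->destroy_work);
}

/* Put path: the last put reuses the same work item for the release work. */
static void model_css_put(struct model_css *css)
{
    if (--css->refcnt == 0) {
        model_init_work(&css->destroy_work, release_work_fn);
        model_queue_work(css->wq, &css->destroy_work);
    }
}

/* Runs kill followed by the last put, with no work executed in between.
 * Returns true if the queue's list ends up inconsistent. */
bool double_enqueue_corrupts_list(void)
{
    struct list_node wq;
    struct model_css css = { .refcnt = 1 };

    list_init(&wq);
    css.wq = &wq;

    model_kill_css(&css);
    if (!list_is_consistent(&wq))
        return false;

    model_css_put(&css);
    return !list_is_consistent(&wq);
}
```

In this model the second model_init_work() rewrites the links of an entry that is still on the queue, so the following insert leaves the queue head pointing at a node that no longer points back. Calling double_enqueue_corrupts_list() reports that state; it only illustrates the hypothesis above and does not show which reference is actually dropped in the real trace.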
I think what's missing is something that will serialize the kill
and release paths. I will try to put something together today.
--
Thanks,
Tadeusz