lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 26 May 2022 08:15:18 -1000
From:   Tejun Heo <tj@...nel.org>
To:     Michal Koutný <mkoutny@...e.com>
Cc:     cgroups@...r.kernel.org, linux-kernel@...r.kernel.org,
        Zefan Li <lizefan.x@...edance.com>,
        Johannes Weiner <hannes@...xchg.org>,
        Bui Quang Minh <minhquangbui99@...il.com>,
        Tadeusz Struk <tadeusz.struk@...aro.org>
Subject: Re: [PATCH 2/2] cgroup: Use separate work structs on css release path

Hello, Michal.

On Thu, May 26, 2022 at 11:56:34AM +0200, Michal Koutný wrote:
> // ref=A: initial state
> kill_css()
>   css_get // ref+=F == A+F: fuse
>   percpu_ref_kill_and_confirm
>     __percpu_ref_switch_to_atomic
>       percpu_ref_get
>         // ref += 1 == A+F+1: atomic mode, self-protection
>     percpu_ref_put
>       // ref -= 1 == A+F: kill the base reference
>   [via rcu]
>   percpu_ref_switch_to_atomic_rcu
>     percpu_ref_call_confirm_rcu
>       css_killed_ref_fn == refcnt.confirm_switch
>         queue_work(css->destroy_work)        (1)
>                                                      [via css->destroy_work]
>                                                      css_killed_work_fn == wq.func
>                                                        offline_css() // needs fuse
>                                                        css_put // ref -= F == A: de-fuse
>       percpu_ref_put
>         // ref -= 1 == A-1: remove self-protection
>         css_release                                   // A <= 1 -> 2nd queue_work explodes!

I'm not sure I'm following it but it's perfectly fine to re-use the work
item at this point. The work item actually can be re-cycled from the very
beginning of the work function. The only thing we need to make sure is that
we don't css_put() prematurely to avoid it being freed while we're using it.

For the sharing to be a problem, we should be queueing the release work item
while the destroy instance is still pending, and if that is the case, it
doesn't really matter whether we use two separate work items or not. We're
already broken and would just be shifting the problem to explode elsewhere.

The only possibility that I can think of is that somehow we're ending up
with an extra css_put() somewhere thus triggering the release path
prematurely. If that's the case, we'll prolly need to trace get/puts to find
out who's causing the ref imbalance.

Thanks.

-- 
tejun

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ