lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87im7yc2bu.fsf@oracle.com>
Date:   Thu, 14 Jan 2021 22:21:25 -0500
From:   Daniel Jordan <daniel.m.jordan@...cle.com>
To:     Peter Zijlstra <peterz@...radead.org>,
        Alexey Klimov <aklimov@...hat.com>
Cc:     linux-kernel@...r.kernel.org, cgroups@...r.kernel.org,
        yury.norov@...il.com, tglx@...utronix.de, jobaker@...hat.com,
        audralmitchel@...il.com, arnd@...db.de, gregkh@...uxfoundation.org,
        rafael@...nel.org, tj@...nel.org, lizefan.x@...edance.com,
        qais.yousef@....com, hannes@...xchg.org, klimov.linux@...il.com
Subject: Re: [RFC][PATCH] cpu/hotplug: wait for cpuset_hotplug_work to
 finish on cpu online

Daniel Jordan <daniel.m.jordan@...cle.com> writes:
> Peter Zijlstra <peterz@...radead.org> writes:
>>> The nature of this bug is also described here (with different consequences):
>>> https://lore.kernel.org/lkml/20200211141554.24181-1-qais.yousef@arm.com/
>>
>> Yeah, pesky deadlocks.. someone was going to try again.
>
> I dug up the synchronous patch
>
>     https://lore.kernel.org/lkml/1579878449-10164-1-git-send-email-prsood@codeaurora.org/
>
> but surprisingly wasn't able to reproduce the lockdep splat from
>
>     https://lore.kernel.org/lkml/F0388D99-84D7-453B-9B6B-EEFF0E7BE4CC@lca.pw/
>
> even though I could hit it a few weeks ago.

oh okay, you need to mount a legacy cpuset hierarchy.

So as the above splat shows, making cpuset_hotplug_workfn() synchronous
means cpu_hotplug_lock (and "cpuhp_state-down") can be acquired before
cgroup_mutex.

But there are at least four cgroup paths that take the locks in the
opposite order.  They're all the same, they take cgroup_mutex and then
cpu_hotplug_lock later on to modify one or more static keys.

cpu_hotplug_lock should probably be ahead of cgroup_mutex because the
latter is taken in a hotplug callback, and we should keep the static
branches in cgroup, so the only way out I can think of is moving
cpu_hotplug_lock to just before cgroup_mutex is taken and switching to
_cpuslocked flavors of the static key calls.

lockdep quiets down with that change everywhere, but it puts another big
lock around a lot of cgroup paths.  Seems less heavyhanded to go with
this RFC.  What do you all think?

Absent further discussion, Alexey, do you plan to post another version?

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ