[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAFBcO+_PoXhbq+p-2z=acCpboJtOewXkp-9-3=csafoAYuNFQw@mail.gmail.com>
Date: Tue, 19 Jan 2021 16:52:55 +0000
From: Alexey Klimov <aklimov@...hat.com>
To: Daniel Jordan <daniel.m.jordan@...cle.com>
Cc: Peter Zijlstra <peterz@...radead.org>,
linux-kernel@...r.kernel.org, cgroups@...r.kernel.org,
yury.norov@...il.com, tglx@...utronix.de, jobaker@...hat.com,
audralmitchel@...il.com, arnd@...db.de, gregkh@...uxfoundation.org,
rafael@...nel.org, tj@...nel.org, lizefan.x@...edance.com,
qais.yousef@....com, hannes@...xchg.org,
Alexey Klimov <klimov.linux@...il.com>
Subject: Re: [RFC][PATCH] cpu/hotplug: wait for cpuset_hotplug_work to finish
on cpu online
On Fri, Jan 15, 2021 at 6:54 AM Daniel Jordan
<daniel.m.jordan@...cle.com> wrote:
>
> Daniel Jordan <daniel.m.jordan@...cle.com> writes:
> > Peter Zijlstra <peterz@...radead.org> writes:
> >>> The nature of this bug is also described here (with different consequences):
> >>> https://lore.kernel.org/lkml/20200211141554.24181-1-qais.yousef@arm.com/
> >>
> >> Yeah, pesky deadlocks.. someone was going to try again.
> >
> > I dug up the synchronous patch
> >
> > https://lore.kernel.org/lkml/1579878449-10164-1-git-send-email-prsood@codeaurora.org/
> >
> > but surprisingly wasn't able to reproduce the lockdep splat from
> >
> > https://lore.kernel.org/lkml/F0388D99-84D7-453B-9B6B-EEFF0E7BE4CC@lca.pw/
> >
> > even though I could hit it a few weeks ago.
>
> oh okay, you need to mount a legacy cpuset hierarchy.
>
> So as the above splat shows, making cpuset_hotplug_workfn() synchronous
> means cpu_hotplug_lock (and "cpuhp_state-down") can be acquired before
> cgroup_mutex.
>
> But there are at least four cgroup paths that take the locks in the
> opposite order. They're all the same, they take cgroup_mutex and then
> cpu_hotplug_lock later on to modify one or more static keys.
>
> cpu_hotplug_lock should probably be ahead of cgroup_mutex because the
> latter is taken in a hotplug callback, and we should keep the static
> branches in cgroup, so the only way out I can think of is moving
> cpu_hotplug_lock to just before cgroup_mutex is taken and switching to
> _cpuslocked flavors of the static key calls.
>
> lockdep quiets down with that change everywhere, but it puts another big
> lock around a lot of cgroup paths. Seems less heavyhanded to go with
> this RFC. What do you all think?
Daniel, thank you for taking a look. I don't mind reviewing+testing
another approach that you described.
> Absent further discussion, Alexey, do you plan to post another version?
I plan to update this patch and re-send in the next couple of days. It
looks like it might be a series of two patches. Sorry for delays.
Best regards,
Alexey
Powered by blists - more mailing lists