Message-ID: <20180614143037.GH12032@localhost.localdomain>
Date: Thu, 14 Jun 2018 16:30:37 +0200
From: Juri Lelli <juri.lelli@...hat.com>
To: Quentin Perret <quentin.perret@....com>
Cc: Steven Rostedt <rostedt@...dmis.org>, peterz@...radead.org,
mingo@...hat.com, linux-kernel@...r.kernel.org,
luca.abeni@...tannapisa.it, claudio@...dence.eu.com,
tommaso.cucinotta@...tannapisa.it, bristot@...hat.com,
mathieu.poirier@...aro.org, lizefan@...wei.com,
cgroups@...r.kernel.org
Subject: Re: [PATCH v4 1/5] sched/topology: Add check to backup comment about
hotplug lock
On 14/06/18 15:18, Quentin Perret wrote:
> On Thursday 14 Jun 2018 at 16:11:18 (+0200), Juri Lelli wrote:
> > On 14/06/18 14:58, Quentin Perret wrote:
> >
> > [...]
> >
> > > Hmm not sure if this can help but I think that rebuild_sched_domains()
> > > does _not_ take the hotplug lock before calling partition_sched_domains()
> > > when CONFIG_CPUSETS=n. But it does take it for CONFIG_CPUSETS=y.
> >
> > Did you mean cpuset_mutex?
>
> Nope, I really meant the cpu_hotplug_lock!
>
> With CONFIG_CPUSETS=n, rebuild_sched_domains() calls
> partition_sched_domains() directly:
>
> https://elixir.bootlin.com/linux/latest/source/include/linux/cpuset.h#L255
>
> But with CONFIG_CPUSETS=y, rebuild_sched_domains() calls
> rebuild_sched_domains_locked(), which calls get_online_cpus(), which
> calls cpus_read_lock(), which does percpu_down_read(&cpu_hotplug_lock).
> And all that happens before calling partition_sched_domains().

Ah, right!

> So yeah, the point I was trying to make is that there is an inconsistency
> here, maybe for a good reason? Maybe related to the issue you're seeing?

The config that came with the 0day splat was indeed CONFIG_CPUSETS=n.

So, in this case IIUC we hit the !doms_new branch of
partition_sched_domains(), which uses cpu_active_mask (and
cpu_possible_mask indirectly).

Should this still be protected by the hotplug lock then?
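
To make the inconsistency concrete, here is a rough userspace model of the
two call paths described above. It is only a sketch and not the kernel code:
the lock is reduced to a reader counter, the function bodies and return
types are simplified, and the _cpusets_y/_cpusets_n suffixes are
hypothetical names used to keep both variants in one file.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for cpu_hotplug_lock: number of readers holding it. */
static int hotplug_readers;

static void cpus_read_lock(void)
{
	hotplug_readers++;
}

static void cpus_read_unlock(void)
{
	assert(hotplug_readers > 0);
	hotplug_readers--;
}

/* In the thread, get_online_cpus() ends up in cpus_read_lock(). */
static void get_online_cpus(void)
{
	cpus_read_lock();
}

static void put_online_cpus(void)
{
	cpus_read_unlock();
}

/*
 * Model only: instead of rebuilding domains, report whether the
 * (toy) hotplug lock is held at the time of the call.
 */
static bool partition_sched_domains(void)
{
	return hotplug_readers > 0;
}

/* CONFIG_CPUSETS=y path: the lock is taken before partitioning. */
static bool rebuild_sched_domains_locked(void)
{
	bool held;

	get_online_cpus();
	held = partition_sched_domains();
	put_online_cpus();

	return held;
}

/* Hypothetical name for the CONFIG_CPUSETS=y variant. */
static bool rebuild_sched_domains_cpusets_y(void)
{
	return rebuild_sched_domains_locked();
}

/* Hypothetical name for the CONFIG_CPUSETS=n variant: direct call. */
static bool rebuild_sched_domains_cpusets_n(void)
{
	return partition_sched_domains();
}
```

In this model rebuild_sched_domains_cpusets_y() reports the lock as held
inside partition_sched_domains(), while rebuild_sched_domains_cpusets_n()
reports it as not held, which matches the CONFIG_CPUSETS=n splat.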