Date:   Fri, 15 Jun 2018 09:39:51 +0100
From:   Quentin Perret <quentin.perret@....com>
To:     Juri Lelli <juri.lelli@...hat.com>
Cc:     Steven Rostedt <rostedt@...dmis.org>, peterz@...radead.org,
        mingo@...hat.com, linux-kernel@...r.kernel.org,
        luca.abeni@...tannapisa.it, claudio@...dence.eu.com,
        tommaso.cucinotta@...tannapisa.it, bristot@...hat.com,
        mathieu.poirier@...aro.org, lizefan@...wei.com,
        cgroups@...r.kernel.org
Subject: Re: [PATCH v4 1/5] sched/topology: Add check to backup comment about
 hotplug lock

On Thursday 14 Jun 2018 at 16:30:37 (+0200), Juri Lelli wrote:
> On 14/06/18 15:18, Quentin Perret wrote:
> > On Thursday 14 Jun 2018 at 16:11:18 (+0200), Juri Lelli wrote:
> > > On 14/06/18 14:58, Quentin Perret wrote:
> > > 
> > > [...]
> > > 
> > > > Hmm not sure if this can help but I think that rebuild_sched_domains()
> > > > does _not_ take the hotplug lock before calling partition_sched_domains()
> > > > when CONFIG_CPUSETS=n. But it does take it for CONFIG_CPUSETS=y.
> > > 
> > > Did you mean cpuset_mutex?
> > 
> > Nope, I really meant the cpu_hotplug_lock !
> > 
> > With CONFIG_CPUSETS=n, rebuild_sched_domains() calls
> > partition_sched_domains() directly:
> > 
> > https://elixir.bootlin.com/linux/latest/source/include/linux/cpuset.h#L255
> > 
> > But with CONFIG_CPUSETS=y, rebuild_sched_domains() calls
> > rebuild_sched_domains_locked(), which calls get_online_cpus() which
> > calls cpus_read_lock(), which does percpu_down_read(&cpu_hotplug_lock).
> > And all that happens before calling partition_sched_domains().
> 
> Ah, right!
>  
> > So yeah, the point I was trying to make is that there is an inconsistency
> > here, maybe for a good reason ? Maybe related to the issue you're seeing ?
> 
> The config that came with the 0day splat was indeed CONFIG_CPUSETS=n.
> 
> So, in this case IIUC we hit the !doms_new branch of
> partition_sched_domains(), which uses cpu_active_mask (and
> cpu_possible_mask indirectly). Should this still be protected by the
> hotplug lock then?

Hmm, I'm not sure... But looking at your call trace, it seems that the
issue happens when sched_cpu_deactivate() is called (not sure why this
is called during boot, BTW?), which calls cpuset_update_active_cpus().

And again, for CONFIG_CPUSETS=n, that defaults to a raw call to
partition_sched_domains(), with ndoms_new=1 and no lock taken.
I'm still not sure if this is done like that for a good reason, or if
this is actually an issue that this patch caught nicely...
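
FWIW, here is a minimal user-space sketch of the inconsistency I mean. It
is not kernel code: every model_* name is hypothetical, the counter only
stands in for cpu_hotplug_lock, and the two wrappers just mimic the shape
of the CONFIG_CPUSETS=n and CONFIG_CPUSETS=y call paths described above.

```c
#include <stdbool.h>

/* Hypothetical stand-in for cpu_hotplug_lock: a read-side depth counter. */
static int hotplug_read_depth;

/* Set by the callee: was the (modelled) hotplug lock held on entry? */
static bool last_call_held_lock;

static void model_cpus_read_lock(void)   { hotplug_read_depth++; }
static void model_cpus_read_unlock(void) { hotplug_read_depth--; }

/*
 * Stand-in for partition_sched_domains(). It only records whether the
 * caller held the lock, roughly what a lockdep-style check would test.
 */
static void model_partition_sched_domains(void)
{
	last_call_held_lock = hotplug_read_depth > 0;
}

/* Shape of the CONFIG_CPUSETS=n path: direct call, no lock taken. */
static bool model_rebuild_cpusets_off(void)
{
	model_partition_sched_domains();
	return last_call_held_lock;
}

/* Shape of the CONFIG_CPUSETS=y path: lock taken around the call. */
static bool model_rebuild_cpusets_on(void)
{
	model_cpus_read_lock();
	model_partition_sched_domains();
	model_cpus_read_unlock();
	return last_call_held_lock;
}
```

In this model, model_rebuild_cpusets_off() reports that the lock was not
held while model_rebuild_cpusets_on() reports that it was, which is the
asymmetry the new check would trip over.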

Quentin
