lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:   Mon, 27 May 2019 12:53:39 +0200
From:   Peter Zijlstra <peterz@...radead.org>
To:     Rik van Riel <riel@...riel.com>
Cc:     Andrew Murray <andrew.murray@....com>,
        Thomas Gleixner <tglx@...utronix.de>,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH] smp,cpumask: Don't call functions on offline CPUs

On Wed, May 22, 2019 at 02:23:47PM -0400, Rik van Riel wrote:
> On Wed, 2019-05-22 at 16:49 +0200, Peter Zijlstra wrote:
> > On Wed, May 22, 2019 at 03:37:11PM +0100, Andrew Murray wrote:
> > > > Is perhaps the problem that on_each_cpu_cond() uses
> > > > cpu_onlne_mask
> > > > without protection?
> > > 
> > > Does this prevent racing with a CPU going offline? I guess this
> > > prevents
> > > the warning at the expense of a lock - but is only beneficial in
> > > the
> > > unlikely path. (In the likely path this prevents new CPUs going
> > > offline
> > > but we don't care because we don't WARN if they aren't they when we
> > > attempt to call functions).
> > > 
> > > At least this is my limited understanding.
> > 
> > Hmm.. I don't think it could matter, we only use the mask when
> > preempt_disable(), which would already block offline, due to it using
> > stop_machine().
> > 
> > So the patch is a no-op.
> > 
> > What's the WARN you see? TLB invalidation should pass mm_cpumask(),
> > which similarly should not contain offline CPUs I'm thinking.
> 
> Does the TLB invalidation code have anything in it
> to prevent from racing with the CPU offline code?
> 
> In other words, could we end up with the TLB
> invalidation code building its bitmask, getting
> interrupted (eg. hypervisor preemption, NMI),
> and not sending out the IPI to that bitmask of
> CPUs until after one of the CPUs in the bitmap
> has gotten offlined?

One possible thing would be if cpu-offline didn't remove the bit from
mm_cpumask() because it entered lazy state earlier or something.

Then, mm_cpumask() would contain an offline CPU and the WARN could
trigger, but I don't _think_ we do that, but I didn't check.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ