Message-ID: <20191203174547.GG2889@paulmck-ThinkPad-P72>
Date:   Tue, 3 Dec 2019 09:45:47 -0800
From:   "Paul E. McKenney" <paulmck@...nel.org>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     Tejun Heo <tj@...nel.org>, jiangshanlai@...il.com,
        linux-kernel@...r.kernel.org, Ingo Molnar <mingo@...hat.com>,
        Thomas Gleixner <tglx@...utronix.de>
Subject: Re: Workqueues splat due to ending up on wrong CPU

On Tue, Dec 03, 2019 at 11:00:10AM +0100, Peter Zijlstra wrote:
> On Mon, Dec 02, 2019 at 03:39:44PM -0800, Paul E. McKenney wrote:
> 
> > I think that I do not understand the code, but I never let that stop
> > me from asking stupid questions!  ;-)
> > 
> > Suppose that a given worker is bound to a particular CPU, but has no
> > work pending, and is therefore sleeping in the schedule() call near the
> > end of worker_thread().  During this time, its CPU goes offline and then
> > comes back online.  Doesn't this break that task's affinity to that CPU?
> 
> No. The thing about sleeping tasks is that they're not in fact on any
> CPU at all. Only when a task wakes up do we concern ourselves with
> placing it. If at that time we find the CPU it was constrained to is no
> longer with us, then we go break affinity.
> 
> But if the CPU went away and came back while the task was asleep, it
> will not notice anything.

Good point, and yes, you have told me this before.
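
For readers following along, the rule Peter describes above can be
sketched as a toy model.  This is not kernel code: the Task class, the
select_cpu_on_wakeup() function, and the "lowest-numbered CPU" choice
are all hypothetical, made up purely to illustrate that affinity is only
examined at wakeup time, so an offline/online cycle that completes while
the task sleeps goes unnoticed.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Toy stand-in for a task; all names here are illustrative only."""
    name: str
    allowed: set                 # CPUs the task may run on (affinity mask)
    affinity_broken: bool = False


def select_cpu_on_wakeup(task, online):
    """Pick a CPU for a waking task.

    Affinity is checked only here, at wakeup.  A sleeping task is not
    on any CPU, so nothing happens to it while CPUs come and go.
    """
    usable = task.allowed & online
    if usable:
        # At least one allowed CPU is online: affinity is preserved.
        return min(usable)
    # No allowed CPU is online right now: break affinity and fall
    # back to any online CPU (lowest-numbered, as an arbitrary choice).
    task.allowed = set(online)
    task.affinity_broken = True
    return min(online)


# Case 1: CPU 1 goes offline and comes back while the worker sleeps.
worker = Task("kworker/1:0", allowed={1})
online = {0, 1}
online.discard(1)   # CPU 1 goes offline; the worker is asleep
online.add(1)       # CPU 1 comes back before the worker wakes
cpu_after_cycle = select_cpu_on_wakeup(worker, online)

# Case 2: CPU 1 is still offline when the worker wakes up.
late_worker = Task("kworker/1:1", allowed={1})
cpu_while_offline = select_cpu_on_wakeup(late_worker, {0})
```

In the first case the worker lands back on CPU 1 with its affinity
intact; in the second it is moved to CPU 0 and its affinity is broken.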

Furthermore, in all of these cases, the process was supposed to be
running on CPU 0, which cannot be taken offline on any of the systems
under test.  This leads me to wonder whether the workqueue CPU-online
notifier is sometimes moving more kthreads to the newly onlined CPU than
it is supposed to.  Tejun, could that be happening?

							Thanx, Paul
