Message-ID: <1454705231.3819.151.camel@gmail.com>
Date: Fri, 05 Feb 2016 21:47:11 +0100
From: Mike Galbraith <umgwanakikbuti@...il.com>
To: Tejun Heo <tj@...nel.org>
Cc: Michal Hocko <mhocko@...nel.org>, Jiri Slaby <jslaby@...e.cz>,
Thomas Gleixner <tglx@...utronix.de>,
Petr Mladek <pmladek@...e.com>, Jan Kara <jack@...e.cz>,
Ben Hutchings <ben@...adent.org.uk>,
Sasha Levin <sasha.levin@...cle.com>, Shaohua Li <shli@...com>,
LKML <linux-kernel@...r.kernel.org>, stable@...r.kernel.org,
Daniel Bilik <daniel.bilik@...system.cz>
Subject: Re: Crashes with 874bbfe600a6 in 3.18.25
On Fri, 2016-02-05 at 11:49 -0500, Tejun Heo wrote:
> Hello, Mike.
>
> On Thu, Feb 04, 2016 at 03:00:17AM +0100, Mike Galbraith wrote:
> > Isn't it the case that, currently at least, each and every spot that
> > requires execution on a specific CPU yet does not take active measures
> > to deal with hotplug events is in fact buggy? The timer code clearly
> > states that the user is responsible, and so do both workqueue.[ch].
>
> Yeah, the usages which require affinity for correctness must flush the
> work items from a cpu down callback.
Good, we agree. Now bear with me a moment..
That very point is what makes it wrong for the workqueue code to ever
target a work item. The instant it does target selection, correctness
may be at stake; it doesn't know, thus it must assume the full onus,
which it has neither the knowledge nor the time to do. That's how we
exploded on node = -1, trying to help out the user by doing his job,
but then not doing the whole job. IMHO, a better plan is to let the
user screw it up all by himself.
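
To make that division of labor concrete, here is a toy userspace sketch.
It is plain C, not kernel code, and every name in it (toy_queue_work_on,
toy_cpu_down, user_cpu_down_prepare, ...) is made up for illustration.
The queue only ever uses the CPU the caller named and refuses anything
else; draining pending work before the CPU goes away is left to the
caller's own cpu down callback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS     4
#define MAX_PENDING 8

typedef void (*work_fn)(int ran_on_cpu);

/* Toy model of one CPU and the work pinned to it. */
struct toy_cpu {
    bool    online;
    work_fn pending[MAX_PENDING];
    size_t  nr_pending;
};

static struct toy_cpu cpus[NR_CPUS];
static int last_ran_on = -1;

/* Sample work item: remembers which CPU it ran on. */
static void record_cpu(int cpu)
{
    last_ran_on = cpu;
}

/* Bring every toy CPU online with an empty queue. */
static void toy_init(void)
{
    for (int i = 0; i < NR_CPUS; i++) {
        cpus[i].online = true;
        cpus[i].nr_pending = 0;
    }
    last_ran_on = -1;
}

/* Queue fn on exactly the CPU the caller asked for. The queue never picks
 * a target itself: a bad or offline CPU is refused, not silently replaced. */
static bool toy_queue_work_on(int cpu, work_fn fn)
{
    if (cpu < 0 || cpu >= NR_CPUS || !cpus[cpu].online)
        return false;
    if (cpus[cpu].nr_pending == MAX_PENDING)
        return false;
    cpus[cpu].pending[cpus[cpu].nr_pending++] = fn;
    return true;
}

/* Run everything pending on cpu, in queueing order. */
static void toy_flush_cpu(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    for (size_t i = 0; i < cpus[cpu].nr_pending; i++)
        cpus[cpu].pending[i](cpu);
    cpus[cpu].nr_pending = 0;
}

/* The user's cpu down callback: the user needs affinity, so the user flushes. */
static void user_cpu_down_prepare(int cpu)
{
    toy_flush_cpu(cpu);
}

/* Take cpu offline. Returns how many work items were left stranded on it.
 * What a real kernel does with leftovers is out of scope; the toy just
 * counts and discards them. */
static size_t toy_cpu_down(int cpu, void (*down_cb)(int))
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    if (down_cb)
        down_cb(cpu);
    size_t stranded = cpus[cpu].nr_pending;
    cpus[cpu].nr_pending = 0;
    cpus[cpu].online = false;
    return stranded;
}
```

Passing user_cpu_down_prepare to toy_cpu_down leaves nothing stranded and
the item runs on the CPU it was queued on. Passing NULL strands the item,
and that failure sits squarely with the caller, not with the queue.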
-Mike