Message-ID: <1455037444.3604.3.camel@gmail.com>
Date: Tue, 09 Feb 2016 18:04:04 +0100
From: Mike Galbraith <umgwanakikbuti@...il.com>
To: Tejun Heo <tj@...nel.org>,
Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Michal Hocko <mhocko@...nel.org>, Jiri Slaby <jslaby@...e.cz>,
Thomas Gleixner <tglx@...utronix.de>,
Petr Mladek <pmladek@...e.com>, Jan Kara <jack@...e.cz>,
Ben Hutchings <ben@...adent.org.uk>,
Sasha Levin <sasha.levin@...cle.com>, Shaohua Li <shli@...com>,
LKML <linux-kernel@...r.kernel.org>,
stable <stable@...r.kernel.org>,
Daniel Bilik <daniel.bilik@...system.cz>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>
Subject: Re: Crashes with 874bbfe600a6 in 3.18.25
On Tue, 2016-02-09 at 11:50 -0500, Tejun Heo wrote:
> Hello,
>
> On Tue, Feb 09, 2016 at 08:39:15AM -0800, Linus Torvalds wrote:
> > > A niggling question remaining is when is it gonna be killed?
> >
> > It probably should be killed sooner rather than later.
> >
> > Just document that if you need something to run on a _particular_ cpu,
> > you need to use "schedule_delayed_work_on()" and "add_timer_on()".
>
> I'll queue a patch to put unbound work items on foreign cpus (maybe
> every Nth to reduce perf impact). Wanted to align it to rc1 and then
> let it get tested during the devel cycle but missed this window. It's
> a bit late in devel cycle but we can still do it in this cycle.
Or do something like the below, and get guinea pigs for free.
workqueue: schedule WORK_CPU_UNBOUND work on wq_unbound_cpumask CPUs
WORK_CPU_UNBOUND work items queued to a bound workqueue always run
locally. This is a good thing normally, but not when the user has
asked us to keep unbound work away from certain CPUs. Round robin
these to wq_unbound_cpumask CPUs instead, as perturbation avoidance
trumps performance.
Signed-off-by: Mike Galbraith <umgwanakikbuti@...il.com>
---
kernel/workqueue.c | 27 ++++++++++++++++++++++++++-
1 file changed, 26 insertions(+), 1 deletion(-)
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -303,6 +303,9 @@ static bool workqueue_freezing; /* PL:
static cpumask_var_t wq_unbound_cpumask; /* PL: low level cpumask for all unbound wqs */
+/* CPU where WORK_CPU_UNBOUND work was last round robin scheduled from this CPU */
+static DEFINE_PER_CPU(unsigned int, wq_unbound_rr_cpu_last);
+
/* the per-cpu worker pools */
static DEFINE_PER_CPU_SHARED_ALIGNED(struct worker_pool [NR_STD_WORKER_POOLS],
cpu_worker_pools);
@@ -1298,6 +1301,28 @@ static bool is_chained_work(struct workq
return worker && worker->current_pwq->wq == wq;
}
+/*
+ * When queueing WORK_CPU_UNBOUND work to a !WQ_UNBOUND queue, round
+ * robin among wq_unbound_cpumask to avoid perturbing sensitive tasks.
+ */
+static unsigned int select_round_robin_cpu(unsigned int cpu)
+{
+ int new_cpu;
+
+ if (cpumask_test_cpu(cpu, wq_unbound_cpumask))
+ return cpu;
+ if (cpumask_empty(wq_unbound_cpumask))
+ return cpu;
+ new_cpu = __this_cpu_read(wq_unbound_rr_cpu_last);
+ new_cpu = cpumask_next_and(new_cpu, wq_unbound_cpumask, cpu_online_mask);
+ if (unlikely(new_cpu >= nr_cpu_ids))
+ new_cpu = cpumask_first_and(wq_unbound_cpumask, cpu_online_mask);
+ if (unlikely(WARN_ON_ONCE(new_cpu >= nr_cpu_ids)))
+ return cpu;
+ __this_cpu_write(wq_unbound_rr_cpu_last, new_cpu);
+ return new_cpu;
+}
+
static void __queue_work(int cpu, struct workqueue_struct *wq,
struct work_struct *work)
{
@@ -1323,7 +1348,7 @@ static void __queue_work(int cpu, struct
return;
retry:
if (req_cpu == WORK_CPU_UNBOUND)
- cpu = raw_smp_processor_id();
+ cpu = select_round_robin_cpu(raw_smp_processor_id());
/* pwq which will be used unless @work is executing elsewhere */
if (!(wq->flags & WQ_UNBOUND))
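
For anyone who wants to poke at the selection logic outside the kernel, here is a rough userspace sketch of what select_round_robin_cpu() above does. It is an illustration only, not kernel code: NR_CPUS, mask_next(), rr_last[] and select_rr_cpu() are hypothetical stand-ins, plain 32-bit masks replace cpumask_var_t, and an array indexed by the calling CPU replaces the per-CPU variable.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS 8u	/* hypothetical CPU count for this sketch */

/* Stand-in for the per-CPU wq_unbound_rr_cpu_last, indexed by calling CPU. */
static unsigned int rr_last[NR_CPUS];

/*
 * Return the first CPU set in @mask with an index strictly above @prev,
 * or NR_CPUS if there is none. Pass -1 to search from CPU 0.
 */
static unsigned int mask_next(int prev, uint32_t mask)
{
	unsigned int cpu;

	for (cpu = (unsigned int)(prev + 1); cpu < NR_CPUS; cpu++)
		if (mask & (UINT32_C(1) << cpu))
			return cpu;
	return NR_CPUS;
}

/*
 * Pick a CPU for WORK_CPU_UNBOUND work queued from @cpu: stay local if
 * @cpu is allowed or no mask is set, otherwise round robin among CPUs
 * that are both in @unbound_mask and in @online_mask.
 */
static unsigned int select_rr_cpu(unsigned int cpu, uint32_t unbound_mask,
				  uint32_t online_mask)
{
	uint32_t allowed = unbound_mask & online_mask;
	unsigned int new_cpu;

	assert(cpu < NR_CPUS);

	if (unbound_mask & (UINT32_C(1) << cpu))
		return cpu;
	if (unbound_mask == 0)
		return cpu;

	new_cpu = mask_next((int)rr_last[cpu], allowed);
	if (new_cpu >= NR_CPUS)		/* ran off the end, wrap around */
		new_cpu = mask_next(-1, allowed);
	if (new_cpu >= NR_CPUS)		/* no allowed CPU online, stay local */
		return cpu;

	rr_last[cpu] = new_cpu;
	return new_cpu;
}
```

With an unbound mask covering CPUs 0-3 and all eight CPUs online, repeated calls from CPU 6 walk through the allowed CPUs and wrap around, while a call from CPU 2 stays on CPU 2.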