Message-ID: <20180215161349.GA6956@lerouge>
Date: Thu, 15 Feb 2018 17:13:52 +0100
From: Frederic Weisbecker <frederic@...nel.org>
To: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Cc: LKML <linux-kernel@...r.kernel.org>,
Levin Alexander <alexander.levin@...izon.com>,
Peter Zijlstra <peterz@...radead.org>,
Mauro Carvalho Chehab <mchehab@...pensource.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Hannes Frederic Sowa <hannes@...essinduktion.org>,
"Paul E . McKenney" <paulmck@...ux.vnet.ibm.com>,
Wanpeng Li <wanpeng.li@...mail.com>,
Dmitry Safonov <dima@...sta.com>,
Thomas Gleixner <tglx@...utronix.de>,
Andrew Morton <akpm@...ux-foundation.org>,
Paolo Abeni <pabeni@...hat.com>,
Radu Rendec <rrendec@...sta.com>,
Ingo Molnar <mingo@...nel.org>,
Stanislaw Gruszka <sgruszka@...hat.com>,
Rik van Riel <riel@...hat.com>,
Eric Dumazet <edumazet@...gle.com>,
David Miller <davem@...emloft.net>
Subject: Re: [RFC PATCH 2/4] softirq: Per vector deferment to workqueue
On Thu, Feb 08, 2018 at 06:44:52PM +0100, Sebastian Andrzej Siewior wrote:
> On 2018-01-19 16:46:12 [+0100], Frederic Weisbecker wrote:
> > diff --git a/kernel/softirq.c b/kernel/softirq.c
> > index c8c6841..becb1d9 100644
> > --- a/kernel/softirq.c
> > +++ b/kernel/softirq.c
> > @@ -62,6 +62,19 @@ const char * const softirq_to_name[NR_SOFTIRQS] = {
> …
> > +static void vector_work_func(struct work_struct *work)
> > +{
> > + struct vector *vector = container_of(work, struct vector, work);
> > + struct softirq *softirq = this_cpu_ptr(&softirq_cpu);
> > + int vec_nr = vector->nr;
> > + int vec_bit = BIT(vec_nr);
> > + u32 pending;
> > +
> > + local_irq_disable();
> > + pending = local_softirq_pending();
> > + account_irq_enter_time(current);
> > + __local_bh_disable_ip(_RET_IP_, SOFTIRQ_OFFSET);
> > + lockdep_softirq_enter();
> > + set_softirq_pending(pending & ~vec_bit);
> > + local_irq_enable();
> > +
> > + if (pending & vec_bit) {
> > + struct softirq_action *sa = &softirq_vec[vec_nr];
> > +
> > + kstat_incr_softirqs_this_cpu(vec_nr);
> > + softirq->work_running = 1;
> > + trace_softirq_entry(vec_nr);
> > + sa->action(sa);
>
> You invoke the softirq handler while BH is disabled (not wrong, I just
> state the obvious). That means the scheduler can't preempt/interrupt
> the workqueue/BH-handler while it is invoked, so it has to wait until
> the handler completes.
> In do_softirq_workqueue() you schedule multiple workqueue items (one for
> each softirq vector), which is unnecessary because they can't preempt one
> another and should be invoked in the order they were enqueued. So it would
> be enough to enqueue one item because it is serialized after all. So one
> work_struct per CPU with a cond_resched_rcu_qs() while switching from one
> vector to another should accomplish what you have now here (not
> sure if that cond_resched after each vector is needed). But…
Makes sense.
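To make the single-work-item idea concrete, here is a rough userspace
sketch. It is not kernel code: struct cpu_softirq_sim, sim_raise() and
sim_single_work_func() are made-up names, NR_VECTORS only stands in for
NR_SOFTIRQS, and the resched_points counter merely marks where a
cond_resched_rcu_qs() would sit between vectors. It also walks the vectors
in bit order, which is an assumption rather than the enqueue order
mentioned above.

```c
#include <assert.h>
#include <stdint.h>

#define NR_VECTORS 10 /* assumption: stands in for NR_SOFTIRQS */

/* Hypothetical per-CPU state; the real patch uses struct softirq/vector. */
struct cpu_softirq_sim {
	uint32_t pending;               /* one bit per vector */
	unsigned int order[NR_VECTORS]; /* vectors in the order they were run */
	unsigned int nr_run;            /* number of valid entries in order[] */
	unsigned int resched_points;    /* stand-in for cond_resched_rcu_qs() */
};

/* Mark a vector as pending, like raising a softirq. */
static void sim_raise(struct cpu_softirq_sim *s, unsigned int nr)
{
	assert(nr < NR_VECTORS);
	s->pending |= UINT32_C(1) << nr;
}

/*
 * One work function per CPU: snapshot and clear the pending mask, then
 * run every pending vector in turn instead of queueing one item each.
 * Returns the number of vectors handled in this pass.
 */
static unsigned int sim_single_work_func(struct cpu_softirq_sim *s)
{
	uint32_t pending = s->pending;

	s->pending = 0;
	s->nr_run = 0;
	for (unsigned int nr = 0; nr < NR_VECTORS; nr++) {
		if (!(pending & (UINT32_C(1) << nr)))
			continue;
		s->order[s->nr_run++] = nr; /* a real handler would run here */
		s->resched_points++;        /* possible resched point per vector */
	}
	return s->nr_run;
}
```

Vectors raised while the loop runs would land in the freshly cleared mask
and be picked up by the next pass, which is where a requeue would go.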
>
> > + trace_softirq_exit(vec_nr);
> > + softirq->work_running = 0;
> > + }
> > +
> > + local_irq_disable();
> > +
> > + pending = local_softirq_pending();
> > + if (pending & vec_bit)
> > + schedule_work_on(smp_processor_id(), &vector->work);
>
> … on a system that is using system_wq a lot, it might introduce a certain
> latency until your softirq-worker gets its turn. The workqueue will
> spawn new workers if the current worker schedules out but until that
> happens you have to wait. I am not sure if this is intended or whether
> this might be a problem. I think you could argue either way depending on
> what you currently think is more important.
Indeed :)
> Further, schedule_work_on(x, ) does not guarantee that the work item is
> invoked on CPU x. It tries that, but if CPU x goes down due to
> CPU-hotplug then the work item will be moved to a random CPU. For that
> reason we have work_on_cpu_safe() but you don't want to use that / flush
> that workqueue while in here.
Yeah, someone also reported that hotplug issue to me. I didn't think workqueue
would break the affinity but here it does. So we would need a hotplug hook
indeed.
>
> May I instead suggest sticking to ksoftirqd? So you run in softirq
> context (after return from IRQ) and if it takes too long, you offload the
> vector to ksoftirqd instead. You may want to play with the metric on
> which you decide when you want to switch to ksoftirqd / account how long a
> vector runs.
Yeah, that makes sense. These workqueues are too much of a headache in the end.
I'm going to try that ksoftirqd thing.
Thanks.
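
As a rough illustration of the "account how long a vector runs" metric, a
userspace sketch could look like the following. Everything in it is an
assumption made for illustration: struct vector_acct, the helper names and
the VECTOR_BUDGET_NS value are made up, and a real version would hook into
the softirq path and wake ksoftirqd rather than just set a bit.

```c
#include <assert.h>
#include <stdint.h>

#define NR_VECTORS 10                 /* assumption: stands in for NR_SOFTIRQS */
#define VECTOR_BUDGET_NS 2000000ULL   /* hypothetical per-vector budget (2 ms) */

/* Hypothetical per-CPU accounting state. */
struct vector_acct {
	uint64_t runtime_ns[NR_VECTORS]; /* accumulated inline runtime */
	uint32_t deferred;               /* vectors handed over to the thread */
};

/* Charge delta_ns to a vector; defer it once it exceeds its budget. */
static void acct_vector_run(struct vector_acct *a, unsigned int nr,
			    uint64_t delta_ns)
{
	assert(nr < NR_VECTORS);
	a->runtime_ns[nr] += delta_ns;
	if (a->runtime_ns[nr] > VECTOR_BUDGET_NS)
		a->deferred |= UINT32_C(1) << nr;
}

/* Vectors that may still be served inline on IRQ exit. */
static uint32_t inline_mask(const struct vector_acct *a, uint32_t pending)
{
	return pending & ~a->deferred;
}

/* The thread has drained the vector: reset its budget, allow inline again. */
static void acct_thread_done(struct vector_acct *a, unsigned int nr)
{
	assert(nr < NR_VECTORS);
	a->runtime_ns[nr] = 0;
	a->deferred &= ~(UINT32_C(1) << nr);
}
```

Only the vector that overran its budget gets masked out of inline
processing, so the other vectors keep being served on IRQ exit.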