Message-ID: <1515681091.3039.21.camel@arista.com>
Date: Thu, 11 Jan 2018 14:31:31 +0000
From: Dmitry Safonov <dima@...sta.com>
To: Frederic Weisbecker <frederic@...nel.org>,
Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Eric Dumazet <edumazet@...gle.com>,
LKML <linux-kernel@...r.kernel.org>,
Dmitry Safonov <0x7f454c46@...il.com>,
Andrew Morton <akpm@...ux-foundation.org>,
David Miller <davem@...emloft.net>,
Frederic Weisbecker <fweisbec@...il.com>,
Hannes Frederic Sowa <hannes@...essinduktion.org>,
Ingo Molnar <mingo@...nel.org>,
"Levin, Alexander (Sasha Levin)" <alexander.levin@...izon.com>,
Paolo Abeni <pabeni@...hat.com>,
"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
Peter Zijlstra <peterz@...radead.org>,
Radu Rendec <rrendec@...sta.com>,
Rik van Riel <riel@...hat.com>,
Stanislaw Gruszka <sgruszka@...hat.com>,
Thomas Gleixner <tglx@...utronix.de>,
Wanpeng Li <wanpeng.li@...mail.com>
Subject: Re: [RFC 1/2] softirq: Defer net rx/tx processing to ksoftirqd context
On Thu, 2018-01-11 at 05:44 +0100, Frederic Weisbecker wrote:
> On Wed, Jan 10, 2018 at 08:19:49PM -0800, Linus Torvalds wrote:
> > On Wed, Jan 10, 2018 at 7:22 PM, Frederic Weisbecker
> > <frederic@...nel.org> wrote:
> > >
> > > Makes sense, but I think you need to keep the TASK_RUNNING check.
> >
> > Yes, good point.
> >
> > > So perhaps it should be:
> > >
> > > - return tsk && (tsk->state == TASK_RUNNING);
> > > + return (tsk == current) && (tsk->state == TASK_RUNNING);
> >
> > Looks good to me - definitely worth trying.
> >
> > Maybe that weakens the thing so much that it doesn't actually help
> > the
> > UDP packet storm case?
> >
> > And maybe it's not sufficient for the dvb issue.
> >
> > But I think it's worth at least testing. Maybe it makes neither
> > side
> > entirely happy, but maybe it might be a good halfway point?
>
> Yes I believe Dmitry is facing a different problem where he would
> rather
> see ksoftirqd scheduled more often to handle the queue as a deferred
> batch
> instead of having it served one by one on the tails of IRQ storms.
> (Dmitry correct me if I misunderstood).
Quite so. What I see is that ksoftirqd is rarely (close to never)
scheduled during a UDP packet storm. That's because the next irq
arrives too late for the pending check in __do_softirq().
So, there is no wakeup during a UDP storm here:
: 	pending = local_softirq_pending();
: 	if (pending & mask) {
: 		if (time_before(jiffies, end) && !need_resched() &&
: 		    --max_restart)
: 			goto restart;
:
: 		wakeup_softirqd();
: 	}
(as no softirq is pending yet at that point). The softirq is raised a
bit too late to schedule ksoftirqd, and as a result the next softirq is
again processed in the context of the interrupted task, not in
ksoftirqd.
That results in cpu-time starvation for the process during an irq storm.
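To make the timing argument concrete, here is a toy userspace model of
the check quoted above. It is only a sketch, not kernel code: the names
irq_timing and count_wakeups are made up for illustration, and it
assumes the restart budget is already exhausted on every round, so a
pending softirq at the check always leads to a wakeup.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical: when the next irq raises a softirq, relative to the check. */
enum irq_timing { IRQ_BEFORE_CHECK, IRQ_AFTER_CHECK };

/*
 * Count how often the tail of a __do_softirq()-like loop would call
 * wakeup_softirqd() over a number of rounds. Assumption: the restart
 * budget (time/max_restart) is exhausted, so "pending" means "wake up".
 */
static int count_wakeups(enum irq_timing when, int rounds)
{
	bool pending = true;	/* stands in for local_softirq_pending() */
	int wakeups = 0;

	assert(rounds >= 0);
	for (int i = 0; i < rounds; i++) {
		pending = false;		/* handlers ran, bits cleared */
		if (when == IRQ_BEFORE_CHECK)
			pending = true;		/* irq landed before the check */
		if (pending)
			wakeups++;		/* would be wakeup_softirqd() */
		if (when == IRQ_AFTER_CHECK)
			pending = true;		/* raised too late, served on next irq exit */
	}
	return wakeups;
}
```

With IRQ_AFTER_CHECK the model never wakes ksoftirqd, no matter how
many rounds run, which is the behaviour described above.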
While I saw that on an out-of-tree driver, I believe that at some
frequencies (lower than a storm) one can observe the same on mainstream
drivers. And I *think* that I've reproduced that on mainstream with the
virtio driver and a packet size of 1500 in VMs (though I don't quite like
perf testing in VMs).
So, IOW, maybe there is a better way to *detect* that the cpu time
spent on serving softirqs is close to a storm and that userspace starts
starving? (and, as a result, launch ksoftirqd or balance between
deferring and serving softirqs right there).
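One possible shape for such a detector, as a userspace sketch only:
account the time spent serving softirqs in a fixed window and defer
once it crosses a share of that window. Everything here is an
assumption for illustration: struct softirq_load,
softirq_should_defer(), the 10 ms window and the 50% threshold are
hypothetical and not taken from the kernel or from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define WINDOW_NS	10000000ULL	/* 10 ms accounting window (arbitrary) */
#define DEFER_PERCENT	50		/* defer above this softirq share (arbitrary) */

/* Hypothetical per-CPU accounting state. */
struct softirq_load {
	uint64_t window_start_ns;	/* start of the current window */
	uint64_t softirq_ns;		/* time spent in softirqs in this window */
};

/*
 * Account one softirq run [start_ns, end_ns) and report whether further
 * softirq work should be deferred (e.g. to ksoftirqd) instead of being
 * served right there.
 */
static bool softirq_should_defer(struct softirq_load *l,
				 uint64_t start_ns, uint64_t end_ns)
{
	assert(end_ns >= start_ns);
	assert(start_ns >= l->window_start_ns);

	/* Open a fresh window once the old one has expired. */
	if (start_ns - l->window_start_ns >= WINDOW_NS) {
		l->window_start_ns = start_ns;
		l->softirq_ns = 0;
	}
	l->softirq_ns += end_ns - start_ns;

	return l->softirq_ns * 100 >= WINDOW_NS * DEFER_PERCENT;
}
```

A fixed window keeps the state to two counters per CPU; a decaying
average would react more smoothly but needs more tuning.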
> But your patch still seems to make sense for the case you described:
> when
> ksoftirqd is voluntarily preempted off and the current IRQ could
> handle the
> queue.