[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <53F6F14B.1030609@redhat.com>
Date: Fri, 22 Aug 2014 15:29:15 +0800
From: Jason Wang <jasowang@...hat.com>
To: Mike Galbraith <umgwanakikbuti@...il.com>
CC: davem@...emloft.net, netdev@...r.kernel.org,
linux-kernel@...r.kernel.org, mst@...hat.com,
Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...e.hu>
Subject: Re: [PATCH net-next 2/2] net: exit busy loop when another process
is runnable
On 08/22/2014 01:01 PM, Mike Galbraith wrote:
> On Thu, 2014-08-21 at 16:05 +0800, Jason Wang wrote:
>> > Rx busy loop does not scale well in the case when several parallel
>> > sessions is active. This is because we keep looping even if there's
>> > another process is runnable. For example, if that process is about to
>> > send packet, keep busy polling in current process will brings extra
>> > delay and damage the performance.
>> >
>> > This patch solves this issue by exiting the busy loop when there's
>> > another process is runnable in current cpu. Simple test that pin two
>> > netperf sessions in the same cpu in receiving side shows obvious
>> > improvement:
> That patch says to me it's a bad idea to spin when someone (anyone) else
> can get some work done on a CPU, which intuitively makes sense. But..
>
> (ponders net goop: with silly 1 byte ping-pong load, throughput is bound
> by fastpath latency, net plus sched plus fixable nohz and governor crud
> if not polling, so you can't get a lot of data moved byte at a time no
> matter how sexy the pipe whether polling or not due to bound. If OTOH
> net hardware is a blazing fast large bore packet cannon, net overhead
> per unit payload drops, sched+crud is a constant)
Polling could be done by either rx busy loop in process context or NAPI
in softirq. Rx busy loop may only spin and poll when no packet were
found in socket receive queue. It spins in the hope that at least one
packet will come (in this case the process will exit rx busy loop) in a
short while. In this way, it eliminates the overheads of NAPI, wakeup
and scheduling. This patch just make the busy polling less aggressive:
Since the process finds nothing to receive when still spinning in this
loop, there's no need to waste cpu cycles ( or even call cpu_relax()) if
there's another work could be done by current CPU.
For stream workload like you mentioned here, if the card was fast
enough, the socket receive queue was not easy to be drained. Rx busy
loop won't help or even won't be triggered in this case.
>
> Seems the only time it's a good idea to poll is if blasting big packets
> on sexy hardware, and if you're doing that, you want to poll regardless
> of whether somebody else is waiting, or?
NAPI will work instead of rx busy loop in this case. It will poll and
try to drain nic's rx ring in softirq regardless somebody else.
Btw, current rx busy loop does not perform well on stream workload since
it bypasses GRO to reduce latency. But this issue beyond the scope of
this patch.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists