Message-ID: <1463684190.18194.228.camel@edumazet-glaptop3.roam.corp.google.com>
Date: Thu, 19 May 2016 11:56:30 -0700
From: Eric Dumazet <eric.dumazet@...il.com>
To: Alexander Duyck <alexander.duyck@...il.com>
Cc: netdev <netdev@...r.kernel.org>,
Alexander Duyck <aduyck@...antis.com>
Subject: Re: [RFC] net: remove busylock
On Thu, 2016-05-19 at 11:03 -0700, Alexander Duyck wrote:
> On Thu, May 19, 2016 at 10:08 AM, Eric Dumazet <eric.dumazet@...il.com> wrote:
> > busylock was added at the time we had expensive ticket spinlocks
> >
> > (commit 79640a4ca6955e3ebdb7038508fa7a0cd7fa5527 ("net: add additional
> > lock to qdisc to increase throughput"))
> >
> > Now that kernel spinlocks are MCS, this busylock thing is no longer
> > relevant. It is actually slowing things down a bit.
> >
> >
> > With an HTB qdisc, here are the numbers for 200 concurrent TCP_RR tests, on a host with 48 hyperthreads.
> >
> > lpaa5:~# sar -n DEV 4 4 |grep eth0
> > 10:05:44 eth0 798951.25 798951.75 52276.22 52275.26 0.00 0.00 0.50
> > 10:05:48 eth0 798576.00 798572.75 52251.24 52250.39 0.00 0.00 0.75
> > 10:05:52 eth0 798746.00 798748.75 52262.89 52262.13 0.00 0.00 0.50
> > 10:05:56 eth0 798303.25 798291.50 52235.22 52233.10 0.00 0.00 0.50
> > Average: eth0 798644.12 798641.19 52256.39 52255.22 0.00 0.00 0.56
> >
> > Disabling busylock (by using a local sysctl)
> >
> > lpaa5:~# sar -n DEV 4 4 |grep eth0
> > 10:05:14 eth0 864085.75 864091.50 56538.09 56537.46 0.00 0.00 0.50
> > 10:05:18 eth0 864734.75 864729.25 56580.35 56579.05 0.00 0.00 0.75
> > 10:05:22 eth0 864366.00 864361.50 56556.74 56555.00 0.00 0.00 0.50
> > 10:05:26 eth0 864246.50 864248.75 56549.19 56547.65 0.00 0.00 0.50
> > Average: eth0 864358.25 864357.75 56556.09 56554.79 0.00 0.00 0.56
> >
> > That would be an 8 % increase.
>
> The main point of the busy lock is to deal with the bulk throughput
> case, not the latency case which would be relatively well behaved.
> The problem wasn't really related to lock bouncing slowing things
> down. It was the fairness between the threads that was killing us
> because the dequeue needs to have priority.
>
> The main problem that the busy lock solved was the fact that you could
> start a number of stream tests equal to the number of CPUs in a given
> system and the performance would drop off a cliff: you would drop
> almost all the packets for almost all the streams, because the qdisc
> never had a chance to drain. It would see (number of CPUs - 1)
> enqueues for every 1 dequeue.
>
> What we need if we are going to get rid of busy lock would be some
> sort of priority locking mechanism that would allow the dequeue thread
> to jump to the head of the line if it is attempting to take the lock.
> Otherwise you end up spending all your time enqueuing packets into
> oblivion because the qdiscs just overflow without the busy lock in
> place.
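To recap what busylock does, the pattern in __dev_xmit_skb() boils down
to something like the following. This is a simplified user-space sketch
with made-up names (fake_qdisc, fake_enqueue), not the actual kernel
code:

/*
 * Sketch of the busylock idea: when another thread is already running
 * the qdisc xmit loop, an enqueuing thread first takes a secondary
 * "busylock", so at most one enqueuer at a time competes with the
 * dequeue loop for the root lock.
 */
#include <pthread.h>
#include <stdbool.h>

struct fake_qdisc {
    pthread_spinlock_t root_lock;   /* protects the queue itself */
    pthread_spinlock_t busylock;    /* serializes extra enqueuers */
    bool running;                   /* set by the thread draining the queue */
    int qlen;
};

static void fake_enqueue(struct fake_qdisc *q)
{
    bool contended = __atomic_load_n(&q->running, __ATOMIC_ACQUIRE);

    if (contended)
        pthread_spin_lock(&q->busylock);

    pthread_spin_lock(&q->root_lock);
    q->qlen++;                      /* "enqueue" one packet */
    pthread_spin_unlock(&q->root_lock);

    if (contended)
        pthread_spin_unlock(&q->busylock);
}

With the old ticket spinlocks, the enqueuers kept out of the way by
busylock would otherwise all spin on the root lock word and slow down
the lock holder.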
Removing busylock helped in all cases I tested (at least on x86, as
David pointed out).
As I said, we need to revisit busylock now that spinlocks are different.
In one case (20 concurrent UDP netperf), I even got a 500 % increase.
With busylock :
lpaa5:~# sar -n DEV 4 4|grep eth0
11:33:34 eth0 9.00 115057.00 1.60 38426.92 0.00 0.00 0.50
11:33:38 eth0 13.50 113237.75 2.04 37819.69 0.00 0.00 0.75
11:33:42 eth0 13.50 111492.25 1.76 37236.58 0.00 0.00 0.75
11:33:46 eth0 12.75 111401.50 2.40 37205.93 0.00 0.00 0.75
Average: eth0 12.19 112797.12 1.95 37672.28 0.00 0.00 0.69
Packets are dropped in HTB because we hit a limit of 1000 packets there
- 100.00% netperf [kernel.kallsyms] [k] kfree_skb
- kfree_skb
- 100.00% htb_enqueue
__dev_queue_xmit
dev_queue_xmit
ip_finish_output2
ip_finish_output
ip_output
ip_local_out
+ ip_send_skb
Presumably it would tremendously help if the actual kfree_skb()
was done after the qdisc lock is released, i.e. not from the
qdisc->enqueue() method.
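To illustrate the idea (hypothetical names, nothing that exists in the
tree today): the enqueue path would only unlink the dropped packet onto
a caller-provided list, and the caller would free that list once the
lock is released. A toy user-space sketch:

#include <pthread.h>
#include <stdlib.h>

struct pkt {
    struct pkt *next;
    /* payload would go here */
};

struct toy_queue {
    pthread_spinlock_t lock;
    struct pkt *head;
    int qlen;
    int limit;
};

/* Called with q->lock held. On overflow, chain the packet onto
 * *to_free instead of freeing it while holding the lock. */
static int toy_enqueue(struct toy_queue *q, struct pkt *p,
                       struct pkt **to_free)
{
    if (q->qlen >= q->limit) {
        p->next = *to_free;
        *to_free = p;
        return -1;
    }
    p->next = q->head;
    q->head = p;
    q->qlen++;
    return 0;
}

static void toy_xmit(struct toy_queue *q, struct pkt *p)
{
    struct pkt *to_free = NULL;

    pthread_spin_lock(&q->lock);
    toy_enqueue(q, p, &to_free);
    pthread_spin_unlock(&q->lock);

    /* The expensive frees no longer extend the locked section. */
    while (to_free) {
        struct pkt *next = to_free->next;
        free(to_free);
        to_free = next;
    }
}

That keeps the locked section short even when the qdisc is overflowing,
which is exactly the situation in the profile above.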
Without busylock :
lpaa5:~# sar -n DEV 4 4|grep eth0
11:41:12 eth0 11.00 669053.50 1.99 223452.30 0.00 0.00 0.75
11:41:16 eth0 8.50 669513.25 2.27 223605.55 0.00 0.00 0.75
11:41:20 eth0 3.50 669426.50 0.90 223577.19 0.00 0.00 0.50
11:41:24 eth0 8.25 669284.00 1.42 223529.79 0.00 0.00 0.50
Average: eth0 7.81 669319.31 1.65 223541.21 0.00 0.00 0.62
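For those wondering why queued (MCS-style) spinlocks change the
picture: each waiter spins on its own node and the lock is handed off
in FIFO order, so the dequeue path gets the lock within one round of
waiters and the lock holder is not slowed down by everyone hammering
the lock word. A minimal sketch of the MCS idea (textbook MCS, not the
kernel's qspinlock code):

#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct mcs_node {
    _Atomic(struct mcs_node *) next;
    atomic_bool locked;
};

struct mcs_lock {
    _Atomic(struct mcs_node *) tail;
};

static void mcs_acquire(struct mcs_lock *l, struct mcs_node *me)
{
    struct mcs_node *prev;

    atomic_store(&me->next, NULL);
    atomic_store(&me->locked, true);

    /* Join the FIFO queue of waiters. */
    prev = atomic_exchange(&l->tail, me);
    if (prev) {
        atomic_store(&prev->next, me);
        /* Spin on our own node only: no shared cacheline bouncing. */
        while (atomic_load(&me->locked))
            ;
    }
}

static void mcs_release(struct mcs_lock *l, struct mcs_node *me)
{
    struct mcs_node *next = atomic_load(&me->next);

    if (!next) {
        /* No known successor: try to swing the tail back to NULL. */
        struct mcs_node *expected = me;
        if (atomic_compare_exchange_strong(&l->tail, &expected, NULL))
            return;
        /* A new waiter is still linking itself in; wait for it. */
        while (!(next = atomic_load(&me->next)))
            ;
    }
    atomic_store(&next->locked, false);  /* hand the lock over */
}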