[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20160714234207.GA93671@ast-mbp.thefacebook.com>
Date: Thu, 14 Jul 2016 16:42:08 -0700
From: Alexei Starovoitov <alexei.starovoitov@...il.com>
To: John Fastabend <john.fastabend@...il.com>
Cc: fw@...len.de, jhs@...atatu.com, eric.dumazet@...il.com,
brouer@...hat.com, netdev@...r.kernel.org
Subject: Re: [RFC PATCH v2 08/10] net: sched: pfifo_fast use alf_queue
On Wed, Jul 13, 2016 at 11:23:12PM -0700, John Fastabend wrote:
> This converts the pfifo_fast qdisc to use the alf_queue enqueue and
> dequeue routines then sets the NOLOCK bit.
>
> This also removes the logic used to pick the next band to dequeue from
> and instead just checks each alf_queue for packets from top priority
> to lowest. This might need to be a bit more clever but seems to work
> for now.
>
> Signed-off-by: John Fastabend <john.r.fastabend@...el.com>
> ---
> net/sched/sch_generic.c | 131 +++++++++++++++++++++++++++--------------------
> static int pfifo_fast_enqueue(struct sk_buff *skb, struct Qdisc *qdisc,
> struct sk_buff **to_free)
> {
> - return qdisc_drop(skb, qdisc, to_free);
> + err = skb_array_produce_bh(q, skb);
..
> static struct sk_buff *pfifo_fast_dequeue(struct Qdisc *qdisc)
> {
> + skb = skb_array_consume_bh(q);
For this particular qdisc the performance gain should come from
granularityof spin_lock, right?
Before we were taking the lock much earlier. Here we keep the lock,
but for the very short time.
original pps lockless diff
1 1418168 1269450 -148718
2 1587390 1553408 -33982
4 1084961 1683639 +598678
8 989636 1522723 +533087
12 1014018 1348172 +334154
so perf for 1 cpu case is mainly due to array vs list,
since number of locks is still the same and there is no collision ?
but then why shorter lock give higher overhead in multi cpu cases?
That would have been the main target for performance improvement?
Looks like mq gets the most benefit, because it's lockless internally
which makes sense.
In general I think this is the right direction where tc infra should move to.
I'm only not sure whether it's worth converting pfifo to skb_array.
Probably alf queue would have been a better alternative.
Powered by blists - more mailing lists