Message-ID: <1339668471.22704.714.camel@edumazet-glaptop>
Date: Thu, 14 Jun 2012 12:07:51 +0200
From: Eric Dumazet <eric.dumazet@...il.com>
To: David Miller <davem@...emloft.net>
Cc: jhautbois@...il.com, netdev@...r.kernel.org
Subject: Re: Regression on TX throughput when using bonding
On Thu, 2012-06-14 at 03:00 -0700, David Miller wrote:
> From: Eric Dumazet <eric.dumazet@...il.com>
> Date: Thu, 14 Jun 2012 11:50:17 +0200
>
> > On Thu, 2012-06-14 at 11:22 +0200, Eric Dumazet wrote:
> >
> >> So you are saying that if you make skb_orphan_try() do nothing, it
> >> solves your problem?
> >
> > It probably does, if your application does a UDP flood, trying to send
> > more than the link bandwidth. I guess only benchmark workloads ever try
> > to do that.
>
> Eric, I just want to point out that back when this early orphaning
> idea was being proposed I warned about this, and specifically I
> mentioned that, for datagram sockets, the socket send buffer limits
> are what provide proper rate control and fairness.

If I remember well, the argument was that if a workload was using thousands
of sockets, the per-socket limit on in-flight packets would not save
you anyway. We would drop packets.

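The arithmetic behind that argument can be sketched as follows. This is only a back-of-the-envelope model: the send buffer size, the per-packet charge and the queue length are illustrative assumptions, not values taken from this thread, and the function names are hypothetical.

```python
def max_in_flight_packets(num_sockets, sndbuf_bytes, charge_per_packet):
    """Upper bound on packets that all sockets together may have queued."""
    per_socket = sndbuf_bytes // charge_per_packet
    return num_sockets * per_socket


def overflows_queue(num_sockets, sndbuf_bytes, charge_per_packet, queue_len):
    """True if the per-socket limits alone cannot prevent queue overflow."""
    in_flight = max_in_flight_packets(num_sockets, sndbuf_bytes, charge_per_packet)
    return in_flight > queue_len


# Assumed values, for illustration only.
SNDBUF = 212992      # bytes of send buffer per socket
CHARGE = 2304        # bytes charged to the socket per queued packet
QUEUE_LEN = 1000     # packets the device queue can hold

for sockets in (1, 1000):
    print(sockets, "socket(s):",
          max_in_flight_packets(sockets, SNDBUF, CHARGE), "packets may be in flight,",
          "overflow possible" if overflows_queue(sockets, SNDBUF, CHARGE, QUEUE_LEN)
          else "queue cannot overflow")
```

With a single socket the send buffer limit keeps the queue from filling; with many sockets the sum of the per-socket limits exceeds the queue length, so packets can be dropped whether or not the skbs are orphaned early.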
> It also, therefore, protects the system from one datagram spammer
> being able to essentially take over the network interface and block
> out all other users.
>
> Early orphaning breaks this completely.
>
> I guess we decided that moving an atomic operation earlier is worth
> all of this?

It was, but with BQL we should have far fewer packets in TX rings, so it
might be different today (on BQL-enabled NICs only).

>
> Now we are so addicted to the increased performance from early
> orphaning that I fear we'll never be allowed back into that sane
> state of affairs ever again.

Bonding (or other virtual devices) is special in the sense that
dev_hard_start_xmit() is called twice: once for the virtual device and
once for the real device.

We should have a way to properly park packets in Qdiscs, and only do the
orphaning once the skb is given to the real device for 'immediate or so'
transmission.

The pppoe thread is only another manifestation of the same problem.
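
The difference between orphaning at the virtual device and orphaning only once the packet leaves the real device's queue can be sketched with a toy userspace model. This is not kernel code: the function, its parameters and all numbers are hypothetical, and every packet is assumed to charge one unit against the socket send buffer.

```python
from collections import deque


def simulate(orphan_early, ticks=100, attempts_per_tick=10, drain_per_tick=1,
             sndbuf=20, queue_limit=50):
    """Toy model of one datagram sender flooding a slow link.

    orphan_early=True  : the send buffer charge is released before the packet
                         is queued (as when orphaning at the virtual device).
    orphan_early=False : the charge is released when the packet leaves the
                         queue (handed to the real device).
    """
    charged = 0          # units currently charged to the socket
    queue = deque()      # stands in for the real device's qdisc
    sent = dropped = throttled = 0

    for _ in range(ticks):
        for _ in range(attempts_per_tick):
            if charged >= sndbuf:
                throttled += 1          # sender would block or get EAGAIN
                continue
            charged += 1
            owned = True
            if orphan_early:
                charged -= 1            # socket no longer accounts for it
                owned = False
            if len(queue) >= queue_limit:
                dropped += 1            # queue full: packet is lost
                if owned:
                    charged -= 1
                continue
            queue.append(owned)

        for _ in range(drain_per_tick):
            if queue:
                if queue.popleft():
                    charged -= 1        # late orphan: release on transmit
                sent += 1

    return {"sent": sent, "dropped": dropped, "throttled": throttled}


print("early orphan:", simulate(orphan_early=True))
print("late orphan: ", simulate(orphan_early=False))
```

In this model both variants transmit the same number of packets, since the link is the bottleneck. With early orphaning the sender is never throttled and the excess is dropped at the queue; with late orphaning the send buffer limit throttles the sender and nothing is dropped.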