Message-ID: <CAL8zT=huqtqBKzH3DDwid_C8jH16SH=kjYEK6zjxp_spfnLxXA@mail.gmail.com>
Date:	Thu, 14 Jun 2012 16:14:53 +0200
From:	Jean-Michel Hautbois <jhautbois@...il.com>
To:	Eric Dumazet <eric.dumazet@...il.com>
Cc:	netdev <netdev@...r.kernel.org>
Subject: Re: Regression on TX throughput when using bonding

2012/6/14 Jean-Michel Hautbois <jhautbois@...il.com>:
> 2012/6/14 Eric Dumazet <eric.dumazet@...il.com>:
>> On Thu, 2012-06-14 at 11:22 +0200, Eric Dumazet wrote:
>>
>>> So you are saying that if you make skb_orphan_try() do nothing, it
>>> solves your problem?
>>
>> It probably does, if your application does a UDP flood, trying to send
>> more than the link bandwidth. I guess only benchmark workloads ever try
>> to do that.
>>
>> bonding has no way to signal congestion back to the sender: it has no
>> Qdisc by default.
>>
>> We can probably defer skb_orphan_try() for the bonding master, a bit
>> like what is done with IFF_XMIT_DST_RELEASE.
>>
>>  drivers/net/bonding/bond_main.c |    2 +-
>>  include/linux/if.h              |    3 +++
>>  net/core/dev.c                  |    5 +++--
>>  3 files changed, 7 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>> index 2ee8cf9..1b1e9c8 100644
>> --- a/drivers/net/bonding/bond_main.c
>> +++ b/drivers/net/bonding/bond_main.c
>> @@ -4343,7 +4343,7 @@ static void bond_setup(struct net_device *bond_dev)
>>        bond_dev->tx_queue_len = 0;
>>        bond_dev->flags |= IFF_MASTER|IFF_MULTICAST;
>>        bond_dev->priv_flags |= IFF_BONDING;
>> -       bond_dev->priv_flags &= ~(IFF_XMIT_DST_RELEASE | IFF_TX_SKB_SHARING);
>> +       bond_dev->priv_flags &= ~(IFF_XMIT_DST_RELEASE | IFF_TX_SKB_SHARING | IFF_XMIT_ORPHAN);
>>
>>        /* At first, we block adding VLANs. That's the only way to
>>         * prevent problems that occur when adding VLANs over an
>> diff --git a/include/linux/if.h b/include/linux/if.h
>> index f995c66..a788e7b 100644
>> --- a/include/linux/if.h
>> +++ b/include/linux/if.h
>> @@ -81,6 +81,9 @@
>>  #define IFF_UNICAST_FLT        0x20000         /* Supports unicast filtering   */
>>  #define IFF_TEAM_PORT  0x40000         /* device used as team port */
>>  #define IFF_SUPP_NOFCS 0x80000         /* device supports sending custom FCS */
>> +#define IFF_XMIT_ORPHAN        0x100000        /* dev_hard_start_xmit() is allowed to
>> +                                        * orphan skb
>> +                                        */
>>
>>
>>  #define IF_GET_IFACE   0x0001          /* for querying only */
>> diff --git a/net/core/dev.c b/net/core/dev.c
>> index cd09819..3435463 100644
>> --- a/net/core/dev.c
>> +++ b/net/core/dev.c
>> @@ -2193,7 +2193,8 @@ int dev_hard_start_xmit(struct sk_buff *skb, struct net_device *dev,
>>                if (!list_empty(&ptype_all))
>>                        dev_queue_xmit_nit(skb, dev);
>>
>> -               skb_orphan_try(skb);
>> +               if (dev->priv_flags & IFF_XMIT_ORPHAN)
>> +                       skb_orphan_try(skb);
>>
>>                features = netif_skb_features(skb);
>>
>> @@ -5929,7 +5930,7 @@ struct net_device *alloc_netdev_mqs(int sizeof_priv, const char *name,
>>        INIT_LIST_HEAD(&dev->napi_list);
>>        INIT_LIST_HEAD(&dev->unreg_list);
>>        INIT_LIST_HEAD(&dev->link_watch_list);
>> -       dev->priv_flags = IFF_XMIT_DST_RELEASE;
>> +       dev->priv_flags = IFF_XMIT_DST_RELEASE | IFF_XMIT_ORPHAN;
>>        setup(dev);
>>
>>        dev->num_tx_queues = txqs;
>>
>>
>
> It works
For your information:
~# tc -s -d qdisc show dev eth1 > before_tc && sleep 10 && \
   tc -s -d qdisc show dev eth1 > after_tc && ./beforeafter before_tc after_tc
qdisc mq 0: root
 Sent 3185900568 bytes 788681 pkt (dropped 0, overlimits 0 requeues 620)
 backlog 0b 0p requeues 620

As you can see, it sustains about 2.5 Gbit/s over the 10-second sample
without any difficulty :).
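
As a sanity check on that figure, here is a minimal sketch of the arithmetic.
It assumes the counters printed by beforeafter are the delta accumulated over
the 10-second sleep. The helper name rate_gbps is hypothetical and is not part
of tc or beforeafter:

```shell
#!/bin/sh
# Convert a byte-counter delta into an average rate in Gbit/s.
# Usage: rate_gbps BYTES SECONDS
# (rate_gbps is an illustrative helper, not an existing tool.)
rate_gbps() {
    # LC_ALL=C keeps "." as the decimal separator regardless of locale.
    LC_ALL=C awk -v bytes="$1" -v secs="$2" \
        'BEGIN { printf "%.2f\n", bytes * 8 / secs / 1e9 }'
}

# Figures from the tc output above: 3185900568 bytes sent during "sleep 10".
rate_gbps 3185900568 10
```

The interval is only approximately 10 seconds, since the two tc invocations
take some time themselves, so the result should be read as an estimate.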

Thanks,
JM