Message-ID: <CAD3VwcpLXxk_r+4AX47gKdu=5vg7y9PnEdwUeOpSAhOLncqeeg@mail.gmail.com>
Date:   Thu, 20 Jul 2017 19:07:04 -0700
From:   Benjamin Gilbert <benjamin.gilbert@...eos.com>
To:     netdev@...r.kernel.org
Cc:     maheshb@...gle.com
Subject: Bonding driver fails to enable second interface if updelay is non-zero

[resend]

Hello,

Starting with commit de77ecd4ef02ca783f7762e04e92b3d0964be66b, and
through 4.12.2, the bonding driver in 802.3ad mode fails to enable the
second interface on a bond device if updelay is non-zero.  dmesg says:

[   35.825227] bond0: Setting xmit hash policy to layer3+4 (1)
[   35.825259] bond0: Setting MII monitoring interval to 100
[   35.825303] bond0: Setting down delay to 200
[   35.825328] bond0: Setting up delay to 200
[   35.827414] bond0: Adding slave eth0
[   35.949205] bond0: Enslaving eth0 as a backup interface with a down link
[   35.950812] bond0: Adding slave eth1
[   36.073764] bond0: Enslaving eth1 as a backup interface with a down link
[   36.076808] IPv6: ADDRCONF(NETDEV_UP): bond0: link is not ready
[   39.327423] igb 0000:01:00.0 eth0: igb: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
[   39.405580] bond0: link status up for interface eth0, enabling it in 0 ms
[   39.405607] bond0: link status definitely up for interface eth0, 1000 Mbps full duplex
[   39.405608] bond0: Warning: No 802.3ad response from the link partner for any adapters in the bond
[   39.405613] bond0: first active interface up!
[   39.406186] IPv6: ADDRCONF(NETDEV_CHANGE): bond0: link becomes ready
[   39.551391] igb 0000:01:00.1 eth1: igb: eth1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
[   39.613590] bond0: link status up for interface eth1, enabling it in 200 ms
[   39.717575] bond0: link status up for interface eth1, enabling it in 200 ms
[   39.821395] bond0: link status up for interface eth1, enabling it in 200 ms
[   39.925584] bond0: link status up for interface eth1, enabling it in 200 ms
[   40.029288] bond0: link status up for interface eth1, enabling it in 200 ms
[   40.133388] bond0: link status up for interface eth1, enabling it in 200 ms

...and so on every 100 ms.  The bug doesn't trigger 100% reliably, but
can be provoked by removing and re-adding interfaces to the bond via
sysfs.
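
The stuck state can be spotted mechanically in dmesg output. The following is a minimal sketch, not part of the original report: it counts "enabling it in N ms" messages per interface and clears the count when a "definitely up" message follows. The function name `stuck_slaves` and the threshold of 3 are illustrative assumptions, and the regular expressions are written against the message formats quoted above only.

```python
import re

# Matches lines such as:
# [   39.613590] bond0: link status up for interface eth1, enabling it in 200 ms
PENDING = re.compile(
    r"\[\s*\d+\.\d+\]\s+(?P<bond>[^\s:]+): link status up for interface "
    r"(?P<slave>[^\s,]+), enabling it in (?P<delay>\d+) ms"
)
# Matches lines such as:
# [   39.405607] bond0: link status definitely up for interface eth0, ...
DEFINITE = re.compile(
    r"\[\s*\d+\.\d+\]\s+(?P<bond>[^\s:]+): link status definitely up "
    r"for interface (?P<slave>[^\s,]+)"
)


def stuck_slaves(lines, threshold=3):
    """Return {(bond, slave): count} for interfaces that logged at least
    `threshold` pending messages with no later 'definitely up' message.

    `threshold` is an arbitrary illustrative value, not a kernel constant.
    """
    pending = {}
    for line in lines:
        m = PENDING.search(line)
        if m:
            key = (m.group("bond"), m.group("slave"))
            pending[key] = pending.get(key, 0) + 1
            continue
        m = DEFINITE.search(line)
        if m:
            # The interface came up, so forget its pending messages.
            pending.pop((m.group("bond"), m.group("slave")), None)
    return {key: n for key, n in pending.items() if n >= threshold}
```

Feeding it the lines of `dmesg` output (for example via `subprocess.run(["dmesg"], capture_output=True, text=True).stdout.splitlines()`) would report eth1 for the log above, since eth0 reaches "definitely up" and eth1 never does.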

While the problem is occurring, networking appears to be unreliable.
Setting the updelay to 0 fixes it:

[  345.472559] bond0: link status up for interface eth1, enabling it in 200 ms
[  345.576558] bond0: link status up for interface eth1, enabling it in 200 ms
[  345.607614] bond0: Setting up delay to 0
[  345.680396] bond0: link status definitely up for interface eth1, 1000 Mbps full duplex
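
For reference, the two sysfs operations mentioned in this report (removing and re-adding an interface, and setting updelay to 0) can be sketched as below. This is a hedged illustration rather than the exact commands that were run: it assumes the standard bonding sysfs files `/sys/class/net/<bond>/bonding/slaves` and `/sys/class/net/<bond>/bonding/updelay`, must run as root on a real system, and the helper names and the `root` parameter (there so the code can be pointed at a scratch directory) are hypothetical.

```python
from pathlib import Path

# Standard location of per-device network attributes in sysfs.
SYSFS_NET = Path("/sys/class/net")


def write_bond_attr(bond, attr, value, root=SYSFS_NET):
    """Write `value` to <root>/<bond>/bonding/<attr> and return the path."""
    path = Path(root) / bond / "bonding" / attr
    path.write_text(f"{value}\n")
    return path


def readd_slave(bond, slave, root=SYSFS_NET):
    """Remove `slave` from `bond`, then add it back.

    The bonding driver's `slaves` file accepts "-<ifname>" to release an
    interface and "+<ifname>" to enslave it.
    """
    write_bond_attr(bond, "slaves", f"-{slave}", root)
    write_bond_attr(bond, "slaves", f"+{slave}", root)


def clear_updelay(bond, root=SYSFS_NET):
    """Set the bond's updelay to 0 ms (the workaround described above)."""
    write_bond_attr(bond, "updelay", 0, root)
```

On the affected machine this would be used as `readd_slave("bond0", "eth1")` to try to provoke the problem and `clear_updelay("bond0")` to apply the workaround.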

I'd be happy to provide further details or to test patches.

--Benjamin Gilbert
