Message-ID: <4074.1175207458@death>
Date:	Thu, 29 Mar 2007 15:30:58 -0700
From:	Jay Vosburgh <fubar@...ibm.com>
To:	"Chris Friesen" <cfriesen@...tel.com>
cc:	Andy Gospodarek <andy@...yhouse.net>, netdev@...r.kernel.org,
	bonding-devel@...ts.sourceforge.net
Subject: Re: [Bonding-devel] quick help with bonding? 


Chris Friesen <cfriesen@...tel.com> wrote:
[...]
>I have a ppc64 blade running a customized 2.6.10.  At init time, two of
>our gigE links (eth4 and eth5) are bonded together to form bond0.  This
>link has an MTU of 9000, and uses arp monitoring.  We're using an ethernet
>driver with a modified RX path for jumbo frames[1].  With the stock
>driver, it seems to work fine.

	2.6.10 is pretty old, and there have been a number of fixes to
the bonding ARP monitor since then, so it may be that it is simply
misbehaving (presuming that you're running the 2.6.10 bonding driver).
Are you in a position to test against a more recent kernel (and/or
bonding driver)?  Does the MII monitor (miimon) misbehave in a similar
fashion?

>The problem is that eth5 seems to be bouncing up and down every 15 sec or
>so (see the attached log excerpt).  Also, "ifconfig" shows that only 3
>packets totalling 250 bytes have gone out eth5, when I know that the arp
>monitoring code from the bond layer is sending 10 arps/sec out the link.
[...]
>Mar 29 20:54:08 base0-0-0-5-0-11-1 kernel: bonding: bond0: scheduling interface eth4 to be reset in 30000 msec.
[...]
>Mar 29 20:54:08 base0-0-0-5-0-11-1 kernel: bonding: bond0: cancelled scheduled reset of interface eth5

	These two messages (which appear a number of times in your log
excerpt) are not from the standard mainline bonding driver, even in
2.6.10.  I don't know what this is all about.

>If I boot the system and then log in and manually create the bond link
>(rather than it happening at init time) then I don't see the problem.

	I would hazard a guess that it's an ARP monitor problem; older
versions of the ARP monitor had less than intelligent means to figure
out what the bond's IP address is (to use for the probes).  This, along
with some logic problems in the monitor code itself, led to various
problems with the ARP probes and the sort of "up / down" cycle of
behavior you seem to be seeing.
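
	One way to confirm the up / down cycle from userspace is to
compare the per-slave "Link Failure Count" in /proc/net/bonding/bond0
across two snapshots.  The following is a minimal sketch, not a
definitive tool: the sample text and the helper names (parse_slaves,
flapping) are hypothetical, the field layout is the one printed by more
recent bonding drivers, and the 2.6.10 output may differ.

```python
# Illustrative sample only: the layout mimics /proc/net/bonding/bond0 on
# recent kernels. The values are made up, and 2.6.10 output may differ.
SAMPLE = """\
Bonding Mode: fault-tolerance (active-backup)
Currently Active Slave: eth4
MII Status: up
MII Polling Interval (ms): 0
ARP Polling Interval (ms): 100

Slave Interface: eth4
MII Status: up
Link Failure Count: 0

Slave Interface: eth5
MII Status: down
Link Failure Count: 12
"""


def parse_slaves(text):
    """Return {slave_name: {"status": str, "failures": int}}."""
    slaves = {}
    current = None  # stays None until the first "Slave Interface" line
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank line or a line without "key: value"
        key, value = key.strip(), value.strip()
        if key == "Slave Interface":
            current = slaves.setdefault(value, {"status": None, "failures": 0})
        elif current is not None and key == "MII Status":
            current["status"] = value
        elif current is not None and key == "Link Failure Count":
            current["failures"] = int(value)
    return slaves


def flapping(before, after):
    """Return the slaves whose failure count rose between two snapshots."""
    return sorted(
        name
        for name, info in after.items()
        if info["failures"] > before.get(name, {"failures": 0})["failures"]
    )
```

	On a live system the two snapshots would come from reading
/proc/net/bonding/bond0 twice, some seconds apart, and passing both
parsed results to flapping().  A slave that keeps reappearing in the
result is bouncing.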

	-J

---
	-Jay Vosburgh, IBM Linux Technology Center, fubar@...ibm.com