Date: Thu, 29 Mar 2007 15:30:58 -0700
From: Jay Vosburgh <fubar@...ibm.com>
To: "Chris Friesen" <cfriesen@...tel.com>
cc: Andy Gospodarek <andy@...yhouse.net>, netdev@...r.kernel.org, bonding-devel@...ts.sourceforge.net
Subject: Re: [Bonding-devel] quick help with bonding?

Chris Friesen <cfriesen@...tel.com> wrote:
[...]
>I have a ppc64 blade running a customized 2.6.10. At init time, two of
>our gigE links (eth4 and eth5) are bonded together to form bond0. This
>link has an MTU of 9000, and uses arp monitoring. We're using an ethernet
>driver with a modified RX path for jumbo frames[1]. With the stock
>driver, it seems to work fine.

2.6.10 is pretty old, and there have been a number of fixes to the
bonding ARP monitor since then, so it may be that it is simply
misbehaving (presuming that you're running the 2.6.10 bonding driver).

Are you in a position to test against a more recent kernel
(and/or bonding driver)?

Does the miimon misbehave in a similar fashion?

>The problem is that eth5 seems to be bouncing up and down every 15 sec or
>so (see the attached log excerpt). Also, "ifconfig" shows that only 3
>packets totalling 250 bytes have gone out eth5, when I know that the arp
>monitoring code from the bond layer is sending 10 arps/sec out the link.
[...]
>Mar 29 20:54:08 base0-0-0-5-0-11-1 kernel: bonding: bond0: scheduling interface eth4 to be reset in 30000 msec.
[...]
>Mar 29 20:54:08 base0-0-0-5-0-11-1 kernel: bonding: bond0: cancelled scheduled reset of interface eth5

These two messages (which appear a number of times in your log
excerpt) are not from the standard mainline bonding driver, even in
2.6.10. I don't know what this is all about.

>If I boot the system and then log in and manually create the bond link
>(rather than it happening at init time) then I don't see the problem.

I would hazard to guess that it's an ARP monitor problem; older
versions of the ARP monitor had less than intelligent means to figure
out what the bond's IP address is (to use for the probes). This, along
with some logic problems in the monitor code itself, led to various
problems with the ARP probes and the sort of "up / down" cycle of
behavior you seem to be seeing.

	-J

---
	-Jay Vosburgh, IBM Linux Technology Center, fubar@...ibm.com
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
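
When comparing the ARP monitor against miimon as suggested above, it can help to put a number on the "up / down" cycle instead of reading it off the log. The following is only a minimal sketch, not part of the original message. It assumes the bonding driver exposes per-slave status in /proc/net/bonding/bond0 using the "Slave Interface", "MII Status" and "Link Failure Count" field names; a customized driver like the one described may print something different. The function names are hypothetical.

```python
def parse_bonding_status(text):
    """Parse bonding status text into {slave: {"mii_status", "link_failures"}}.

    Assumes "Key: value" lines, with each slave section introduced by a
    "Slave Interface:" line (an assumption about the driver's output).
    """
    slaves = {}
    current = None  # stays None while we are in the bond-wide header section
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank line or a line without "Key: value" form
        key, value = key.strip(), value.strip()
        if key == "Slave Interface":
            current = slaves.setdefault(value, {})
        elif current is not None and key == "MII Status":
            current["mii_status"] = value
        elif current is not None and key == "Link Failure Count":
            current["link_failures"] = int(value)
    return slaves


def read_bond(name="bond0"):
    """Read and parse the status file for one bond (path is an assumption)."""
    with open("/proc/net/bonding/" + name) as f:
        return parse_bonding_status(f.read())
```

Calling read_bond() twice, some seconds apart, and comparing the link_failures value for eth5 would show whether the slave is still flapping under each monitor.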