Message-ID: <49A972B3.8020309@krogh.cc>
Date: Sat, 28 Feb 2009 18:21:55 +0100
From: Jesper Krogh <jesper@...gh.cc>
To: Jay Vosburgh <fubar@...ibm.com>
CC: Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
Jeff Garzik <jgarzik@...hat.com>, aowi@...ozymes.com
Subject: Re: Regression in bonding between 2.6.26.8 and 2.6.27.6 - bisected
Jay Vosburgh wrote:
> Jesper Krogh <jesper@...gh.cc> wrote:
>
>> Jay Vosburgh wrote:
>>> Jesper Krogh <jesper@...gh.cc> wrote:
>>> [...]
>>>> The offending commit seems to be:
>>>>
>>>> A test with a fresh 2.6.29-rc6 revealed that the problem has been fixed
>>>> subsequently, but it still exists in 2.6.27-newest (I haven't tested
>>>> 2.6.28-newest yet).
>>>>
>>>> Any ideas of what the "fixing" commit is .. or should that also be
>>>> bisected?
>>> I went back and looked at your earlier mail. Since you're using
>>> 802.3ad mode, my first guess would be this commit:
>>>
>>> commit fd989c83325cb34795bc4d4aa6b13c06f90eac99
>>> Author: Jay Vosburgh <fubar@...ibm.com>
>>> Date: Tue Nov 4 17:51:16 2008 -0800
>>>
>>> bonding: alternate agg selection policies for 802.3ad
>> That didn't do it. I applied it to 2.6.27.19, but that didn't make it work.
>> dmesg | grep bond (2.6.27.19 + above patch):
>
> That was the only real functional change to 802.3ad, there are a
> lot of other commits, but they're all style or cleanup sorts of things.
>
>> [ 13.643301] bonding: MII link monitoring set to 100 ms
>> [ 13.730455] bonding: bond0: enslaving eth0 as a backup interface with an up link.
>> [ 13.781934] bonding: bond0: enslaving eth1 as a backup interface with an up link.
>> [ 13.904665] bonding: bond0: enslaving eth2 as a backup interface with a down link.
>> [ 16.945264] bonding: bond0: link status definitely up for interface eth2.
>> [ 75.040290] bond0: no IPv6 routers present
>>
>> dmesg | grep bond (2.6.29-rc6)
>>
>> $ ssh quad02 dmesg | grep bond
>> [ 27.437877] bonding: MII link monitoring set to 100 ms
>> [ 27.445246] ADDRCONF(NETDEV_UP): bond0: link is not ready
>> [ 27.493260] bonding: bond0: enslaving eth0 as a backup interface with a down link.
>> [ 27.521397] bonding: bond0: enslaving eth1 as a backup interface with a down link.
>> [ 27.542332] bonding: bond0: Warning: No 802.3ad response from the link partner for any adapters in the bond
>> [ 27.611509] bonding: bond0: enslaving eth2 as a backup interface with a down link.
>> [ 27.617017] ADDRCONF(NETDEV_CHANGE): bond0: link becomes ready
>> [ 27.642330] bonding: bond0: Warning: No 802.3ad response from the link partner for any adapters in the bond
>> [ 30.042501] bonding: bond0: link status definitely up for interface eth1.
>> [ 30.142505] bonding: bond0: link status definitely up for interface eth0.
>> [ 30.742547] bonding: bond0: link status definitely up for interface eth2.
>> [ 37.875044] bond0: no IPv6 routers present
>>
>> I just tested 2.6.28.7; it's still broken. So the fix probably has to be
>> somewhere in the post-2.6.28 changes.
>
> It looks like the above two tests are on different machines, or
> were at least done with different network cards. Is that the case?
There are 12 Sun Fire X2200s in the rack, and they are fully identical
(apart from a small difference in memory configuration on some of them).
So yes, different machines, but the same hardware (bought in the same
shipment, etc.).
> I'm just wondering if what you're seeing is somehow tied to the
> network devices' respective autonegotiation speeds, or some difference
> in the device drivers. The first dmesg looks to have one slow (3 sec)
> and two fast ones; the second dmesg looks to have all slow devices.
>
> Have you tried the kernels the other way around (the first
> kernel on the second machine, and vice versa)?
Yes, I've picked a machine at random from the set for each test, and they
all turn out as "predicted".
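For comparing machines, the "slow vs. fast" link timing can be read off the
dmesg lines directly: it is the gap between "enslaving ethX" and "link status
definitely up for interface ethX". A minimal sketch in Python, assuming the
log format stays exactly as in the two dmesg excerpts quoted earlier in this
mail. The names link_up_delays, ENSLAVE and LINK_UP are made up for this
illustration and are not part of the bonding driver or any tool.

```python
import re

# Patterns assume the dmesg format quoted in this thread (unwrapped lines).
ENSLAVE = re.compile(
    r"\[\s*(\d+\.\d+)\] bonding: \S+ enslaving (\S+) "
    r"as a backup interface with an? (up|down) link"
)
LINK_UP = re.compile(
    r"\[\s*(\d+\.\d+)\] bonding: \S+ link status definitely up "
    r"for interface (\S+?)\.?$"
)


def link_up_delays(lines):
    """Return {interface: seconds from enslave to link-up}.

    Interfaces enslaved with the link already up get a delay of 0.0.
    Interfaces that never report link-up are left out of the result.
    """
    pending = {}  # interface -> timestamp at which it was enslaved (link down)
    delays = {}
    for line in lines:
        line = line.strip()
        m = ENSLAVE.search(line)
        if m:
            stamp, iface, state = float(m.group(1)), m.group(2), m.group(3)
            if state == "up":
                delays[iface] = 0.0
            else:
                pending[iface] = stamp
            continue
        m = LINK_UP.search(line)
        if m and m.group(2) in pending:
            iface = m.group(2)
            delays[iface] = round(float(m.group(1)) - pending.pop(iface), 3)
    return delays


# Lines copied from the two dmesg excerpts quoted in this mail.
PATCHED_2_6_27_19 = [
    "[ 13.730455] bonding: bond0: enslaving eth0 as a backup interface with an up link.",
    "[ 13.781934] bonding: bond0: enslaving eth1 as a backup interface with an up link.",
    "[ 13.904665] bonding: bond0: enslaving eth2 as a backup interface with a down link.",
    "[ 16.945264] bonding: bond0: link status definitely up for interface eth2.",
]
KERNEL_2_6_29_RC6 = [
    "[ 27.493260] bonding: bond0: enslaving eth0 as a backup interface with a down link.",
    "[ 27.521397] bonding: bond0: enslaving eth1 as a backup interface with a down link.",
    "[ 27.611509] bonding: bond0: enslaving eth2 as a backup interface with a down link.",
    "[ 30.042501] bonding: bond0: link status definitely up for interface eth1.",
    "[ 30.142505] bonding: bond0: link status definitely up for interface eth0.",
    "[ 30.742547] bonding: bond0: link status definitely up for interface eth2.",
]

if __name__ == "__main__":
    print("2.6.27.19 + patch:", link_up_delays(PATCHED_2_6_27_19))
    print("2.6.29-rc6:       ", link_up_delays(KERNEL_2_6_29_RC6))
```

Run against `dmesg | grep bond` output from each machine, this gives one
number per slave, which makes it easier to see whether the difference follows
the kernel or the box.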
> I'll compile 2.6.28.7 here and see if it works for me.
Jesper