lists.openwall.net | lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC | |
Open Source and information security mailing list archives
| ||
|
Date: Mon, 7 Jan 2008 18:57:25 +0100 (CET) From: Krzysztof Oledzki <olel@....pl> To: Andy Gospodarek <andy@...yhouse.net> cc: Jay Vosburgh <fubar@...ibm.com>, Herbert Xu <herbert@...dor.apana.org.au>, Andrew Morton <akpm@...ux-foundation.org>, bugme-daemon@...zilla.kernel.org, shemminger@...ux-foundation.org, davem@...emloft.net, netdev@...r.kernel.org Subject: Re: [Bugme-new] [Bug 9543] New: RTNL: assertion failed at net/ipv6/addrconf.c (2164)/RTNL: assertion failed at net/ipv4/devinet.c (1055) On Wed, 19 Dec 2007, Andy Gospodarek wrote: > On Tue, Dec 18, 2007 at 08:53:39PM +0100, Krzysztof Oledzki wrote: >> >> >> On Fri, 14 Dec 2007, Andy Gospodarek wrote: >> >>> On Fri, Dec 14, 2007 at 07:57:42PM +0100, Krzysztof Oledzki wrote: >>>> >>>> >>>> On Fri, 14 Dec 2007, Andy Gospodarek wrote: >>>> >>>>> On Fri, Dec 14, 2007 at 05:14:57PM +0100, Krzysztof Oledzki wrote: >>>>>> >>>>>> >>>>>> On Wed, 12 Dec 2007, Jay Vosburgh wrote: >>>>>> >>>>>>> Herbert Xu <herbert@...dor.apana.org.au> wrote: >>>>>>> >>>>>>>>> diff -puN drivers/net/bonding/bond_sysfs.c~bonding-locking-fix >>>>>>>>> drivers/net/bonding/bond_sysfs.c >>>>>>>>> --- a/drivers/net/bonding/bond_sysfs.c~bonding-locking-fix >>>>>>>>> +++ a/drivers/net/bonding/bond_sysfs.c >>>>>>>>> @@ -1111,8 +1111,6 @@ static ssize_t bonding_store_primary(str >>>>>>>>> out: >>>>>>>>> write_unlock_bh(&bond->lock); >>>>>>>>> >>>>>>>>> - rtnl_unlock(); >>>>>>>>> - >>>>>>>> >>>>>>>> Looking at the changeset that added this perhaps the intention >>>>>>>> is to hold the lock? If so we should add an rtnl_lock to the start >>>>>>>> of the function. >>>>>>> >>>>>>> Yes, this function needs to hold locks, and more than just >>>>>>> what's there now. I believe the following should be correct; I haven't >>>>>>> tested it, though (I'm supposedly on vacation right now). >>>>>>> >>>>>>> The following change should be correct for the >>>>>>> bonding_store_primary case discussed in this thread, and also corrects >>>>>>> the bonding_store_active case which performs similar functions. >>>>>>> >>>>>>> The bond_change_active_slave and bond_select_active_slave >>>>>>> functions both require rtnl, bond->lock for read and curr_slave_lock >>>>>>> for >>>>>>> write_bh, and no other locks. This is so that the lower level >>>>>>> mode-specific functions can release locks down to just rtnl in order to >>>>>>> call, e.g., dev_set_mac_address with the locks it expects (rtnl only). >>>>>>> >>>>>>> Signed-off-by: Jay Vosburgh <fubar@...ibm.com> >>>>>>> >>>>>>> diff --git a/drivers/net/bonding/bond_sysfs.c >>>>>>> b/drivers/net/bonding/bond_sysfs.c >>>>>>> index 11b76b3..28a2d80 100644 >>>>>>> --- a/drivers/net/bonding/bond_sysfs.c >>>>>>> +++ b/drivers/net/bonding/bond_sysfs.c >>>>>>> @@ -1075,7 +1075,10 @@ static ssize_t bonding_store_primary(struct >>>>>>> device >>>>>>> *d, >>>>>>> struct slave *slave; >>>>>>> struct bonding *bond = to_bond(d); >>>>>>> >>>>>>> - write_lock_bh(&bond->lock); >>>>>>> + rtnl_lock(); >>>>>>> + read_lock(&bond->lock); >>>>>>> + write_lock_bh(&bond->curr_slave_lock); >>>>>>> + >>>>>>> if (!USES_PRIMARY(bond->params.mode)) { >>>>>>> printk(KERN_INFO DRV_NAME >>>>>>> ": %s: Unable to set primary slave; %s is in mode >>>>>>> %d\n", >>>>>>> @@ -1109,8 +1112,8 @@ static ssize_t bonding_store_primary(struct >>>>>>> device >>>>>>> *d, >>>>>>> } >>>>>>> } >>>>>>> out: >>>>>>> - write_unlock_bh(&bond->lock); >>>>>>> - >>>>>>> + write_unlock_bh(&bond->curr_slave_lock); >>>>>>> + read_unlock(&bond->lock); >>>>>>> rtnl_unlock(); >>>>>>> >>>>>>> return count; >>>>>>> @@ -1190,7 +1193,8 @@ static ssize_t bonding_store_active_slave(struct >>>>>>> device *d, >>>>>>> struct bonding *bond = to_bond(d); >>>>>>> >>>>>>> rtnl_lock(); >>>>>>> - write_lock_bh(&bond->lock); >>>>>>> + read_lock(&bond->lock); >>>>>>> + write_lock_bh(&bond->curr_slave_lock); >>>>>>> >>>>>>> if (!USES_PRIMARY(bond->params.mode)) { >>>>>>> printk(KERN_INFO DRV_NAME >>>>>>> @@ -1247,7 +1251,8 @@ static ssize_t bonding_store_active_slave(struct >>>>>>> device *d, >>>>>>> } >>>>>>> } >>>>>>> out: >>>>>>> - write_unlock_bh(&bond->lock); >>>>>>> + write_unlock_bh(&bond->curr_slave_lock); >>>>>>> + read_unlock(&bond->lock); >>>>>>> rtnl_unlock(); >>>>>>> >>>>>>> return count; >>>>>> >>>>>> Vanilla 2.6.24-rc5 plus this patch: >>>>>> >>>>>> ========================================================= >>>>>> [ INFO: possible irq lock inversion dependency detected ] >>>>>> 2.6.24-rc5 #1 >>>>>> --------------------------------------------------------- >>>>>> events/0/9 just changed the state of lock: >>>>>> (&mc->mca_lock){-+..}, at: [<c0411c7a>] mld_ifc_timer_expire+0x130/0x1fb >>>>>> but this lock took another, soft-read-irq-unsafe lock in the past: >>>>>> (&bond->lock){-.--} >>>>>> >>>>>> and interrupts could create inverse lock ordering between them. >>>>>> >>>>>> >>>>> >>>>> Grrr, I should have seen that -- sorry. Try your luck with this instead: >>>> <CUT> >>>> >>>> No luck. >>>> >>> >>> >>> I'm guessing if we go back to using a write-lock for bond->lock this >>> will go back to working again, but I'm not totally convinced since there >>> are plenty of places where we used a read-lock with it. >> >> Should I check this patch or rather, based on a future discussion, wait >> for another version? >> >>> >>> diff --git a/drivers/net/bonding/bond_sysfs.c >>> b/drivers/net/bonding/bond_sysfs.c >>> index 11b76b3..635b857 100644 >>> --- a/drivers/net/bonding/bond_sysfs.c >>> +++ b/drivers/net/bonding/bond_sysfs.c >>> @@ -1075,7 +1075,10 @@ static ssize_t bonding_store_primary(struct device >>> *d, >>> struct slave *slave; >>> struct bonding *bond = to_bond(d); >>> >>> + rtnl_lock(); >>> write_lock_bh(&bond->lock); >>> + write_lock_bh(&bond->curr_slave_lock); >>> + >>> if (!USES_PRIMARY(bond->params.mode)) { >>> printk(KERN_INFO DRV_NAME >>> ": %s: Unable to set primary slave; %s is in mode >>> %d\n", >>> @@ -1109,8 +1112,8 @@ static ssize_t bonding_store_primary(struct device >>> *d, >>> } >>> } >>> out: >>> + write_unlock_bh(&bond->curr_slave_lock); >>> write_unlock_bh(&bond->lock); >>> - >>> rtnl_unlock(); >>> >>> return count; >>> @@ -1191,6 +1194,7 @@ static ssize_t bonding_store_active_slave(struct >>> device *d, >>> >>> rtnl_lock(); >>> write_lock_bh(&bond->lock); >>> + write_lock_bh(&bond->curr_slave_lock); >>> >>> if (!USES_PRIMARY(bond->params.mode)) { >>> printk(KERN_INFO DRV_NAME >>> @@ -1247,6 +1251,7 @@ static ssize_t bonding_store_active_slave(struct >>> device *d, >>> } >>> } >>> out: >>> + write_unlock_bh(&bond->curr_slave_lock); >>> write_unlock_bh(&bond->lock); >>> rtnl_unlock(); >>> >> >> >> Best regards, >> >> Krzysztof Olędzki > > For now, I prefer Jay's original patch -- with the read_locks (rather > than read/write_lock_bh) and the added rtnl_lock. There is still a > lockdep issue that we need to sort-out, but this patch is needed first. This bug has not been fixed yet as it still exists in 2.6.24-rc7. Any chances to cure it before 2.6.24-final? Best regards, Krzysztof Olędzki
Powered by blists - more mailing lists