lists.openwall.net | lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening PHC | |
Open Source and information security mailing list archives
| ||
|
Date: Wed, 10 Dec 2008 16:37:46 -0800 From: "Jeff Kirsher" <jeffrey.t.kirsher@...el.com> To: "Andrew Morton" <akpm@...ux-foundation.org> Cc: "Frederik Deweerdt" <frederik.deweerdt@...og.eu>, e1000-devel@...ts.sourceforge.net, netdev@...r.kernel.org, jesse.brandeburg@...el.com, linux-kernel@...r.kernel.org, stable@...nel.org, tglx@...utronix.de, zdenek.kabelac@...il.com, davem@...emloft.net Subject: Re: [E1000-devel] BUG: bad unlock balance detected! e1000e On Tue, Dec 9, 2008 at 3:56 PM, Andrew Morton <akpm@...ux-foundation.org> wrote: > On Wed, 10 Dec 2008 00:43:46 +0100 > Frederik Deweerdt <frederik.deweerdt@...og.eu> wrote: > >> On Tue, Dec 09, 2008 at 03:08:01PM -0800, Andrew Morton wrote: >> > On Tue, 9 Dec 2008 12:03:37 +0100 >> > Frederik Deweerdt <frederik.deweerdt@...og.eu> wrote: >> > >> > > It some error checking is missing in e1000e: debug contention on NVM >> > > SWFLAG >> > > On Mon, Dec 08, 2008 at 12:24:09PM +0100, Zdenek Kabelac wrote: >> > > > Hi >> > > > >> > > > During occasional scan of message log - I've found out this BUG which >> > > > happened on Dec3 with the -rc7 from that day. >> > > > (So if it's now fixed in current git feel free to ignore :)) >> > > > >> > > > My machine T61 - C2D, 2GB, 64bit kernel - message appeared during >> > > > shutdown and was actually not noticed by me... >> > > > >> > > > >> > > > NetworkManager: <WARN> nm_signal_handler(): Caught signal 15, >> > > > shutting down normally. >> > > > NetworkManager: <info> (eth0): now unmanaged >> > > > NetworkManager: <info> (eth0): device state change: 3 -> 1 >> > > > NetworkManager: <info> (eth0): cleaning up... >> > > > NetworkManager: <info> (eth0): taking down device. >> > > > >> > > > ===================================== >> > > > [ BUG: bad unlock balance detected! ] >> > > > ------------------------------------- >> > >> > (top-posting repaired. Please don't do that!!!). >> Yep, sorry. >> > >> > > Hello Zdenek, >> > > >> > > This could be due to 717d438d1fde94decef874b9808379d1f4523453 >> > > "e1000e: debug contention on NVM SWFLAG" >> > > Error handling is missing from e1000_reset_hw_ich8lan so it may happen >> > > that we don't acquire the nvm_mutex if the card times out. >> > > >> > > Adding Thomas to CC. >> > >> > yup. 2.6.27 needs fixing also. >> > >> > Like this? >> I don't think so, e1000_acquire_swflag_ich8lan() locks and >> e1000_release_swflag_ich8lan() unlocks. > > urgh, OK, I made the mistake of reading the comments. > >> I think it is more along the >> lines of: >> >> >> diff --git a/drivers/net/e1000e/ich8lan.c b/drivers/net/e1000e/ich8lan.c >> index 523b971..f971b83 100644 >> --- a/drivers/net/e1000e/ich8lan.c >> +++ b/drivers/net/e1000e/ich8lan.c >> @@ -1892,7 +1892,13 @@ static s32 e1000_reset_hw_ich8lan(struct e1000_hw *hw) >> */ >> ctrl |= E1000_CTRL_PHY_RST; >> } >> + >> ret_val = e1000_acquire_swflag_ich8lan(hw); >> + if (ret_val) { >> + hw_dbg(hw, "Failed to acquire NVM swflag"); >> + return ret_val; >> + } >> + >> hw_dbg(hw, "Issuing a global reset to ich8lan"); >> ew32(CTRL, (ctrl | E1000_CTRL_RST)); >> msleep(20); >> >> >> But I'm not sure we should cancel the ongoing reset if the card times >> out... >> > > Yes, something like that. Or something like > > --- a/drivers/net/e1000e/ich8lan.c~a > +++ a/drivers/net/e1000e/ich8lan.c > @@ -1940,12 +1940,14 @@ static s32 e1000_reset_hw_ich8lan(struct > ctrl |= E1000_CTRL_PHY_RST; > } > ret_val = e1000_acquire_swflag_ich8lan(hw); > - hw_dbg(hw, "Issuing a global reset to ich8lan\n"); > - ew32(CTRL, (ctrl | E1000_CTRL_RST)); > - msleep(20); > + if (!ret_val) { > + hw_dbg(hw, "Issuing a global reset to ich8lan\n"); > + ew32(CTRL, (ctrl | E1000_CTRL_RST)); > + msleep(20); > > - /* release the swflag because it is not reset by hardware reset */ > - e1000_release_swflag_ich8lan(hw); > + /* release the swflag because it is not reset by hardware reset */ > + e1000_release_swflag_ich8lan(hw); > + } > > ret_val = e1000e_get_auto_rd_done(hw); > if (ret_val) { > _ > > > Dunno. It's e1000-developer-summoning-dance time. > Actually, if we time out trying to acquire the swflag, we still want to reset the part because we are most likely in an unrecoverable state. So I would suggest the following --- a/drivers/net/e1000e/ich8lan.c~a +++ a/drivers/net/e1000e/ich8lan.c @@ -1940,9 +1940,10 @@ static s32 e1000_reset_hw_ich8lan(struct ctrl |= E1000_CTRL_PHY_RST; } ret_val = e1000_acquire_swflag_ich8lan(hw); hw_dbg(hw, "Issuing a global reset to ich8lan\n"); ew32(CTRL, (ctrl | E1000_CTRL_RST)); msleep(20); + if (!ret_val) { - - /* release the swflag because it is not reset by hardware reset */ - e1000_release_swflag_ich8lan(hw); + /* release the swflag because it is not reset by hardware reset */ + e1000_release_swflag_ich8lan(hw); + } Of course, we will want to add a comment to the fact that we still want to reset the part, even if we have not acquired the lock because we are in an unrecoverable state. I can provide a patch in a few minutes. -- Cheers, Jeff -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@...r.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists