Message-ID: <alpine.LFD.2.00.0901301506120.3150@localhost.localdomain>
Date:	Fri, 30 Jan 2009 15:06:30 -0800 (PST)
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	Parag Warudkar <parag.lkml@...il.com>
cc:	Matt Carlson <mcarlson@...adcom.com>,
	"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	"David S. Miller" <davem@...emloft.net>,
	Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: 2.6.29-rc3: tg3 dead after resume


On Fri, 30 Jan 2009, Parag Warudkar wrote:
> 
> [  245.924484] eth0: PCI_COMMAND reg = 0x406 (bit 1 is on)
> [  245.924487] eth0: Reg value at offset 0x0 is 0xffffffff
> [  247.317971] tg3: eth0: No firmware running.
> [  258.710634] ADDRCONF(NETDEV_UP): eth0: link is not ready
> ^^^ Post-Suspend
> 
> So it looks like the memory space IO is enabled before and after suspend.
> The device/vendor id goes 0xffffffff after resume - just like before.
> Does that one matter? (Firmware may be looking at it?) 

One thing strikes me - are there any bridges between the host (CPU) and 
that tg3 device?

Because we obviously have two people who say that their tg3 suspend/resume 
works fine, so the tg3 driver is obviously not _totally_ broken. So I'm 
wondering if there is something funny in between the CPU and the tg3, like 
a hotplug bridge that needs magic to wake up properly.

Because clearly the PCI config space accesses are working fine, but the 
thing is, while PCI config space accesses are routed by the device number 
(and the bridge's notion of secondary bridging), the PCI memory space 
routing is based on address. So a PCI bridge can easily get one right (in 
fact, it's really hard to get config space accesses wrong without the 
bridges being _totally_ screwed up), while not routing the other at all.
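
To make the routing difference concrete, here is a minimal sketch of how a
single bridge decides what to forward. It is a simplified model, not kernel
code: the Bridge class, the bus numbers, the window values and the BAR address
are all hypothetical, chosen only to show that config cycles can still get
through while memory reads come back as all ones (as in the 0xffffffff
register read quoted above).

```python
from dataclasses import dataclass


@dataclass
class Bridge:
    """Simplified, hypothetical model of a PCI-to-PCI bridge's routing state."""
    secondary_bus: int    # first bus number behind the bridge
    subordinate_bus: int  # highest bus number behind the bridge
    mem_base: int         # start of the forwarded memory window
    mem_limit: int        # inclusive end of the forwarded memory window

    def forwards_config(self, bus: int) -> bool:
        # Config cycles are routed by bus number range.
        return self.secondary_bus <= bus <= self.subordinate_bus

    def forwards_memory(self, addr: int) -> bool:
        # Memory cycles are routed by address window; base > limit means
        # the window is closed and nothing is forwarded.
        if self.mem_base > self.mem_limit:
            return False
        return self.mem_base <= addr <= self.mem_limit


def mmio_read32(path, addr, device_value):
    """Return device_value if every bridge on the path forwards addr."""
    if all(bridge.forwards_memory(addr) for bridge in path):
        return device_value
    return 0xFFFFFFFF  # nobody claimed the read, so it reads as all ones


# Hypothetical values: a device on bus 2 with a BAR at 0xF0000000.
DEVICE_BUS = 2
DEVICE_BAR = 0xF0000000
healthy = Bridge(2, 2, mem_base=0xF0000000, mem_limit=0xF00FFFFF)
confused = Bridge(2, 2, mem_base=0xFFF00000, mem_limit=0x000FFFFF)
```

With these assumed values, confused.forwards_config(DEVICE_BUS) is still true,
but mmio_read32([confused], DEVICE_BAR, ...) returns 0xFFFFFFFF because the
memory window no longer covers the BAR.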

So just do that "lspci -vvxxx" for the whole box, before and after, and 
send us the "before" and the "diff -u before after" thing, and maybe that 
shows something interesting. Because some bridge chip being confused would 
also explain why a total re-init of the whole tg3 chip by a driver unload 
and reload doesn't seem to help.
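
The before/after capture could be scripted roughly as below. This is only a
sketch: the function names and file paths are made up for illustration, and it
assumes lspci is installed and that the script runs as root so that -xxx can
dump the full config space. The diff is produced with Python's difflib rather
than the diff tool itself, which gives equivalent "diff -u" style output.

```python
import difflib
import subprocess
from pathlib import Path


def capture_lspci(path: str) -> None:
    """Save the output of `lspci -vvxxx` to the given file (hypothetical helper)."""
    result = subprocess.run(
        ["lspci", "-vvxxx"], capture_output=True, text=True, check=True
    )
    Path(path).write_text(result.stdout)


def unified_diff(before_text: str, after_text: str) -> str:
    """Return a `diff -u before after` style diff of two captures as one string."""
    lines = difflib.unified_diff(
        before_text.splitlines(keepends=True),
        after_text.splitlines(keepends=True),
        fromfile="before",
        tofile="after",
    )
    return "".join(lines)
```

Usage would be capture_lspci("before") prior to suspend, capture_lspci("after")
once the machine has resumed, and then unified_diff() on the contents of the
two files. Changed bridge registers show up as -/+ line pairs in the hex dumps.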

		Linus