Message-ID: <alpine.LFD.2.00.0901311318570.4064@localhost.localdomain>
Date: Sat, 31 Jan 2009 13:47:22 -0800 (PST)
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: "Rafael J. Wysocki" <rjw@...k.pl>
cc: Parag Warudkar <parag.lkml@...il.com>,
Matt Carlson <mcarlson@...adcom.com>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
"David S. Miller" <davem@...emloft.net>,
Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: 2.6.29-rc3: tg3 dead after resume

On Sat, 31 Jan 2009, Rafael J. Wysocki wrote:
> >
> > Don't make it use the new PM infrastructure, because that one is certainly
> > broken: pci_pm_default_suspend_generic() is total crap.
>
> Oh my. I beg to differ.

Imagine that you're a bridge. Imagine what this does to any device
_behind_ you. Any device that plans to still do something at suspend_late
time, to be specific.

> > It's saving the disabled state. No WAY is that correct.
>
> So why does it work, actually?

And I suspect it simply doesn't. And since almost nobody uses the new PM
state, you just don't see it.
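
To spell out what "saving the disabled state" means in practice, the
ordering I'm objecting to boils down to something like this (a sketch
pieced together from the behaviour described here, not a quote of the
actual pci_pm_default_suspend_generic() - only the two pci_* helpers
are real, and they need <linux/pci.h>):

        /*
         * Sketch only: disable first, then save, so what gets restored
         * at resume time is the *disabled* state.  If the device is a
         * bridge, nothing behind it can do DMA from this point on.
         */
        static void default_suspend_sketch(struct pci_dev *dev)
        {
                pci_disable_enabled_device(dev); /* clears bus mastering */
                pci_save_state(dev);             /* saves the now-disabled state */
        }
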
But as to why it fixes Parag's case - I think that's because the new PM
resume does more than the legacy resume does, so it ends up re-enabling
things anyway. It does it too late, but it doesn't matter in this case (no
shared irq issues with the only device behind the pci-e bridge).

> > But all of that is only called if you use the new PM infrastructure. So
> > the thing is, when you're trying to move the PCI-E driver to the new pm
> > infrastructure, you're making things _worse_.
>
> I don't really think so (see below).

See above. I think you really haven't thought the new PM code through.

> pci_disable_device() does really only one thing: it clears the bus master bit.
> Yes, it also calls pcibios_disable_device(), but on x86 this is a NOP.
>
> I don't think it is SO bad, is it?

It's bad. It means that DMA won't work across such a bridge. Yes, it is
probably bridge-dependent, and I know for a fact that at least some Intel
bridges just hard-code the busmaster bit to 1 (at a minimum the host
bridges do, I'm not sure about others), but I also know for a fact that
some other bridges _will_ stop DMA to devices behind them if the BM bit is
clear.
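
(For reference, "clearing the bus master bit" is literally just one bit
in the config-space command register - roughly the following, written
with the generic config accessors rather than whatever helper the PM
core actually goes through:)

        u16 cmd;

        pci_read_config_word(dev, PCI_COMMAND, &cmd);
        /* Bus Master Enable off: a bridge that honours this stops
         * forwarding DMA for the devices behind it. */
        pci_write_config_word(dev, PCI_COMMAND, cmd & ~PCI_COMMAND_MASTER);
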
But more importantly: Why do you do it? What's the upside? I don't see it.
There's a known downside - you're saving state that is something other than
what the driver really expects.

So I think clearing bus-master is a huge bug on a bridge, but I think that
on normal devices it's just pointless.

> The driver using the new model is not supposed to save the state and power
> off the device. Still, it's probably a good idea not to trust the drivers. :-)

How about devices that have magic power-down sequences? For example, a
quick grep shows that USB on a PPC PMAC has a special "disable ASIC clocks
for USB" thing after it puts the USB controller to sleep.
That was literally the _first_ driver I looked at. Admittedly because I
knew that USB host controllers tend to be more aware of all the issues
than most random drivers, but still...

I agree that the new model should turn off devices by default, but the
thing is, it should also allow drivers that really know better to do magic
things.
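
In other words, something like the following has to stay expressible in
whatever model we end up with. This is a made-up driver just to show
the shape of the problem - the example_* names are invented, only the
generic PCI helpers are real:

        static int example_hcd_suspend(struct pci_dev *pdev, pm_message_t state)
        {
                /* the generic part the PM core could do for us */
                pci_save_state(pdev);
                pci_set_power_state(pdev, pci_choose_state(pdev, state));

                /*
                 * The board-specific magic it cannot possibly know about,
                 * e.g. gating the ASIC clocks feeding the controller.
                 * (Invented helper, standing in for the real platform hook.)
                 */
                example_platform_usb_clocks_off(pdev);

                return 0;
        }
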
Of course, we can say that such devices should just continue to use the
legacy model, but I thought that the long-term plan was to just replace
it. And if you do, you need to allow for drivers that do special things
due to known motherboard- or chip-specific "issues" (aka "PCI extensions"
aka "hardware bugs").
And yes, I suspect that the magic PPC USB clock thing could maybe be
rewritten as a system device, so there are other alternatives here, but I
do suspect that it will be very painful if the new PM layer _forces_ a
very specific model on drivers that they can't modify at all.

> > I don't see why we don't resume with IO/MEM on, though. The legacy suspend
> > sequence shouldn't disable them, afaik.
>
> No, it shouldn't.
>
> However, the patch below actually may help and it really is not too different
> from my "new infrastructure" patch. It leaves the pcie_portdrv_restore_config()
> in the PCIe port driver's ->resume(), but that shouldn't change things,
> pci_enable_device() in there shouldn't do anything and the bus master bit
> should already be set.

I absolutely agree with this patch. I'd just not expect it to make a
difference (except for the "cleanup factor"). I think it's worth applying,
and _if_ it makes a difference for Parag it's very interesting indeed.
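
(Going by your description, the restore_config leftover boils down to
the pair of calls below, both of which should end up being no-ops if
the state coming out of resume is sane - again a sketch, not a quote
of the real pcie_portdrv_restore_config():)

        static int restore_config_sketch(struct pci_dev *dev)
        {
                int retval;

                /* should find the device already enabled ... */
                retval = pci_enable_device(dev);
                if (retval)
                        return retval;

                /* ... and bus mastering should already be on anyway */
                pci_set_master(dev);
                return 0;
        }
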
> The "new infrastracture" patch makes pci_disable_enabled_device() be called in
> the suspend code path, but that only disables the bus master bit, and
> pci_reenable_device() be called in the resume code path, but that only sets the
> bus master bit back again. So, they are almost the same and I'd be surprised
> if your patch didn't help.

Hmm. We'll see. I'm a bit doubtful. But we'll see..

Linus