linux-kernel - Re: Regression from 2.6.26: Hibernation (possibly suspend) broken on Toshiba R500 (bisected)

Message-ID: <alpine.LFD.2.00.0812040754020.3256@nehalem.linux-foundation.org>
Date:	Thu, 4 Dec 2008 08:17:26 -0800 (PST)
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	Frans Pop <elendil@...net.nl>
cc:	"Rafael J. Wysocki" <rjw@...k.pl>, Greg KH <greg@...ah.com>,
	Ingo Molnar <mingo@...e.hu>, jbarnes@...tuousgeek.org,
	lenb@...nel.org,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	tiwai@...e.de, Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: Regression from 2.6.26: Hibernation (possibly suspend) broken
 on Toshiba R500 (bisected)



On Thu, 4 Dec 2008, Frans Pop wrote:

> On Wednesday 03 December 2008, Linus Torvalds wrote:
> > Well, I think that what _would_ be generally correct, and actually
> > pretty simple, is a rather different approach: just not sizing things
> > behind a transparent bridge AT ALL, since it really shouldn't matter.
> 
> I've given your patch a try and the few resumes from STR I've done were 
> all successful. That's not 100% conclusive yet, but a nice start.
> Some info from logs etc. below.

Ok, but I thought you had a hard time reproducing this _anyway_, even with 
just plain -rc7. No?

That said, of the various patches posted, the "don't bother allocating 
bridging windows for transparent bridges" one is not just the simplest, 
but the only one that actually makes sense so far.

So I'm happy it's apparently working for you, I'm just wondering about 
whether your success means a lot. It seems that Rafael is the one who had 
more failures?

> > > Also, I would be happy to actually understand _why_ this happens.
> >
> > 100% agreed. I do _not_ see why it should ever matter how we set up a
> > PCI bridging window - whether prefetchable or not - on a bridge that
> > should be transparent. It sounds really odd. I'm wondering if there is
> > something we're missing here.
> 
> The theory that it is really a resume issue and not a device layout issue 
> sounds logical. Especially as everything always works correctly after a 
> normal boot.

Yes, that does sound like a convincing argument. Usually real PCI resource 
clashes result in some kind of run-time problems, and wouldn't necessarily 
be suspend-specific per se.

That said, suspend/resume does a lot of unusual things, so it could still 
be some odd PCI resource clash that only triggers problems in the 
suspend/resume case. But since the exact layouts and the sizing of the 
resources don't really seem to matter, a simple PCI resource clash seems 
rather unlikely.

So some kind of resume-time ordering or timing issue does seem like the 
most likely thing. But that still leaves us not knowing what the real 
_root_ cause of this all is - very irritating. Even if not allocating the 
unnecessary bridging windows "fixes" things, it would be really really 
good to know exactly what it is that causes problems.

> Below info from 3 kernels, all based on 2.6.28-rc7-91:
> A) unpatched
> B) with the revert/debug patch
> C) with the oneliner "ignore transparent bridges" patch
> 
> AFAICT all results are probably as expected.
> 
> From lspci -vvxxx:
> 00:1e.0 PCI bridge: Intel Corporation 82801 Mobile PCI Bridge
> - for A)
> 	I/O behind bridge: 00003000-00003fff
> 	Memory behind bridge: e0100000-e03fffff
> 	Prefetchable memory behind bridge: 0000000080000000-0000000083ffffff
> - for B)
> 	I/O behind bridge: 00003000-00003fff
> 	Memory behind bridge: e0100000-e03fffff
> - for C)
> 	Memory behind bridge: e0100000-e03fffff

And this all makes total sense. The e0100000-e03fffff MMIO bridge range is 
apparently set up by the firmware, which is why it shows up in all cases. 
And the (A) case has that prefetchable memory range, because that's the 
only case that finds - and cares about - the prefetch window for the 
CardBus controller. 

And both (A) and (B) have the IO bridging window, because regardless of 
whether we see a valid CardBus prefetchable memory window with good 
alignment, we'll always see the IO ports, so we'll try to allocate that 
bridging window, except in (C) when we decide that due to the transparent 
nature, we simply don't care.
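
To make the A/B/C comparison concrete, here is a minimal Python sketch that 
parses the "behind bridge" lines quoted above and reports which bridging 
windows each kernel set up, and how large they are. The helper names 
(parse_windows, window_size, summarize) are made up for illustration; only 
the address ranges come from this thread.

```python
import re

# Matches the bridge window lines printed by `lspci -vv`. Anything after
# the range (newer lspci versions may append extra fields) is ignored.
WINDOW_RE = re.compile(
    r"^\s*(I/O|Memory|Prefetchable memory) behind bridge:\s*"
    r"([0-9a-fA-F]+)-([0-9a-fA-F]+)"
)


def parse_windows(text):
    """Return {window kind: (start, end)} for each window line in text."""
    windows = {}
    for line in text.splitlines():
        m = WINDOW_RE.match(line)
        if m:
            kind, start, end = m.groups()
            windows[kind] = (int(start, 16), int(end, 16))
    return windows


def window_size(window):
    """Size in bytes of an inclusive (start, end) range."""
    start, end = window
    return end - start + 1


# Ranges quoted in this thread for bridge 00:1e.0 under the three kernels.
KERNELS = {
    "A": """
        I/O behind bridge: 00003000-00003fff
        Memory behind bridge: e0100000-e03fffff
        Prefetchable memory behind bridge: 0000000080000000-0000000083ffffff
    """,
    "B": """
        I/O behind bridge: 00003000-00003fff
        Memory behind bridge: e0100000-e03fffff
    """,
    "C": """
        Memory behind bridge: e0100000-e03fffff
    """,
}


def summarize(kernels):
    """Return {kernel: {window kind: size in bytes}}."""
    return {
        name: {kind: window_size(w) for kind, w in parse_windows(text).items()}
        for name, text in kernels.items()
    }


if __name__ == "__main__":
    for name, sizes in summarize(KERNELS).items():
        print(name, sizes)
```

The MMIO window is the same 3MB range in all three cases; only (A) has the 
64MB prefetchable window, and only (C) drops the 4kB IO window.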

So the PCI resources make sense in all three cases, and we understand 
those. The differences in the actual Cardbus ranges also all make sense. 
So it all still boils down to the PCI layer doing everything right in 
_all_ cases, just making slightly different - but all valid - choices 
depending on essentially random details (eg in the revert/debug patch case 
the "random detail" is just enabling a small incorrect alignment).

IOW, it really doesn't look like a PCI resource allocator bug. Quite the 
reverse, I'd say that in the end this whole thread points out just how 
robust the whole PCI and cardbus resource allocation is, with the code 
really very gracefully just adjusting in a sane manner to all these 
different cases.

Of course, none of that helps us with any kind of idea of what the real 
problem is. Device ordering bug in setting up PCI resources at resume? 
Perhaps just a plain bug in PCI bridge resume code (even when you resume 
things in the right order)?

And I still worry that perhaps it's just a timing bug, where having a PCI 
bridging window changes timing of various PCI accesses, and the _real_ bug 
is actually in the sound card or ethernet driver resume, which happens to 
work with one timing and not with another.

Since it's apparently STR, has anybody gotten _anything_ sane out of 
trying to enable PM_TRACE_RTC, and then doing that 

	echo 1 > /sys/power/pm_trace

because even with the (very limited) set of standard trace-points, it 
should still be able to tell which device we were trying to resume last in 
the failure case. Maybe that gives some hint?
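
As far as I know from Documentation/power/s2ram.txt, the trace result is 
read back after the failed resume by rebooting and searching the kernel 
log for "hash matches" lines. A minimal Python sketch of that filtering 
step follows; the function name hash_match_lines is made up for 
illustration, and it assumes the dmesg output has already been captured 
into a string.

```python
def hash_match_lines(dmesg_text):
    """Return the stripped log lines that mention a pm_trace hash match.

    Assumption: matching lines contain the literal text "hash matches",
    which is the string the s2ram documentation suggests grepping for.
    """
    return [
        line.strip()
        for line in dmesg_text.splitlines()
        if "hash matches" in line
    ]
```

An empty result would mean the trace gave no hint on that boot (or that 
PM_TRACE_RTC was not enabled in the kernel that failed to resume).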

				Linus