linux-kernel - Re: "APIC error on CPU1: 00(40)" during resume (was: Regression from 2.6.26: Hibernation (possibly suspend) broken on Toshiba R500)

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20081210173343.GA1120@elte.hu>
Date:	Wed, 10 Dec 2008 18:33:43 +0100
From:	Ingo Molnar <mingo@...e.hu>
To:	Linus Torvalds <torvalds@...ux-foundation.org>,
	Yinghai Lu <yinghai@...nel.org>,
	Suresh Siddha <suresh.b.siddha@...el.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	"H. Peter Anvin" <hpa@...or.com>,
	"Maciej W. Rozycki" <macro@...ux-mips.org>,
	"Pallipadi, Venkatesh" <venkatesh.pallipadi@...el.com>
Cc:	Frans Pop <elendil@...net.nl>, lenb@...nel.org,
	"Rafael J. Wysocki" <rjw@...k.pl>, Greg KH <greg@...ah.com>,
	jbarnes@...tuousgeek.org,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	tiwai@...e.de, Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: "APIC error on CPU1: 00(40)" during resume (was: Regression
	from 2.6.26: Hibernation (possibly suspend) broken on Toshiba R500)


* Linus Torvalds <torvalds@...ux-foundation.org> wrote:

> Ingo - who's the main apic person these days?

When it comes to blame someone for bugs then it's me :-)

When it comes to code details, it's multiple people: Yinghai, Suresh, 
Venki, Maciej and Thomas, Peter and me as x86 maintainers. I tried to Cc: 
everyone.

> On Wed, 10 Dec 2008, Frans Pop wrote:
> 
> > On Wednesday 10 December 2008, Linus Torvalds wrote:
> > > On Wed, 10 Dec 2008, Frans Pop wrote:
> > > > Anybody interested in persuing this issue?
> > > >
> > > > > The third thing that worries me is the _very_ early occurrence of
> > > > >
> > > > > 	ACPI: Waking up from system sleep state S3
> > > > > 	APIC error on CPU1: 00(40)
> > > > > 	ACPI: EC: non-query interrupt received, switching to interrupt
> > > > > mode
> > >
> > > Well, the "too early" part is fixed with the PCI resume changes in
> > > -next, and googling for "APIC error on CPU1: 00(40)" shows that it's
> > > actually pretty common. Which is sad, but makes it somewhat less scary.
> > >
> > > The fact that it happens at resume for you (and not randomly) does
> > > imply that we perhaps don't have a wonderful APIC wakeup sequence and
> > > are doing something slightly wrong. But it likely isn't a big deal.
> > >
> > > Is that message new? If it is, maybe you can pinpoint roughly when it
> > > started happening, and we could try guess which change triggered it.
> > 
> > It's been there since 2.6.26.3, which was the first kernel I've run on 
> > this notebook.
> 
> Hmm. Our IO-APIC reprogramming looks pretty simple, and may well be 
> correct.
> 
> However, it looks like our _local_ APIC suspend/resume is a total piece 
> of sh*t.  It's set up as a "system device" and has a single 
> suspend/resume buffer, but the local APIC is a per-CPU thing. We even 
> have a comment there (written by yours trule back in 2003!) that says:
> 
>          * FIXME! This will be wrong if we ever support suspend on
>          * SMP! We'll need to do this as part of the CPU restore!
> 
> and back then suspend/resume on SMP was just a crazy notion, but now 
> it's obviously every-day reality.
> 
> So it looks like we don't reprogram the APIC -at-all- on secondary 
> CPU's.
> 
> What am I missing?

we do reset it - local APIC timer IRQs would not be working, the NMI 
watchdog wouldnt be working, we wouldnt be able to do cross-IPIs nor TLB 
flushes etc. - so a non-working lapic is the sure way to a system lockup.
But the resume/hotplug path is still a maze, agreed.

regarding those APIC error messages:

> > > > >       ACPI: Waking up from system sleep state S3
> > > > >       APIC error on CPU1: 00(40)
> > > > >       ACPI: EC: non-query interrupt received, switching to interrupt

that does suggest that the APIC was re-enabled (we dont get any APIC 
error exceptions otherwise!), and its LVT was programmed as well, but 
somehow we got an erroneous APIC message from an illegal vector.

Illegal APIC message vectors can have two sources in practice:

 1) the system bus being thermally unstable and corrupting APIC messages 
    that would randomly contain the wrong vector (zero for example).
    I had one (old) testbox that would do this. (Maybe other hw 
    conditions can animate the southbridge to do this to us too.)

 2) _another CPU_ sending an IPI with an illegal vector field. An APIC 
    vector is 'illegal' if it is below 16 (architecturally protected 
    exception entries), or if it points to an IDT entry that is not
    present.

#2: in this case would mean another CPU has set a target vector smaller
than 16: we dont have any IDT entry that is explicitly non-existent (we 
have a dummy entry mapped for everything). That seems unlikely - but we 
could stick in a WARN_ON_ONCE() into the IPI send methods to catch this. 

[ sidenote: as weird as it might seem it is valid IPI use to trigger
  architectural exception vectors between 16 and 31. ]

#1: seems like a too easy path out to blame the hw for it :-/ By all 
means this has the appearance of kernel-induced damage to me, and seems 
to occur when we fiddle the hw and are in a sensitive path of 
resume-wakeup. I'd blame the kernel 9 times out of 10 bugs that trigger 
at this stage.

Still i have no idea how this APIC message could be kernel-inflicted - 
even assuming buggy resume time lapic setup. The lapic timer cannot 
inject 'bad' vectors to itself AFAIK. It's pretty hard to do it even 
intentionally from another CPU, and when we do it we kill the whole 
system by flooding it with bad IPIs.

Maybe we have a window of setup where one of the LVT entries has zero in 
the vector field but is still enabled, and the hw condition (lapic timer 
tick) happens that triggers the IRQ injection - but the lapic cannot do 
it due to the zero vector? I do not see such window of setup in the APIC 
re-setup codepath though.

Maybe it's the reschedule or TLB flush IPI from another CPU somehow 
hitting the lapic in the wrong moment?

Dunno. Does anyone else on the Cc: list have another theory that matches 
up with some detail of the code?

	Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/