Message-ID: <87pq72a951.fsf@xmission.com>
Date:	Tue, 07 Aug 2012 10:45:30 -0700
From:	ebiederm@...ssion.com (Eric W. Biederman)
To:	Suresh Siddha <suresh.b.siddha@...el.com>
Cc:	Borislav Petkov <bp@...64.org>,
	Robert Richter <robert.richter@....com>, mingo@...nel.org,
	hpa@...or.com, linux-kernel@...r.kernel.org,
	akpm@...ux-foundation.org, torvalds@...ux-foundation.org,
	a.p.zijlstra@...llo.nl, tglx@...utronix.de,
	linux-tip-commits@...r.kernel.org,
	"Petkov\, Borislav" <borislav.petkov@....com>
Subject: Re: do_IRQ: 1.55 No irq handler for vector (irq -1)

Suresh Siddha <suresh.b.siddha@...el.com> writes:

> On Tue, 2012-08-07 at 17:41 +0200, Borislav Petkov wrote:
>> On Tue, Aug 07, 2012 at 05:31:49PM +0200, Robert Richter wrote:
>> > On 06.06.12 08:03:58, tip-bot for Suresh Siddha wrote:
>> > > Commit-ID:  332afa656e76458ee9cf0f0d123016a0658539e4
>> > > Gitweb:     http://git.kernel.org/tip/332afa656e76458ee9cf0f0d123016a0658539e4
>> > > Author:     Suresh Siddha <suresh.b.siddha@...el.com>
>> > > AuthorDate: Mon, 21 May 2012 16:58:01 -0700
>> > > Committer:  Ingo Molnar <mingo@...nel.org>
>> > > CommitDate: Wed, 6 Jun 2012 09:51:22 +0200
>> > > 
>> > > x86/irq: Update irq_cfg domain unless the new affinity is a subset of the current domain
>> > 
>> > This commit causes a sata error and thus a boot failure:
>> > 
>> >  ACPI: Invalid Power Resource to register!ata1: lost interrupt (Status 0x50)
>> >  ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x40000000 action 0x6 frozen
>> >  ata1: SError: { }
>> >  ata1.00: failed command: READ DMA
>> > 
>> > Reverting it as follows helped:
>> > 
>> >  $ git revert d872818dbbeed1bccf58c7f8c7db432154c802f9
>> >  $ git revert 1ac322d0b169c95ce34d55b3ed6d40ce1a5f3a02
>> >  $ git revert 332afa656e76458ee9cf0f0d123016a0658539e4
>> 
>> Right,
>> 
>> and it is a good thing Robert and I were talking about his issue and I
>> mentioned seeing funny do_IRQ messages during 3.6-rc1 boot:
>> 
>> [    0.170256] AMD PMU driver.
>> [    0.170451] ... version:                0
>> [    0.170683] ... bit width:              48
>> [    0.170906] ... generic registers:      6
>> [    0.171125] ... value mask:             0000ffffffffffff
>> [    0.171399] ... max period:             00007fffffffffff
>> [    0.171673] ... fixed-purpose events:   0
>> [    0.171902] ... event mask:             000000000000003f
>> [    0.172687] MCE: In-kernel MCE decoding enabled.
>> [    0.184214] [Firmware Info]: CPU: Re-enabling disabled Topology Extensions Support
>> [    0.186687] do_IRQ: 1.55 No irq handler for vector (irq -1)				<---
>> [    0.198126] [Firmware Info]: CPU: Re-enabling disabled Topology Extensions Support
>> [    0.200579] do_IRQ: 2.55 No irq handler for vector (irq -1)				<---
>> [    0.173040] smpboot: Booting Node   0, Processors  #1 #2 #3 OK
>> [    0.212083] [Firmware Info]: CPU: Re-enabling disabled Topology Extensions Support
>> [    0.214538] do_IRQ: 3.55 No irq handler for vector (irq -1)				<---
>> [    0.214864] Brought up 4 CPUs
>> 
>> of it not having an IRQ handler for vector 55.
>> 
>> And guess what: reverting those three above makes the message go away
>> too.
>> 
>
> Boris, Robert, Can you please send me the complete dmesg
> and /proc/interrupts on a successful boot?

Hmm.  I wonder if this is one of those cases where the APICs don't honor
the masks in lowest priority delivery mode and simply deliver to some
cpu in the same die.

Certainly outside of x2apic mode I have seen that happen, and that is why
the reservation in lowest priority delivery mode was for the same vector
across all cpus.

This certainly looks like we have one irq being delivered across multiple
cpus, and the software simply appears unprepared for the irq to show up
on a cpu where the vector was never set up.
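
To make the failure mode concrete, here is a toy user-space sketch, not
the kernel code.  It assumes a per-cpu vector-to-irq table loosely
modeled on the kernel's per-cpu vector_irq[] array; the helper names
(init_tables, assign_vector, do_irq), the cpu count and irq number 16
are made up for illustration.  It only shows why reserving a vector on
a subset of cpus produces the message above when the hardware delivers
to a cpu outside that subset.

```c
#include <assert.h>
#include <stdio.h>

#define NR_CPUS    4      /* hypothetical: matches the 4-cpu box in the log */
#define NR_VECTORS 256
#define NO_IRQ     (-1)

/* Toy per-cpu vector -> irq table (assumption: stands in for vector_irq[]). */
int vector_irq[NR_CPUS][NR_VECTORS];

/* Mark every vector on every cpu as unassigned. */
void init_tables(void)
{
    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        for (int vec = 0; vec < NR_VECTORS; vec++)
            vector_irq[cpu][vec] = NO_IRQ;
}

/* Install the vector -> irq mapping only on the cpus set in cpu_mask. */
void assign_vector(int irq, int vector, unsigned int cpu_mask)
{
    assert(vector >= 0 && vector < NR_VECTORS);
    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        if (cpu_mask & (1u << cpu))
            vector_irq[cpu][vector] = irq;
}

/* Simulate the interrupt arriving on a given cpu; returns the irq or NO_IRQ. */
int do_irq(int cpu, int vector)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    assert(vector >= 0 && vector < NR_VECTORS);

    int irq = vector_irq[cpu][vector];
    if (irq == NO_IRQ)
        printf("do_IRQ: %d.%d No irq handler for vector (irq %d)\n",
               cpu, vector, irq);
    return irq;
}
```

In this sketch, calling init_tables(), then assign_vector(16, 55, 1u << 0),
then do_irq(1, 55) finds no mapping on cpu 1 and prints the complaint.
Calling assign_vector(16, 55, 0xF) instead, i.e. the same vector on all
cpus, lets do_irq() find the irq no matter which cpu the APIC picks.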

Eric
