Date:	Sat, 13 Feb 2010 10:18:27 -0800
From:	Suresh Siddha <suresh.b.siddha@...el.com>
To:	Torsten Kaiser <just.for.lkml@...glemail.com>
Cc:	"Eric W. Biederman" <ebiederm@...ssion.com>,
	Tejun Heo <tj@...nel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Robert Hancock <hancockrwd@...il.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	Ingo Molnar <mingo@...hat.com>,
	"H. Peter Anvin" <hpa@...or.com>,
	Yinghai Lu <yhlu.kernel@...il.com>
Subject: Re: do_IRQ: 0.165 No irq handler for vector (irq -1)

On Sat, 2010-02-13 at 02:25 -0700, Torsten Kaiser wrote:
> Ping?
> 
> I reported this problem one day after -rc1 was out and it's still
> there in -rc8, the probably last -rc for 2.6.33.
> (I also reported it against -rc2, -rc3, -rc4 and -rc6)
> 
> Apart from the patches related to the SiI register HOST_CTRL_MSIACK
> (that did not fix the problem) I have the feeling, that I'm not one
> step further to any fix.
> 
> Is this a bug in the MSI-enable code in sata_sil24?
> Is this a bug in the MSI code in libata?
> Is this a bug in the IRQ system?
> Is this a bug in the x86 apic code?

There are primarily two issues you reported.

One is the spurious interrupt issue (for which you see the "No irq handler
for vector" messages). Your experimental results verified that this
problem doesn't happen in physical apic mode. This shows that there is
some problem with the way this HW subsystem (involving sata_sil24)
handles logical mode. Most likely it is a bug either in the sata_sil24 or
in the platform paths (bridges etc.) handling the sata_sil24 interrupts
(as you say, other devices work fine with MSI on this platform).

And the second problem is the sata timeouts (which happen irrespective
of the above spurious interrupts). It looks like interrupts are dropped
(which might be the reason why your ERR count -- apic error count --
increases). 
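
To check whether the apic error count is actually climbing while the
timeouts occur, the ERR line of /proc/interrupts can be sampled before
and after a test run. The following is only a minimal sketch, assuming
the usual x86 layout where that line looks like "ERR:  <count>"; the
function names are hypothetical and not part of any kernel tool.

```python
import re


def parse_err_count(interrupts_text):
    """Return the ERR counter from /proc/interrupts-style text, or None.

    Assumes an x86-style line of the form 'ERR:  <count>'.
    """
    for line in interrupts_text.splitlines():
        match = re.match(r"\s*ERR:\s+(\d+)", line)
        if match:
            return int(match.group(1))
    return None  # no ERR line found (e.g. a non-x86 system)


def read_err_count(path="/proc/interrupts"):
    """Read the current ERR counter from the given file."""
    with open(path) as f:
        return parse_err_count(f.read())
```

Calling read_err_count() once before and once after reproducing the
timeouts and comparing the two values shows whether the counter moved
during that window.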

Based on your experimental results, we can say that this is not a bug
in the x86 apic code or the irq subsystem.

> Is this a hardware bug in the SiI 3132?
> Is this a hardware bug in the MCP55?
> Is this a fatal bug or does it just need the right quirk?
> 
> What should I do now?
> Keep posting that it's still broken at each -rc?
> Open a bug at bugzilla.kernel.org? Against what subsytem?
> Should I just not use the sata_sil.msi=1 commandline? 

You shouldn't use that command line option, as your experiments showed
that sata_sil msi mode is clearly broken on this platform. Perhaps also
report the issue to the HW vendor. That report should include the
spurious vector 165 that you see in logical mode and also the apic error
you see (you can enable debug to see the error message that gets printed
in smp_error_interrupt() for this).

thanks,
suresh
