lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Thu, 17 May 2007 17:43:55 -0700
From:	"Siddha, Suresh B" <suresh.b.siddha@...el.com>
To:	"Eric W. Biederman" <ebiederm@...ssion.com>
Cc:	"Siddha, Suresh B" <suresh.b.siddha@...el.com>, mingo@...e.hu,
	ak@...e.de, akpm@...ux-foundation.org,
	linux-kernel@...r.kernel.org, "Zou, Nanhai" <nanhai.zou@...el.com>,
	"Mallick, Asit K" <asit.k.mallick@...el.com>,
	"Packard, Keith" <keith.packard@...el.com>
Subject: Re: [patch] x86_64, irq: check remote IRR bit before migrating level triggered irq

On Thu, May 17, 2007 at 04:58:02PM -0700, Eric W. Biederman wrote:
> In essence this makes sense, and it may be the best work around for
> buggy hardware available.   However I am not convinced that the remote
> IRR on ioapics works reliably enough to be used for anything.   I
> tested this earlier and I could not successfully poll the remote irr
> bit to see if an ioapic had received an irq acknowledgement.  Instead
> I locked up irq controllers.
> 
> If remote IRR worked reliably I could have pulled the irq migration
> out of irq context.  So this fix looks dubious to me.

We are just reading IOAPIC RTE's. So not sure why this can lead to lock
up's. This patch is not reading any new HW registers. It just
checks for certain bit values from the register that the code reads
anyhow.

Remote IRR should work reliably otherwise we will see lot more issues,
I guess.

> 
> Why is the EOI delayed?  Can we work around that?

Under some stress conditions, EOI can get retried because of which
it is getting delayed.

> 
> It would be nice if ioapics and local apics actually obeyed the pci
> ordering rules where a read would flush all traffic.  And we do have a
> read in there.
> 
> I'm assuming the symptom you are seeing on current kernels is that
> occasionally
> the irq gets stuck and never fires again?

yes.

> I'm not certain I like the patch either, but I need to look more closely.
> You are mixing changes to generic and arch specific code together.
> 
> I think pending_eoi should not try to reuse __DO_ACTION as that
> helper bit of code does not seem appropriate.

Wanted to minimize the code duplicacy and hence overloaded existing
macros.

> 
> It would probably be best if the pending_eoi check was in
> ack_apic_level with the rest of the weird logic we are working around.

Just wanted to make sure that we give some time for the EOI to actually
reach the IO-APIC.

> 
> Could we please have more detail on this hardware behavior.  Why is
> the EIO write write delayed?   Why can't we just issue reads in
> appropriate places to flush the write?

Because the EOI to IOAPIC gets generated as a side effect of processor
EOI.

thanks,
suresh
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ