Message-ID: <CAPRPZsBV-qJswWDfwMYjidEDsRzR-r5TG0or1--k599=Cju+tw@mail.gmail.com>
Date:	Mon, 30 Apr 2012 12:41:24 +0200
From:	Jeroen Van den Keybus <jeroen.vandenkeybus@...il.com>
To:	Clemens Ladisch <clemens@...isch.de>
Cc:	Josh Boyer <jwboyer@...il.com>,
	Borislav Petkov <borislav.petkov@....com>, andymatei@...il.com,
	"Huang, Shane" <Shane.Huang@....com>,
	Borislav Petkov <bp@...64.org>, linux-kernel@...r.kernel.org,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Thomas Gleixner <tglx@...utronix.de>
Subject: Re: Unhandled IRQs on AMD E-450

> Why 5?  This threshold is likely to be too low; fast consecutive interrupts
> can easily happen more often with a very busy device, while an actual stuck
> interrupt will call the handler in an endless loop and very quickly result
> in many thousands of calls.

Well, 5 works fine on any machine I have tested so far. I'd like to
keep this number as low as possible in case a genuine stuck interrupt
is encountered. Computers are powerful, but I'm reluctant to waste
cycles and power.

Also, on an unshared interrupt line, unhandled IRQs should never
happen in succession. A handler finding no work to do should only be
the result of acknowledging early and then getting a new interrupt
because more work arrived in the meantime. After the resulting idle
run, there's no way a properly working driver could end up being
interrupted again for no reason (aside from broken drivers and broken
hardware, i.e. hardware emitting MSIs without getting an
acknowledgement). Am I right?

For shared IRQs, unhandled IRQs may indeed be encountered. For this
reason, I set SPURIOUS_IRQ_TRIGGER to 5.

Of course, even if it misfires, we're back on track in a second.
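
To make the threshold discussion concrete, here is a minimal userspace
sketch of the counting idea. It is not the code from the patch: the
struct, the function name, the '>=' comparison and the way level_max is
tracked are all assumptions made for illustration. Only
SPURIOUS_IRQ_TRIGGER and its value of 5 come from this mail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Threshold named in the mail; everything else is a hypothetical model. */
#define SPURIOUS_IRQ_TRIGGER 5

struct irq_model {
	unsigned int level;       /* consecutive unhandled interrupts so far */
	unsigned int level_max;   /* highest value 'level' has reached */
	unsigned int stuck_count; /* times the line was declared stuck */
	bool polling;             /* set once the model falls back to polling */
};

/*
 * Record one interrupt on the line.
 * Returns true if this call declared the line stuck.
 */
bool irq_model_note(struct irq_model *m, bool handled)
{
	assert(m != NULL);

	if (handled) {
		/* Any handled interrupt ends the run of unhandled ones. */
		m->level = 0;
		return false;
	}

	m->level++;
	if (m->level > m->level_max)
		m->level_max = m->level;

	if (m->level >= SPURIOUS_IRQ_TRIGGER) {
		/* Assumed policy: count the event and switch to polling. */
		m->stuck_count++;
		m->polling = true;
		m->level = 0;
		return true;
	}
	return false;
}
```

With this model, four unhandled interrupts followed by a handled one
never trip the trigger, while five in a row do. Raising the define
temporarily would let level_max show how close normal traffic gets.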

On the other hand, temporarily setting it to a high value would let
us look at /proc/irq/.../spurious and see how high stuck_level_max
gets on a variety of machines. What would then be a sensible number
here?

Also, FYI, here's the result of '$ cat /proc/irq/*/spurious' on the
E45M1-M PRO. IRQ45 is the AHCI handler and IRQ16 belongs to a device
behind the ASM1083: the Firewire chip, which emits an interrupt
roughly every minute. When it misses, you can clearly see how a new
PCIe assert/deassert message pair manages to reset the stuck line. In
this case, the system has switched 81 times in succession to polling
mode (stuck_count= 81 for IRQ16).

irq=  0 stuck_count=  0 stuck_level_max=  0
irq= 10 stuck_count=  0 stuck_level_max=  0
irq= 11 stuck_count=  0 stuck_level_max=  0
irq= 12 stuck_count=  0 stuck_level_max=  0
irq= 13 stuck_count=  0 stuck_level_max=  0
irq= 14 stuck_count=  0 stuck_level_max=  0
irq= 15 stuck_count=  0 stuck_level_max=  0
irq= 16 stuck_count= 81 stuck_level_max=  0
irq= 17 stuck_count=  0 stuck_level_max=  0
irq= 18 stuck_count=  0 stuck_level_max=  0
irq= 19 stuck_count=  0 stuck_level_max=  0
irq=  1 stuck_count=  0 stuck_level_max=  0
irq=  2 stuck_count=  0 stuck_level_max=  0
irq=  3 stuck_count=  0 stuck_level_max=  0
irq= 40 stuck_count=  0 stuck_level_max=  0
irq= 41 stuck_count=  0 stuck_level_max=  0
irq= 42 stuck_count=  0 stuck_level_max=  0
irq= 43 stuck_count=  0 stuck_level_max=  0
irq= 44 stuck_count=  0 stuck_level_max=  0
irq= 45 stuck_count=  0 stuck_level_max=  1
irq= 46 stuck_count=  0 stuck_level_max=  0
irq= 47 stuck_count=  0 stuck_level_max=  0
irq=  4 stuck_count=  0 stuck_level_max=  0
irq=  5 stuck_count=  0 stuck_level_max=  0
irq=  6 stuck_count=  0 stuck_level_max=  0
irq=  7 stuck_count=  0 stuck_level_max=  0
irq=  8 stuck_count=  0 stuck_level_max=  0
irq=  9 stuck_count=  0 stuck_level_max=  0
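
For anyone collecting these numbers from several machines, a small
parser for the lines above could look like the sketch below. The line
format is taken from the output shown here; the struct and function
names are made up for the example and are not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* One line of the /proc/irq/N/spurious output shown above. */
struct spurious_entry {
	int irq;
	int stuck_count;
	int stuck_level_max;
};

/* Parse one line; returns 1 on success, 0 if the line does not match. */
int parse_spurious_line(const char *line, struct spurious_entry *e)
{
	assert(line != NULL && e != NULL);

	/* A space in the format matches any amount of padding. */
	return sscanf(line, "irq= %d stuck_count= %d stuck_level_max= %d",
		      &e->irq, &e->stuck_count, &e->stuck_level_max) == 3;
}

/*
 * Read lines from 'in' and print those with a nonzero counter to 'out'.
 * Returns the number of lines printed.
 */
int report_active(FILE *in, FILE *out)
{
	char buf[128];
	struct spurious_entry e;
	int active = 0;

	assert(in != NULL && out != NULL);

	while (fgets(buf, sizeof(buf), in)) {
		if (!parse_spurious_line(buf, &e))
			continue; /* skip anything that is not a data line */
		if (e.stuck_count || e.stuck_level_max) {
			fprintf(out, "irq=%d stuck_count=%d stuck_level_max=%d\n",
				e.irq, e.stuck_count, e.stuck_level_max);
			active++;
		}
	}
	return active;
}
```

Fed the listing above, report_active() would keep only the IRQ16 and
IRQ45 lines, since those are the only ones with a nonzero counter.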


>> --- linux-3.2.16.orig/include/linux/irqdesc.h 2012-04-23
>> 00:31:32.000000000 +0200
>
> Your mailer wraps lines; see Documentation/email-clients.txt.

Great. I only have Gmail accounts, and the documentation states that
it won't work with Gmail. Any suggestions?


Jeroen.