Message-ID: <CA+55aFxd1WGNBzSHeOGiXXdUD1GqDYv9PUNGdrdiGFwaX7HYJQ@mail.gmail.com>
Date:	Thu, 19 Feb 2015 13:59:46 -0800
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	Rafael David Tinoco <inaddy@...ntu.com>,
	Ingo Molnar <mingo@...nel.org>, Peter Anvin <hpa@...or.com>,
	Jiang Liu <jiang.liu@...ux.intel.com>
Cc:	Peter Zijlstra <peterz@...radead.org>,
	LKML <linux-kernel@...r.kernel.org>,
	Jens Axboe <axboe@...nel.dk>,
	Frederic Weisbecker <fweisbec@...il.com>,
	Gema Gomez <gema.gomez-solano@...onical.com>,
	Christopher Arges <chris.j.arges@...onical.com>,
	"the arch/x86 maintainers" <x86@...nel.org>
Subject: Re: smp_call_function_single lockups

On Thu, Feb 19, 2015 at 12:29 PM, Linus Torvalds
<torvalds@...ux-foundation.org> wrote:
>
> Now, what happens if we send an EOI for an ExtINT interrupt? It
> basically ends up being a spurious IPI. And I *think* that what
> normally happens is absolutely nothing at all. But if in addition to
> the ExtINT, there was a pending IPI (or other pending ISR bit set),
> maybe we lose interrupts..
>
> .. and it's entirely possible that I'm just completely full of shit.
> Who is the poor bastard who has worked most with things like ExtINT,
> and can educate me? I'm adding Ingo, hpa and Jiang Liu as primary
> contacts..

So quite frankly, trying to follow all the logic from do_IRQ() through
handle_irq() to the actual low-level handler, I just couldn't do it.

So instead, I wrote a patch to verify that the ISR bit is actually set
when we do ack_APIC_irq().

This was complicated by the fact that we don't actually pass in the
vector number at all to the acking, so 99% of the patch is just doing
that. A couple of places we don't really have a good vector number, so
I said "screw it, a negative value means that we won't check the ISR".
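
For illustration only, here is a minimal user-space sketch of the kind of
check described above. It is not the attached patch. The names fake_apic,
fake_set_in_service, isr_bit_set, checked_ack and spurious_acks are all
hypothetical, and the local APIC is modelled as a plain array. The only
hardware facts assumed are that the ISR is a bank of eight 32-bit registers
starting at offset 0x100 and spaced 0x10 apart, and that an EOI clears the
highest-priority in-service bit.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define APIC_ISR 0x100	/* base offset of the In-Service Register bank */

/* Hypothetical stand-in for the memory-mapped local APIC register page. */
static uint32_t fake_apic[0x400 / 4];

/* Counts EOIs issued for a vector whose ISR bit was not set. */
static unsigned int spurious_acks;

/* Vector N lives in register (N / 32), bit (N % 32); registers are 0x10 apart. */
static uint32_t *isr_word(int vector)
{
	assert(vector >= 0 && vector < 256);
	return &fake_apic[(APIC_ISR + (((unsigned int)vector >> 5) << 4)) / 4];
}

static int isr_bit_set(int vector)
{
	return (*isr_word(vector) & ((uint32_t)1 << (vector & 31))) != 0;
}

/* Simulate the APIC accepting an interrupt: mark the vector in service. */
static void fake_set_in_service(int vector)
{
	*isr_word(vector) |= (uint32_t)1 << (vector & 31);
}

/*
 * Ack with verification. A negative vector means "no usable vector
 * number here, skip the ISR check", as described in the text.
 */
static void checked_ack(int vector)
{
	int v;

	if (vector >= 0 && !isr_bit_set(vector)) {
		spurious_acks++;
		fprintf(stderr, "warning: EOI for vector %d, ISR bit clear\n",
			vector);
	}

	/* Model the EOI itself: clear the highest in-service vector, if any. */
	for (v = 255; v >= 0; v--) {
		if (isr_bit_set(v)) {
			*isr_word(v) &= ~((uint32_t)1 << (v & 31));
			break;
		}
	}
}
```

In this model, marking a vector in service and acking it once is silent,
acking the same vector a second time trips the warning, and
checked_ack(-1) never warns.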

The attached patch is quite possibly garbage, but it gives an
interesting warning for me during i8042 probing, so who knows. Maybe
it actually shows a real problem - or maybe I just screwed up the
patch.

.. and maybe even if the patch is fine, it's actually never really a
problem to have spurious APIC ACK cycles. Maybe it cannot make
interrupts be ignored.

Anyway, the back-trace for the warning I get is during boot:

    ...
    PNP: No PS/2 controller found. Probing ports directly.
    ------------[ cut here ]------------
    WARNING: CPU: 0 PID: 1 at ./arch/x86/include/asm/apic.h:436 ir_ack_apic_edge+0x74/0x80()
    Modules linked in:
    CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.19.0-08857-g89d3fa45b4ad-dirty #2
    Call Trace:
     <IRQ>
       dump_stack+0x45/0x57
       warn_slowpath_common+0x80/0xc0
       warn_slowpath_null+0x15/0x20
       ir_ack_apic_edge+0x74/0x80
       handle_edge_irq+0x51/0x110
       handle_irq+0x74/0x140
       do_IRQ+0x4a/0x140
       common_interrupt+0x6a/0x6a
     <EOI>
       ? _raw_spin_unlock_irqrestore+0x9/0x10
       __setup_irq+0x239/0x5a0
       request_threaded_irq+0xc2/0x180
       i8042_probe+0x5b8/0x680
       platform_drv_probe+0x2f/0xa0
       driver_probe_device+0x8b/0x3e0
       __driver_attach+0x93/0xa0
       bus_for_each_dev+0x63/0xa0
       driver_attach+0x19/0x20
       bus_add_driver+0x178/0x250
       driver_register+0x5f/0xf0
       __platform_driver_register+0x45/0x50
       __platform_driver_probe+0x26/0xa0
       __platform_create_bundle+0xad/0xe0
       i8042_init+0x3d0/0x3f6
       do_one_initcall+0xb8/0x1d0
       kernel_init_freeable+0x16d/0x1fa
       kernel_init+0x9/0xf0
       ret_from_fork+0x7c/0xb0
    ---[ end trace 1de82c4457c6a0f0 ]---
    serio: i8042 KBD port at 0x60,0x64 irq 1
    serio: i8042 AUX port at 0x60,0x64 irq 12
    ...

and it looks not entirely insane.

Is this worth looking at? Or is it something spurious? I might have
gotten the vectors wrong, and maybe the warning is not because the ISR
bit isn't set, but because I test the wrong bit.

                         Linus

View attachment "patch.diff" of type "text/plain" (13041 bytes)