linux-kernel - Re: [WARNING: A/V UNSCANNABLE][Merge tag 'media/v4.11-1' of git] ff58d005cd: BUG: unable to handle kernel NULL pointer dereference at 0000039c

Message-ID: <CA+55aFy+ER8cYV02eZsKAOLnZBWY96zNWqUFWSWT1+3sZD4XnQ@mail.gmail.com>
Date:   Sat, 25 Feb 2017 10:02:44 -0800
From:   Linus Torvalds <torvalds@...ux-foundation.org>
To:     Ingo Molnar <mingo@...nel.org>
Cc:     kernel test robot <fengguang.wu@...el.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Mauro Carvalho Chehab <mchehab@...radead.org>,
        Sean Young <sean@...s.org>,
        Ruslan Ruslichenko <rruslich@...co.com>, LKP <lkp@...org>,
        "linux-input@...r.kernel.org" <linux-input@...r.kernel.org>,
        "linux-omap@...r.kernel.org" <linux-omap@...r.kernel.org>,
        kernel@...inux.com,
        Linux Media Mailing List <linux-media@...r.kernel.org>,
        linux-mediatek@...ts.infradead.org,
        linux-amlogic@...ts.infradead.org,
        "linux-arm-kernel@...ts.infradead.org" 
        <linux-arm-kernel@...ts.infradead.org>,
        "devicetree@...r.kernel.org" <devicetree@...r.kernel.org>,
        Linux LED Subsystem <linux-leds@...r.kernel.org>,
        LKML <linux-kernel@...r.kernel.org>, wfg@...ux.intel.com
Subject: Re: [WARNING: A/V UNSCANNABLE][Merge tag 'media/v4.11-1' of git]
 ff58d005cd: BUG: unable to handle kernel NULL pointer dereference at 0000039c

On Sat, Feb 25, 2017 at 1:07 AM, Ingo Molnar <mingo@...nel.org> wrote:
>
> So, should we revert the hw-retrigger change:
>
>   a9b4f08770b4 x86/ioapic: Restore IO-APIC irq_chip retrigger callback
>
> ... until we managed to fix CONFIG_DEBUG_SHIRQ=y? If you'd like to revert it
> upstream straight away:
>
> Acked-by: Ingo Molnar <mingo@...nel.org>

So I'm in no huge hurry to revert that commit as long as we're still
in the merge window or early -rc's.

From a debug standpoint, the spurious early interrupts are fine, and
hopefully will help us find more broken drivers.

It's just that I'd like to revert it before the actual 4.11 release,
unless we can find a better solution.

Because it really seems like the interrupt re-trigger is entirely
bogus. It's not an _actual_ "re-trigger the interrupt that may have
gotten lost", it's some code that ends up triggering it for no good
reason.

So I'd actually hope that we could figure out why IRQS_PENDING got
set, and perhaps fix the underlying cause?

There are several things that set IRQS_PENDING, ranging from "try to
test mis-routed interrupts while irqd was working", to "prepare for
suspend losing the irq for us", to "irq auto-probing uses it on
unassigned probable irqs".

The *actual* reason to re-send, namely getting a nested irq that we
had to drop because we got a second one while still handling the first
(or because it was disabled), is just one case.
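
To make the distinction above concrete, here is a minimal user-space
sketch of the "pending, then re-send on enable" idea. It is a toy model
under stated assumptions, not the kernel's code: struct irq_model and
the model_*() functions are hypothetical names invented for this
illustration, and the single "pending" flag only stands in for
IRQS_PENDING. It contrasts the legitimate case (an interrupt dropped
while the line was disabled or busy) with a case where the flag was set
by some other path and the re-send fires the handler for no good reason.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model; none of these names are kernel APIs. */
struct irq_model {
	bool disabled;     /* line is disabled by software */
	bool in_progress;  /* handler is currently running */
	bool pending;      /* stands in for the IRQS_PENDING idea */
	int  hw_events;    /* real hardware events seen */
	int  handled;      /* times the handler ran */
	int  resent;       /* times a re-send was issued */
};

/* A real hardware interrupt arrives. */
void model_raise(struct irq_model *irq)
{
	assert(irq != NULL);
	irq->hw_events++;
	if (irq->disabled || irq->in_progress) {
		/* Legitimate case: remember the interrupt we had to drop. */
		irq->pending = true;
		return;
	}
	irq->handled++;
}

/*
 * Some other path (in this model: anything that is not a dropped
 * interrupt, e.g. left-over state) marks the line pending without
 * any hardware event having occurred.
 */
void model_mark_pending_without_event(struct irq_model *irq)
{
	assert(irq != NULL);
	irq->pending = true;
}

/* Enabling the line re-sends whatever is marked pending. */
void model_enable(struct irq_model *irq)
{
	assert(irq != NULL);
	irq->disabled = false;
	if (irq->pending) {
		irq->pending = false;
		irq->resent++;
		irq->handled++;  /* the re-sent interrupt runs the handler */
	}
}
```

In this model, a line that saw model_raise() while disabled ends up with
handled == hw_events after model_enable(), which is the intended
behaviour. A line that only saw model_mark_pending_without_event() ends
up with handled > hw_events: the handler ran even though the device
never raised anything, which is the kind of spurious early interrupt
discussed here.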

Personally, I'd suspect some left-over state from auto-probing earlier
in the boot, but I don't know. Could we fix that underlying issue?

                 Linus