linux-kernel - Re: [PATCH 1/3] x86/quirks: Scan all busses for early PCI quirks

Message-ID: <87h7prac67.fsf@nanos.tec.linutronix.de>
Date:   Sat, 14 Nov 2020 21:58:08 +0100
From:   Thomas Gleixner <tglx@...utronix.de>
To:     Bjorn Helgaas <helgaas@...nel.org>
Cc:     "Guilherme G. Piccoli" <gpiccoli@...onical.com>,
        linux-pci@...r.kernel.org, kexec@...ts.infradead.org,
        x86@...nel.org, linux-kernel@...r.kernel.org, bhelgaas@...gle.com,
        dyoung@...hat.com, bhe@...hat.com, vgoyal@...hat.com,
        mingo@...hat.com, bp@...en8.de, hpa@...or.com, andi@...stfloor.org,
        lukas@...ner.de, okaya@...nel.org, kernelfans@...il.com,
        ddstreet@...onical.com, gavin.guo@...onical.com,
        jay.vosburgh@...onical.com, kernel@...ccoli.net,
        shan.gavin@...ux.alibaba.com,
        Eric Biederman <ebiederm@...ssion.com>
Subject: Re: [PATCH 1/3] x86/quirks: Scan all busses for early PCI quirks

Bjorn,

On Sat, Nov 14 2020 at 14:39, Bjorn Helgaas wrote:
> On Sat, Nov 14, 2020 at 12:40:10AM +0100, Thomas Gleixner wrote:
>> On Sat, Nov 14 2020 at 00:31, Thomas Gleixner wrote:
>> > On Fri, Nov 13 2020 at 10:46, Bjorn Helgaas wrote:
>> >> pci_device_shutdown() still clears the Bus Master Enable bit if we're
>> >> doing a kexec and the device is in D0-D3hot, which should also disable
>> >> MSI/MSI-X.  Why doesn't this solve the problem?  Is this because the
>> >> device causing the storm was in PCI_UNKNOWN state?
>> >
>> > That's indeed a really good question.
>> 
>> So we do that on kexec, but is that true when starting a kdump kernel
>> from a kernel crash? I doubt it.
>
> Ah, right, I bet that's it, thanks.  The kdump path is basically this:
>
>   crash_kexec
>     machine_kexec
>
> while the usual kexec path is:
>
>   kernel_kexec
>     kernel_restart_prepare
>       device_shutdown
>         while (!list_empty(&devices_kset->list))
>           dev->bus->shutdown
>             pci_device_shutdown            # pci_bus_type.shutdown
>     machine_kexec
>
> So maybe we need to explore doing some or all of device_shutdown() in
> the crash_kexec() path as well as in the kernel_kexec() path.
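
[Editor's illustration, not part of the original message.] The difference
between the two call chains quoted above can be sketched as a small
user-space model. This is only a toy under stated assumptions: every
toy_* name is hypothetical, only the PCI Command register is modelled,
and the kexec_in_progress and D0-D3hot conditions mentioned earlier in
the thread are ignored. The only real kernel detail used is the value of
PCI_COMMAND_MASTER (the Bus Master Enable bit).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bus Master Enable bit of the PCI Command register (value as in pci_regs.h). */
#define PCI_COMMAND_MASTER 0x4

/* Hypothetical stand-in for a PCI device: only the Command register is modelled. */
struct toy_pci_dev {
	uint16_t command;
};

/* Models the effect discussed in the thread: shutdown clears Bus Master Enable. */
static void toy_pci_device_shutdown(struct toy_pci_dev *dev)
{
	dev->command &= (uint16_t)~PCI_COMMAND_MASTER;
}

/* Models device_shutdown(): walk every device and call its shutdown hook. */
static void toy_device_shutdown(struct toy_pci_dev *devs, size_t n)
{
	assert(devs != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		toy_pci_device_shutdown(&devs[i]);
}

/* Regular kexec path: devices are shut down before the jump to the new kernel. */
static void toy_kernel_kexec(struct toy_pci_dev *devs, size_t n)
{
	toy_device_shutdown(devs, n);
	/* machine_kexec() would transfer control to the new kernel here. */
}

/* kdump path: control goes straight to the crash kernel, devices are untouched. */
static void toy_crash_kexec(struct toy_pci_dev *devs, size_t n)
{
	(void)devs;
	(void)n;
	/* machine_kexec() would transfer control to the crash kernel here. */
}

/* Count devices that are still allowed to master the bus (DMA, MSI/MSI-X). */
static size_t toy_count_bus_masters(const struct toy_pci_dev *devs, size_t n)
{
	size_t count = 0;

	for (size_t i = 0; i < n; i++)
		if (devs[i].command & PCI_COMMAND_MASTER)
			count++;
	return count;
}
```

In this model, calling toy_crash_kexec() on a set of devices leaves every
Bus Master Enable bit set, while toy_kernel_kexec() clears them all. That
matches the suspicion in the thread that a device can keep raising
interrupts into a kdump kernel.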

The problem is that if the machine has crashed, anything you attempt
before starting the crash kernel reduces the chance that the crash
kernel actually starts.

Is there something at the root bridge level which allows us to tell the
underlying busses to shut up, reset, or go into a defined state? That
might avoid walking lists which might already be unreliable.

Thanks,

        tglx