lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87k0um8m53.fsf@nanos.tec.linutronix.de>
Date:   Sun, 15 Nov 2020 20:18:00 +0100
From:   Thomas Gleixner <tglx@...utronix.de>
To:     Lukas Wunner <lukas@...ner.de>
Cc:     "Eric W. Biederman" <ebiederm@...ssion.com>,
        Bjorn Helgaas <helgaas@...nel.org>, linux-pci@...r.kernel.org,
        kernelfans@...il.com, andi@...stfloor.org, hpa@...or.com,
        bhe@...hat.com, x86@...nel.org, okaya@...nel.org, mingo@...hat.com,
        jay.vosburgh@...onical.com, dyoung@...hat.com,
        gavin.guo@...onical.com,
        "Guilherme G. Piccoli" <gpiccoli@...onical.com>, bp@...en8.de,
        bhelgaas@...gle.com, shan.gavin@...ux.alibaba.com,
        "Rafael J. Wysocki" <rjw@...ysocki.net>, kernel@...ccoli.net,
        kexec@...ts.infradead.org, linux-kernel@...r.kernel.org,
        ddstreet@...onical.com, vgoyal@...hat.com
Subject: Re: [PATCH 1/3] x86/quirks: Scan all busses for early PCI quirks

On Sun, Nov 15 2020 at 18:01, Lukas Wunner wrote:
> On Sun, Nov 15, 2020 at 04:11:43PM +0100, Thomas Gleixner wrote:
>> Unfortunately there is no way to tell the APIC "Mask vector X" and the
>> dump kernel does neither know which device it comes from nor does it
>> have enumerated PCI completely which would reset the device and shutup
>> the spew. Due to the interrupt storm it does not get that far.
>
> Can't we just set DisINTx, clear MSI Enable and clear MSI-X Enable
> on all active PCI devices in the crashing kernel before starting the
> crash kernel?  So that the crash kernel starts with a clean slate?
>
> Guilherme's original patches from 2018 iterate over all 256 PCI buses.
> That might impact boot time negatively.  The reason he has to do that
> is because the crashing kernel doesn't know which devices exist and
> which have interrupts enabled.  However the crashing kernel has that
> information.  It should either disable interrupts itself or pass the
> necessary information to the crashing kernel as setup_data or whatever.

As I explained before: The problem with doing anything between crashing
and starting the crash kernel is reducing the chance of actually
starting it. The kernel crashed for whatever reason, so it's in a
completely undefined state.

Ergo there is no 'just do something'. You really have to think hard
about what can be done safely with the least probability of running into
another problem.

Thanks,

        tglx


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ