Message-ID: <92dc24c9-0963-e894-66fe-ad74bdbc0ac3@canonical.com>
Date:   Mon, 22 Oct 2018 16:44:04 -0300
From:   "Guilherme G. Piccoli" <gpiccoli@...onical.com>
To:     Sinan Kaya <okaya@...nel.org>, linux-pci@...r.kernel.org,
        kexec@...ts.infradead.org, x86@...nel.org
Cc:     linux-kernel@...r.kernel.org, bhelgaas@...gle.com,
        dyoung@...hat.com, bhe@...hat.com, vgoyal@...hat.com,
        tglx@...utronix.de, mingo@...hat.com, bp@...en8.de, hpa@...or.com,
        andi@...stfloor.org, lukas@...ner.de, billy.olsen@...onical.com,
        cascardo@...onical.com, ddstreet@...onical.com,
        fabiomirmar@...onical.com, gavin.guo@...onical.com,
        jay.vosburgh@...onical.com, kernel@...ccoli.net, mfo@...onical.com,
        shan.gavin@...ux.alibaba.com
Subject: Re: [PATCH 3/3] x86/quirks: Add parameter to clear MSIs early on boot

On 18/10/2018 17:30, Sinan Kaya wrote:
> 
> AFAIK, all shutdown (not remove) routines are called before launching
> the next kernel even in crash scenario. It is not safe to start the new
> kernel while hardware is doing a DMA to the system memory and triggering
> interrupts.

Hi Sinan,

I agree with you, it's definitely not safe to start a new kernel with
in-flight DMA transactions. In the crash scenario, though, I think the
rationale was that the running kernel is broken, so trying to gracefully
shut down the devices is even less reliable than hoping for the best and
starting the kdump kernel right away heheh

The fact is that the shutdown handlers are not called in the crash
scenario. They come from device_shutdown(); the code paths are as follows:

Regular kexec flow:

syscall_reboot()
  kernel_kexec()
    kernel_restart_prepare()
      device_shutdown()
    machine_kexec()

Note that if CONFIG_KEXEC_JUMP is set, device_shutdown() is not called
in this flow either.


Crash kexec flow:

__crash_kexec()
  machine_kexec()

There are some entry points to __crash_kexec(), like panic() or die() on
x86, for example.
To validate this, one can boot a kernel with the "initcall_debug"
parameter and perform a kexec: if the shutdown handlers are called,
there's a dev_info() call that prints a message per device.
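
To make the difference between the two paths concrete, here is a toy
user-space model in C. It is only a sketch, not kernel code: the function
bodies just count calls, the kexec_jump flag is a hypothetical stand-in for
the CONFIG_KEXEC_JUMP case mentioned above, and crash_kexec_model() is a
made-up name standing in for __crash_kexec().

```c
#include <stdbool.h>
#include <stdio.h>

/* Toy model: counters replace the real work done by the kernel. */
static int shutdown_calls; /* times device_shutdown() ran */
static int kexec_calls;    /* times machine_kexec() ran   */

static void device_shutdown(void)
{
    shutdown_calls++;
    puts("device_shutdown(): driver shutdown handlers would run here");
}

static void machine_kexec(void)
{
    kexec_calls++;
    puts("machine_kexec(): jump to the new kernel");
}

static void kernel_restart_prepare(void)
{
    device_shutdown();
}

/* Regular kexec flow. kexec_jump (hypothetical flag) models the
 * CONFIG_KEXEC_JUMP case, where device_shutdown() is skipped. */
static void kernel_kexec(bool kexec_jump)
{
    if (!kexec_jump)
        kernel_restart_prepare();
    machine_kexec();
}

/* Crash flow (stand-in for __crash_kexec()): goes straight to
 * machine_kexec(), so no shutdown handler is ever invoked. */
static void crash_kexec_model(void)
{
    machine_kexec();
}
```

Calling kernel_kexec(false) bumps both counters, while kernel_kexec(true)
and crash_kexec_model() only bump kexec_calls, which mirrors the absence of
the per-device shutdown messages in the crash case.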


> Shutdown routine in PCI core used to disable MSI/MSI-x on behalf of all
> endpoints but it was later decided that this is the responsibility of the
> endpoint driver.
> 

This may be a good idea: using the PCI layer to disable MSIs in the
quiesce path of the broken kernel. I'll follow up on this discussion in
Bjorn's reply.

Thanks,


Guilherme