Message-ID: <20160724051810.GA7663@roeck-us.net>
Date:	Sat, 23 Jul 2016 22:18:10 -0700
From:	Guenter Roeck <linux@...ck-us.net>
To:	Benjamin Herrenschmidt <benh@...nel.crashing.org>
Cc:	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"Eric W. Biederman" <ebiederm@...ssion.com>,
	Joel Stanley <joel.stanley@....ibm.com>,
	Jeremy Kerr <jeremy.kerr@....ibm.com>,
	Greg KH <gregkh@...uxfoundation.org>
Subject: Re: kexec: device shutdown vs. remove

On Sun, Jul 24, 2016 at 06:51:52AM +1000, Benjamin Herrenschmidt wrote:
> Hi !
> 
> This is somewhat of a recurring issue; some of my previous attempts on
> lkml, I suspect, were just drowned in the noise. Eric, we had a quick
> discussion about this a while back but I don't think we reached a
> conclusion.
> 
> A bit of context: On OpenPOWER machines, we have a Linux based
> bootloader, so we rely heavily on kexec to boot distro kernels, and
> this has been causing us grief, mostly in the device driver space.
> 
> Device drivers need to be quiesced before kexec. More specifically
> the device *hardware* needs that, i.e. we want DMAs to stop and the
> device to be put into a state where it can reliably be picked up by the
> driver in the new kernel.
> 
> Today, kexec calls device_shutdown() to achieve that. I argue that this
> is the wrong thing to do and instead we should do something that causes
> the various drivers' ->remove() functions to be called (whether that
> implies actually unbinding the driver or not).
> 
> I believe we do this for historical reasons, as ->remove() used to
> depend on CONFIG_HOTPLUG while ->shutdown() was always around but that
> is no longer the case.
> 
> The most visible issue with ->shutdown() that we encounter is that a lot
> of drivers simply don't implement it.
> 
> The *real* issue however is that it's the wrong thing to do anyway. It
> is a call intended to be made when the machine will be shut down; as
> such, not only is it very much optional (and rarely implemented), but it
> can also (and will in some cases) power bits of hardware off, which is
> not what you want to do if a new driver will try to pick up the pieces.
> 
> Arguably, the most correct semantic is provided by ->remove() since
> that corresponds to removing a driver and binding a new one to the
> device, i.e. the same flow as doing rmmod/insmod of a new driver.
> 
> In practice, we observe that a lot more drivers implement ->remove(). A
> few were "fixed" to have ->shutdown() for kexec's sake over time, but in
> many cases it's a duplication of ->remove() (ugh...).
> 
> So I would like to discuss this or at least get feedback and an overall
> agreement. I can provide patches to test fairly soon.
> 

I suspect that using (or depending on) the remove function may not be feasible
anymore after the recent effort by Paul Gortmaker to make drivers explicitly
non-modular if they are only configurable as boolean. In many cases, this
involved dropping remove functions.
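
To make the concern concrete, here is a minimal userspace sketch, not kernel
code. Every name in it (struct model_driver, model_kexec_quiesce(), stop_dma())
is hypothetical and exists only for illustration. It assumes a kexec path that
prefers a driver's remove callback and falls back to shutdown, and shows that a
driver which has had its remove function dropped and never had a shutdown
function leaves the hardware untouched.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a driver; this is NOT the kernel's struct device_driver. */
struct model_driver {
    const char *name;
    void (*remove)(int *hw_active);   /* NULL models a driver whose remove was dropped */
    void (*shutdown)(int *hw_active); /* NULL models a driver that never implemented it */
};

enum quiesce_result {
    QUIESCED_BY_REMOVE,
    QUIESCED_BY_SHUTDOWN,
    NOT_QUIESCED
};

/* Stand-in for "stop DMA and leave the device in a state a new driver can pick up". */
static void stop_dma(int *hw_active)
{
    *hw_active = 0;
}

/*
 * Assumed policy for this sketch: call remove if present, otherwise
 * shutdown, otherwise report that nothing could be done.
 */
enum quiesce_result model_kexec_quiesce(const struct model_driver *drv,
                                        int *hw_active)
{
    assert(drv != NULL && hw_active != NULL);

    if (drv->remove) {
        drv->remove(hw_active);
        return QUIESCED_BY_REMOVE;
    }
    if (drv->shutdown) {
        drv->shutdown(hw_active);
        return QUIESCED_BY_SHUTDOWN;
    }
    return NOT_QUIESCED;
}
```

With a driver that still has a remove callback, the model returns
QUIESCED_BY_REMOVE and clears hw_active. With both callbacks NULL, it returns
NOT_QUIESCED and the hardware stays active, which is the case I am worried
about.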

Guenter
