linux-kernel - Re: [PATCH v1 0/3] introduce priority-based shutdown support

Message-ID: <20231125085038.GA877872@pengutronix.de>
Date:   Sat, 25 Nov 2023 09:50:38 +0100
From:   Oleksij Rempel <o.rempel@...gutronix.de>
To:     Greg Kroah-Hartman <gregkh@...uxfoundation.org>
Cc:     Mark Brown <broonie@...nel.org>,
        "Rafael J. Wysocki" <rafael@...nel.org>,
        Ulf Hansson <ulf.hansson@...aro.org>, kernel@...gutronix.de,
        linux-kernel@...r.kernel.org, linux-mmc@...r.kernel.org,
        linux-pm@...r.kernel.org,
        Søren Andersen <san@...v.dk>
Subject: Re: [PATCH v1 0/3] introduce priority-based shutdown support

On Sat, Nov 25, 2023 at 06:51:55AM +0000, Greg Kroah-Hartman wrote:
> On Fri, Nov 24, 2023 at 07:57:25PM +0100, Oleksij Rempel wrote:
> > On Fri, Nov 24, 2023 at 05:26:30PM +0000, Greg Kroah-Hartman wrote:
> > > On Fri, Nov 24, 2023 at 05:32:34PM +0100, Oleksij Rempel wrote:
> > > > On Fri, Nov 24, 2023 at 03:56:19PM +0000, Greg Kroah-Hartman wrote:
> > > > > On Fri, Nov 24, 2023 at 03:49:46PM +0000, Mark Brown wrote:
> > > > > > On Fri, Nov 24, 2023 at 03:27:48PM +0000, Greg Kroah-Hartman wrote:
> > > > > > > On Fri, Nov 24, 2023 at 03:21:40PM +0000, Mark Brown wrote:
> > > > > > 
> > > > > > > > This came out of some discussions about trying to handle emergency power
> > > > > > > > failure notifications.
> > > > > > 
> > > > > > > I'm sorry, but I don't know what that means.  Are you saying that the
> > > > > > > kernel is now going to try to provide a hard guarantee that some devices
> > > > > > > are going to be shut down in X number of seconds when asked?  If so, why
> > > > > > > not do this in userspace?
> > > > > > 
> > > > > > No, it was initially (or when I initially saw it anyway) handling of
> > > > > > notifications from regulators that they're in trouble and we have some
> > > > > > small amount of time to do anything we might want to do about it before
> > > > > > we expire.
> > > > > 
> > > > > So we are going to guarantee a "time" in which we are going to do
> > > > > something?  Again, if that's required, why not do it in userspace using
> > > > > a RT kernel?
> > > > 
> > > > > For the HW in question I have only 100 ms before power loss. By
> > > > > doing it via user space we will have even less time to react.
> > > 
> > > Why can't userspace react that fast?  Why will the kernel be somehow
> > > faster?  Speed should be the same, just get the "power is cut" signal
> > > and have userspace flush and unmount the disk before power is gone.  Why
> > > can the kernel do this any differently?
> > > 
> > > > > In fact, this is not a new requirement. It has existed in different
> > > > > flavors of automotive Linux for about 10 years. Linux in cars should be
> > > > > able to handle voltage drops, for example on ignition. The only new
> > > > > thing is the attempt to mainline it.
> > > 
> > > But your patch is not guaranteeing anything, it's just doing a "I want
> > > this done before the other devices are handled", that's it.  There is no
> > > chance that 100ms is going to be a requirement, or that some other
> > > device type is not going to come along and demand to be ahead of your
> > > device in the list.
> > > 
> > > So you are going to have a constant fight among device types over the
> > > years, and people complaining that the kernel is now somehow going to
> > > guarantee that a device is shutdown in a set amount of time, which
> > > again, the kernel can not guarantee here.
> > > 
> > > This might work as a one-off for a specific hardware platform, which is
> > > odd, but not anything you really should be adding for anyone else to use
> > > here as your reasoning for it does not reflect what the code does.
> > 
> > I see. Good point.
> > 
> > In my case umount is not needed, there is not enough time to write down
> > the data. We should send a shutdown command to the eMMC ASAP.
> 
> If you don't care about the data, why is a shutdown command to the
> hardware needed?  What does that do that makes anything "safe" if your
> data is lost.

It prevents HW damage. In a typical automotive under-voltage lab it is
usually possible to reproduce X bricked eMMCs or NANDs in Y
under-voltage cycles (I do not have exact numbers right now).
Even if the numbers are not so high in the lab tests (sometimes something
like one bricked device in a month of tests), the field returns are
significant enough to care about a software solution for this problem.

The same problem was seen not only in automotive devices, but also in
industrial and agricultural ones. In other words, it is important enough
to bring some kind of solution mainline.

-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |